
This would be good.

I got really, really sick of XML, but one thing XML parsers have always been good at is real-time decoding of XML streams.

It is infuriating to wait for a big-ass JSON file to download completely before proceeding.

Also JSON parsers can be memory hogs (but not all of them).



JSON is just a packing format, and it does have that limitation. If you control both the source and the destination, could you use a format that supports streaming better, like Protobuf?


I had invented a variant of DER called DSER (Distinguished Streaming Encoding Rules), which is not compatible with DER (nor with BER) but is intended for when streaming is needed.

The type and value are encoded the same as DER, but the length is different:

- If it is constructed, the length is omitted, and a single byte with value 0x00 terminates the construction.

- If it is primitive, the value is split into segments of lengths not exceeding 255, and each segment is preceded by a single byte 1 to 255 indicating the length of that segment (in bytes); it is then terminated by a single byte with value 0x00. When it is in canonical form, the length of segments other than the last segment must be 255.
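To make the primitive rule concrete, here is a minimal Python sketch of just the segmented length framing described above. The function names encode_primitive_body and decode_primitive_body are made up for illustration and are not part of any DSER implementation; the type octets (same as DER) and constructed values are left out.

```python
import io
from typing import BinaryIO, Iterator

SEGMENT_MAX = 255  # a segment length must fit in a single nonzero byte


def encode_primitive_body(value: bytes) -> bytes:
    """Frame a primitive value as length-prefixed segments ending in 0x00.

    Every segment except the last is exactly 255 bytes long, which is the
    canonical form described above. Hypothetical helper, not a real API.
    """
    out = bytearray()
    for start in range(0, len(value), SEGMENT_MAX):
        segment = value[start:start + SEGMENT_MAX]
        out.append(len(segment))  # one length byte, 1..255
        out += segment
    out.append(0x00)  # terminator
    return bytes(out)


def decode_primitive_body(stream: BinaryIO) -> Iterator[bytes]:
    """Yield segments one at a time, so the reader never needs the total length."""
    while True:
        length = stream.read(1)
        if not length:
            raise EOFError("stream ended before the 0x00 terminator")
        if length[0] == 0x00:
            return
        segment = stream.read(length[0])
        if len(segment) != length[0]:
            raise EOFError("stream ended in the middle of a segment")
        yield segment


# Usage: a 300-byte value becomes one 255-byte segment plus one 45-byte segment.
encoded = encode_primitive_body(bytes(300))
decoded = b"".join(decode_primitive_body(io.BytesIO(encoded)))
```

The point of the framing is that the writer can start sending before it knows how long the value is, and the reader can hand off each segment as it arrives.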

Protobuf seems not to do this unless you use the deprecated "Groups" feature, and that is only an alternative to submessages, not to strings. In my opinion, Protobuf also seems to have many other limits and problems that DER (and DSER) seem to handle better anyway.


I've heard two sides of the Protobuf/streaming idea. On my first introduction, it seemed you could. But later reading leads me to believe it is only almost streamable: https://belkadan.com/blog/2023/12/Protobuf-Is-Almost-Streama....

* I do acknowledge you qualified the question with "better".


What stops you from parsing tokens from a stream like a SAX parser for JSON?

[ ["aaa", "bbb"], { "name": "foo" } ]

    Start array
    Start array
    String aaa
    String bbb
    End array
    Start object
    Key name
    String foo
    End object
    End array
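As a rough illustration of that event stream, here is a small Python sketch that emits those events while the text is still arriving in chunks. The name json_events is made up for this example, it does no validation of malformed input, and it reports numbers, true, false, and null under a generic "Value" event, which is my own addition to the list above.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple

Event = Tuple[str, Optional[object]]


def json_events(chunks: Iterable[str]) -> Iterator[Event]:
    """Yield SAX-style events from JSON text delivered in arbitrary chunks.

    Sketch only: assumes well-formed input and does not validate structure.
    """
    stack = []         # open containers, innermost last: "array" or "object"
    key_next = False   # True when the next string is an object key
    buf = []           # characters of the token currently being read
    in_string = escaped = False

    def finish(text: str) -> Event:
        # Convert one complete token (string or bare literal) into an event.
        nonlocal key_next
        value = json.loads(text)
        if key_next and isinstance(value, str):
            key_next = False
            return ("Key", value)
        return ("String" if isinstance(value, str) else "Value", value)

    for chunk in chunks:
        for ch in chunk:
            if in_string:
                buf.append(ch)
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
                    yield finish("".join(buf))
                    buf.clear()
                continue
            if buf and (ch in ",]}:" or ch.isspace()):
                # A bare literal (number, true, false, null) just ended.
                yield finish("".join(buf))
                buf.clear()
            if ch == '"':
                in_string = True
                buf.append(ch)
            elif ch == "[":
                stack.append("array")
                key_next = False
                yield ("Start array", None)
            elif ch == "{":
                stack.append("object")
                key_next = True
                yield ("Start object", None)
            elif ch in "]}":
                key_next = False
                yield ("End " + stack.pop(), None)
            elif ch == ",":
                key_next = bool(stack) and stack[-1] == "object"
            elif ch == ":" or ch.isspace():
                pass
            else:
                buf.append(ch)
    if buf:  # a top-level bare literal with nothing after it
        yield finish("".join(buf))


# Usage: the chunk boundaries deliberately fall in the middle of strings.
pieces = ['[ ["aa', 'a", "bbb"], { "na', 'me": "foo" } ]']
events = list(json_events(pieces))
```

Because it is a generator, a caller can act on "String aaa" before the rest of the document has been downloaded, which is the whole point of the SAX style.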


Nothing, really, but I don’t have the bandwidth to write JSAX. I wonder why it hasn’t already been done by someone more qualified than I am. I suspect that I’d find out, if I started doing it.

You can do that, in a specialized manner, with PHP, and Streaming JSON Parser[0]. I use that, in one of my server projects[1]. It claims to be JSON SAX, but I haven’t really done an objective comparison, and it specializes for file types. It works for my purposes.

[0] https://github.com/salsify/jsonstreamingparser

[1] https://github.com/LittleGreenViper/LGV_TZ_Lookup/blob/main/...


Streaming JSON parsers certainly exist. I'm just pointing out there's nothing about JSON that makes it inherently harder to stream than an XML tree.

In response to "Json is just a packing format that does have that [streaming] limitation".


Yes, but that also interferes with portability.

I’ve written a lot of APIs. I generally start with CSV, convert that to XML, then convert that to JSON.

CSV is extremely limited, and there’s a lot of stuff that can only be expressed in XML or JSON, but starting with CSV usually enforces a “stream-friendly” structure.



