I would want to experiment with the root structure of XML.
Make it possible to have multiple root elements. One problem is that however big an XML document is, it has to be closed at the end, so it ends up as one atomic element; you can't partially parse an XML document correctly.
Well... There is nothing stopping you from partially parsing an XML document. What you can't do is validate it, which is the same for any other file format: you can't be sure the file is fully valid without parsing the whole file.
However, the only order in which you can partially parse an XML file is linear order, which clashes badly with the fact that it's a tree-based format. Depending on how tree-like your schema is, this might be a massive hindrance.
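To make the partial-parsing point concrete, here is a minimal sketch using `XMLPullParser` from Python's standard library. The `parse_prefix` helper and the `<log>`/`<entry>` element names are made up for illustration; the input is deliberately cut off in the middle of a closing tag.

```python
from xml.etree.ElementTree import XMLPullParser

def parse_prefix(chunks):
    """Feed chunks of a possibly unfinished document.

    Returns the (event, tag) pairs the parser could report so far.
    The parser is never closed, so a missing root end tag is not an error here.
    """
    parser = XMLPullParser(events=("start", "end"))
    seen = []
    for chunk in chunks:
        parser.feed(chunk)
        for event, elem in parser.read_events():
            seen.append((event, elem.tag))
    return seen

# Hypothetical input: the document stops in the middle of "</entry>".
events = parse_prefix(["<log><entry>a</entry>", "<entry>b</ent"])
```

The first `entry` is reported as finished even though `log` never closes. What you don't get is any guarantee that the rest of the document would have been well-formed.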
This flaw isn't unique to XML. All text encoded file formats share this characteristic, and can only be parsed linearly. If you move across to the world of binary file formats, it's extremely common for them to have indexes of offsets so a parser can navigate a tree-like structure in tree order without having to fully parse it, along with other types of non-linear data structures.
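As a rough sketch of what such an offset index can look like, here is a toy layout (invented for this example, not any real format): length-prefixed records, then a table of record offsets, then a fixed-size footer pointing at the table. The names `write_indexed` and `read_record` are hypothetical.

```python
import io
import struct

FOOTER = "<QI"  # index start offset (8 bytes) + record count (4 bytes)

def write_indexed(stream, records):
    """Write length-prefixed records, then an offset table, then a footer."""
    offsets = []
    for payload in records:
        offsets.append(stream.tell())
        stream.write(struct.pack("<I", len(payload)))
        stream.write(payload)
    index_start = stream.tell()
    for offset in offsets:
        stream.write(struct.pack("<Q", offset))
    stream.write(struct.pack(FOOTER, index_start, len(offsets)))

def read_record(stream, i):
    """Fetch record i by seeking, without reading the other records."""
    footer_size = struct.calcsize(FOOTER)
    stream.seek(-footer_size, io.SEEK_END)
    index_start, count = struct.unpack(FOOTER, stream.read(footer_size))
    if not 0 <= i < count:
        raise IndexError(i)
    stream.seek(index_start + 8 * i)
    (offset,) = struct.unpack("<Q", stream.read(8))
    stream.seek(offset)
    (length,) = struct.unpack("<I", stream.read(4))
    return stream.read(length)

buf = io.BytesIO()
write_indexed(buf, [b"alpha", b"beta", b"gamma"])
third = read_record(buf, 2)  # b"gamma"
```

Note that the reader has to seek to the end of the file first, so this only works on a seekable stream.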
> All text encoded file formats share this characteristic, and can only be parsed linearly. If you move across to the world of binary file formats, it's extremely common for them to have indexes of offsets so a parser can navigate a tree-like structure in tree order without having to fully parse it, along with other types of non-linear data structures.
You don't always have the luxury of randomly accessing a file (obvious example: shell pipelines with a producer and consumer exchanging lots of temporary data), so taking advantage of indexing might require saving a temporary file and stalling processing until the file is ready.
Personally, I'm used to parsing large XML files with event-based APIs and throwing away data aggressively, keeping in memory only one unfinished element of interest, the stack of its ancestors, and my collected data (instead of a DOM for the whole document).
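Roughly what I mean, sketched with Python's `xml.sax` (the `EntryCollector` class, the `collect` helper and the element names are invented for the example): no tree is built, and the only state is the ancestor stack, the text of the one open element of interest, and the collected results.

```python
import io
import xml.sax

class EntryCollector(xml.sax.ContentHandler):
    """Collects the text of every `target` element, along with its path."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.stack = []           # tags of the currently open elements
        self.target_depth = None  # stack depth of the open target, if any
        self.buffer = []          # text pieces of the open target
        self.results = []         # (path, text) pairs collected so far

    def startElement(self, name, attrs):
        self.stack.append(name)
        if name == self.target and self.target_depth is None:
            self.target_depth = len(self.stack)
            self.buffer = []

    def characters(self, content):
        # Text outside the element of interest is thrown away immediately.
        if self.target_depth is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if self.target_depth == len(self.stack):
            self.results.append(("/".join(self.stack), "".join(self.buffer)))
            self.target_depth = None
            self.buffer = []
        self.stack.pop()

def collect(source, target):
    handler = EntryCollector(target)
    xml.sax.parse(source, handler)
    return handler.results

doc = b"<log><day><entry>a</entry><other>x</other><entry>b</entry></day></log>"
found = collect(io.BytesIO(doc), "entry")
# found == [("log/day/entry", "a"), ("log/day/entry", "b")]
```

Memory use depends on the depth of the document and the size of one element of interest, not on the size of the file.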
Well, I see the awkwardness of emitting a log like that, but it's not any worse than emitting a single JSON array and waiting for that closing ']'.
Both can be emitted and parsed in a streaming fashion, but I wouldn't say that either XML or JSON is suitable for logs. Maybe NDJSON, but it's more like a hack around this limitation.
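For comparison, the NDJSON approach is about this much code in Python (a sketch; `write_ndjson` and `read_ndjson` are names I made up). Each record is one complete JSON value on its own line, so there is no enclosing `[` ... `]` to wait for.

```python
import io
import json

def write_ndjson(stream, record):
    """Append one record as a single line of JSON."""
    # json.dumps without `indent` never emits a raw newline, so one record = one line.
    stream.write(json.dumps(record) + "\n")

def read_ndjson(lines):
    """Yield one parsed record per non-empty line, as soon as that line is read."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

log = io.StringIO()
write_ndjson(log, {"level": "info", "msg": "started"})
write_ndjson(log, {"level": "error", "msg": "line1\nline2"})
log.seek(0)
records = list(read_ndjson(log))
```

A consumer at the other end of a pipe can handle each record as soon as its line arrives, which is the property a single JSON array or a single XML root doesn't give you for free.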