How does this compare to protobuf, thrift, msgpack, etc.?
It’s roughly the same vintage as protobuf and thrift, from Google and Facebook respectively, so perhaps it’s just Amazon’s equivalent, which they never released as quickly as the others did?
Obvious pros and cons, or yet another serialization format with no obvious benefits over anything else?
Just from reading their page and being familiar with the formats you mentioned:
vs. protobuf: ion is self describing, vs needing a schema
vs. thrift: similar, thrift needs a schema to interpret a binary file
Both thrift and protobuf are really binary formats: they have a canonical textual representation, but it's not what is normally used to serialize. It sounds like ion supports serializing as text as a first-class concept.
vs. msgpack: ion has a corresponding text format, whereas msgpack is only binary. Additionally, ion has a symbol type, msgpack doesn't.
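To make the "self describing vs. needing a schema" distinction concrete, here is a minimal Python sketch. The tagged encoding is a toy I made up for illustration; it is not Ion's (or protobuf's, or thrift's) actual wire format, and the `encode`/`decode` names are hypothetical.

```python
import struct

# Schema-dependent: the bytes carry no field names or types.
# The reader must already know the layout is "int32 then float64".
schema_bytes = struct.pack("<id", 42, 9.99)
right = struct.unpack("<id", schema_bytes)   # correct schema
wrong = struct.unpack("<iii", schema_bytes)  # wrong schema, still "decodes"


def encode(values):
    """Toy self-describing encoding: one type-tag byte before each value."""
    out = bytearray()
    for v in values:
        if isinstance(v, int):
            out += b"i" + struct.pack("<q", v)
        elif isinstance(v, float):
            out += b"f" + struct.pack("<d", v)
        elif isinstance(v, str):
            data = v.encode("utf-8")
            out += b"s" + struct.pack("<I", len(data)) + data
        else:
            raise TypeError(f"unsupported type: {type(v).__name__}")
    return bytes(out)


def decode(buf):
    """Recover values using only the tags found in the data itself."""
    values, pos = [], 0
    while pos < len(buf):
        tag, pos = buf[pos:pos + 1], pos + 1
        if tag == b"i":
            values.append(struct.unpack_from("<q", buf, pos)[0])
            pos += 8
        elif tag == b"f":
            values.append(struct.unpack_from("<d", buf, pos)[0])
            pos += 8
        elif tag == b"s":
            (length,) = struct.unpack_from("<I", buf, pos)
            pos += 4
            values.append(buf[pos:pos + length].decode("utf-8"))
            pos += length
        else:
            raise ValueError(f"unknown tag: {tag!r}")
    return values


roundtrip = decode(encode([42, 9.99, "ion"]))
```

The trade-off is visible in the sizes: the tagged form spends extra bytes on every value so that a reader with no schema can still interpret the data.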
I think the biggest benefit here is that it's a new chance for a format that fixes some of json's rough edges to gain critical mass. There's probably nothing ultra special about it that hasn't been solved in other formats, but maybe the timing will be right and everyone will just adopt it as a json replacement (sort of how people just gave up on xml and switched to json seemingly overnight). It's impossible to predict stuff like that.
Edit: upon noticing that it was released in 2016, it seems less likely everyone will jump on the ion bandwagon ...
If I'm not mistaken, there were plenty of text protobuf files used internally for a lot of things, and much, much less of anything else (okay, xml was prevalent on our team, maybe because we were java-inclined). I've even seen examples of text protos pushed through the command line (it's possible, but you need to get it right).
There are some pain points that are being addressed:
1) timestamp : I have had issues round-tripping timestamp representations quite a bit
2) decimal : currency is denoted in decimal rather than float, which shows the Amazon retail heritage. This is very useful.
3) symbols : I've had cases where a symbol table/dictionary would have made a big difference in serialized size
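For the symbols point, a rough Python sketch of why a symbol table helps with size. This is a toy dictionary encoding layered on JSON, not Ion's actual symbol table mechanism; `with_symbol_table`, `restore`, and the field names are all made up for illustration.

```python
import json


def with_symbol_table(records):
    """Replace repeated field names with small integer ids."""
    symbols, index, rows = [], {}, []
    for rec in records:
        row = []
        for key, value in rec.items():
            if key not in index:
                index[key] = len(symbols)
                symbols.append(key)
            row.append([index[key], value])
        rows.append(row)
    return {"symbols": symbols, "rows": rows}


def restore(packed):
    """Rebuild the original records from the symbol table and rows."""
    symbols = packed["symbols"]
    return [{symbols[i]: v for i, v in row} for row in packed["rows"]]


records = [
    {"customer_identifier": n, "shipping_address_line": "x"}
    for n in range(1000)
]

compact = (",", ":")  # no whitespace, so the comparison is fair
plain = json.dumps(records, separators=compact)
packed = json.dumps(with_symbol_table(records), separators=compact)

print(len(plain), len(packed))
```

Each long field name is stored once instead of once per record, so the saving grows with the number of records and the length of the names.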
Re timestamp and decimal: it's probably no surprise that Ion is used heavily by QLDB, where having a very clear time for a change is important and a common use case is logging debits and credits as a financial ledger.
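A quick Python illustration of why decimal matters for a ledger. This uses only the standard library `decimal` module, not Ion itself, and the amounts are made up.

```python
from decimal import Decimal

# Binary floats cannot represent 0.10 exactly, so sums drift.
float_total = sum([0.10] * 3)

# Decimals keep the exact value, including the trailing zero.
decimal_total = sum([Decimal("0.10")] * 3, Decimal("0"))

print(float_total)    # not exactly 0.3
print(decimal_total)  # 0.30

# A ledger check that debits equal credits is only reliable
# when the amounts are exact.
debits = [Decimal("19.99"), Decimal("5.01")]
credits = [Decimal("25.00")]
balanced = sum(debits, Decimal("0")) == sum(credits, Decimal("0"))
```

With floats, the same balance check can fail on amounts that a human would consider obviously equal.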
I don't know. It was common knowledge for me in college (as in it was taught as part of the curriculum) but as far as I can tell in the intervening 30+ years that knowledge seems to have been lost and relearned many times over.
Cash values should be represented in fixed precision to maintain the integrity of the transaction and your book, while the prices for securities represent something different.
In securities transactions, the quantity and quote are critical. You aren’t buying securities from Plaid, right?
If you try to liquidate or resize based on the Plaid quote, your brokerage or counterparty is going to provide a totally different quote, and one from a system engineered to provide quotes aligned exactly to the market standards.
It seems much more directly comparable with CBOR/JSON, as they mention it a lot: https://amzn.github.io/ion-docs/guides/why.html#dual-format-... . I use CBOR quite a bit. It sounds like the binary form doesn't really offer much that's different; the difference is that the textual form maintains better types than JSON, and the textual version matches the binary version (whereas JSON and CBOR are mismatched in terms of types). So it seems nicer as a cohesive textual/binary format. I'd be interested in seeing how well packed the data is in Ion vs CBOR.
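To show what "better types than JSON" means in practice, here is a small Python sketch using only the standard library. The `to_json_friendly` helper and the sample values are hypothetical; the point is just that JSON text has no native timestamp, decimal, or binary type, so each one has to be smuggled through as a string.

```python
import json
from datetime import datetime, timedelta, timezone
from decimal import Decimal

original = {
    "when": datetime(2020, 1, 2, 3, 4, 5,
                     tzinfo=timezone(timedelta(hours=-8))),
    "amount": Decimal("19.90"),
    "blob": b"\x00\x01",
}


def to_json_friendly(value):
    """Ad hoc conventions; the reader has to know them out of band."""
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, Decimal):
        return str(value)
    if isinstance(value, bytes):
        return value.hex()
    raise TypeError(f"unsupported type: {type(value).__name__}")


text = json.dumps(original, default=to_json_friendly)
restored = json.loads(text)

# Every value comes back as a plain string; the type information is gone.
for key, value in restored.items():
    print(key, type(value).__name__, value)
```

A format whose text and binary forms share one type system avoids this step, since the reader does not need a side agreement about which strings are really timestamps or decimals.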