I went to a Clojure meetup one time and they all went on about how using Datomic in production is a nightmare and it's generally an over-engineered product that isn't worth the trouble in the end. Do most people who have dealt with Datomic in production feel this way?
Yes, and that's exactly why Nubank acquired Cognitect. They are too deep into the tech to migrate to something else, cheaper to just buy the authors.
So a company with deep technical debt, serious scaling issues, and bugs everywhere (Datomic/Nubank) and a burnout company (Datomic/Cognitect) get together. Makes sense.
Burnout because their "Datomic Cloud" product didn't work out. It was just a horribly complex AWS CloudFormation template that forced you to click through tens of AWS web pages. It was more complex to manage and to develop for than on-premise, but you still had all the same issues and bugs.
Nubank got into Datomic not because of Clojure, but the other way around: they got into Clojure because of Datomic. If you watch their videos, the reason they picked Datomic was that they thought it had "time travel", which is quite different from having a "history" of transactions that is used mostly for auditing and troubleshooting, not for real time-travel queries.
In the end, I guess things did work out for Cognitect, and Hickey is now laughing all the way to the bank.
I have been following Datomic for a year because of a system I inherited.
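To make the "history" vs. "time travel" distinction above concrete, here is a toy sketch in Python. It is not Datomic's API or storage model; `ToyLog`, `Datom`, and the `account/balance` attribute are made up for illustration, and it assumes every attribute holds a single value. `history` just lists what was asserted and retracted (the auditing use), while `as_of` rebuilds the state at an earlier transaction (the time-travel query).

```python
from typing import Any, NamedTuple


class Datom(NamedTuple):
    entity: int
    attr: str
    value: Any
    tx: int
    added: bool  # True = assertion, False = retraction


class ToyLog:
    """Hypothetical append-only log of facts; not how Datomic is implemented."""

    def __init__(self):
        self.datoms = []
        self.next_tx = 1

    def transact(self, facts):
        """Append (entity, attr, value, added) tuples under one new tx id."""
        tx = self.next_tx
        self.next_tx += 1
        for entity, attr, value, added in facts:
            self.datoms.append(Datom(entity, attr, value, tx, added))
        return tx

    def history(self, entity, attr):
        """Auditing view: every assertion and retraction ever recorded."""
        return [d for d in self.datoms if d.entity == entity and d.attr == attr]

    def as_of(self, tx):
        """Time-travel view: the state as it was right after transaction tx."""
        current = {}
        for d in self.datoms:  # datoms are appended in tx order
            if d.tx > tx:
                break
            key = (d.entity, d.attr)
            if d.added:
                current[key] = d.value
            elif current.get(key) == d.value:
                del current[key]
        return current


log = ToyLog()
t1 = log.transact([(1, "account/balance", 100, True)])
t2 = log.transact([
    (1, "account/balance", 100, False),  # retract the old value
    (1, "account/balance", 250, True),   # assert the new one
])

print(log.history(1, "account/balance"))  # three datoms: the audit trail
print(log.as_of(t1))                      # state before the update
print(log.as_of(t2))                      # state after the update
```

In this sketch the audit trail is a plain filter over the log, while `as_of` has to replay it, which hints at why the two features are not interchangeable in practice.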
This seems like an excessively uncharitable read of the situation. I've never used Nubank's software, but I have used (on-prem) Datomic and I certainly wouldn't say it has bugs everywhere. In fact, in my (admittedly low-volume and simple) usage of the system I haven't come across any bugs I can remember. Calling Cognitect a "burnout" company is inaccurate and rude.
I agree with you that the Datomic cloud stuff comes across as being frighteningly complex. I think they probably just need to work on the documentation, like making it more obvious what the differences and tradeoffs are between the deployment scenarios.
Did you inherit a Datomic system that was previously developed by a small team or a small company? Because inheriting a system that's hard to understand and change transcends languages and databases. It is the tie that binds us all as software developers.
Fair enough, hitting that bug would have pissed me off too.
On your last point, I agree that it still has a way to go. It's good for some (many?) production use cases now, as Nubank's success demonstrates, and hopefully with Nubank's resources it'll start to live up more to its promise.
Anecdotally I know of one company which is also in the same boat and generally regrets their usage of Datomic and is trying to move away from it last I talked with them. However, there's also people on HN like dustingetz who have had a great time with Datomic and use it as a core component of their product.
I just wish Cognitect would allow people to run public benchmarks of Datomic to make it easier to evaluate its tradeoffs.
What the company ran into? Unfortunately not :/. It was a quick chat in an informal setting with their VP of engineering (I think?) that really was just a "huh, interesting" moment for me (although I've coded in Clojure for a full-time job before, I have essentially no personal experience with Datomic).
As for the positive side, I think dustingetz monitors Clojure and Datomic threads pretty closely so maybe they can chime in here.
> The Licensee hereby agrees, without the prior written consent of Cognitect, which may be withheld or conditioned at Cognitect’s sole discretion, it will not... publicly display or communicate the results of internal performance testing or other benchmarking or performance evaluation of the Software
That's just vile. Is there any /good/ defense of this kind of agreement other than a 'think of the children' argument that people might make a mistake in their performance reviews?
That article only lists MS and Oracle though. Apart from IBM, I don't think CockroachDB Enterprise has such a prohibition, nor does Google Spanner (I think?), nor does Amazon Aurora (again I think?). And of course all the open source competitors don't have this clause.
Basically my impression is that DeWitt clauses are common enough to be well-known, but still in the distinct minority. That's just an impression though.
Never had any strict trouble with it. Maybe it's just that I've used it for a long time but I enjoy the simplicity of using it.
My biggest complaint is performance for certain use-cases. Say if you're trying to pull a lot of attributes on hundreds of thousands of datoms it's going to be rather slow (even though it's supposed to be in-memory already). But again for these kinds of use-cases I'd probably go with a completely different kind of a database either way.
The story around deletions/excisions isn't that great either. Honestly, the whole log/history aspect of Datomic sounds nice, but I never really used it other than for reverting stupid mistakes.
The #1 thing I love is the freedom of querying you get with Datomic. You insert your data in a way that makes sense for your data, and querying is pretty much a completely separate concern. For the most part you don't need to structure your schema around the querying capabilities of your database which I love. Say back in the day I liked Mongo because you could just insert whatever you wanted [0] but eventually you'd hit problems where you couldn't easily query your data (maybe it has changed over the years, no idea).
And the syntax is just a pleasure to work with. I'd love a version of Datomic that kept the same interface but dropped some of the more esoteric features in favor of performance.
Also I noticed some of the people reporting issues used the cloud version. Never used that so can't speak to that. On-prem is free and has all the features. As long as you don't redistribute it there's no problem.
[0] Yes, in Datomic you do have to have a schema. But it's pretty much a simple global list of possible attributes. If you need to add something later or make a change, it's pretty straightforward.
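A rough illustration of the two points above (schema as a flat list of attributes, and querying as a separate concern from how the data was inserted), written as a toy in Python rather than Datomic's own query language. Everything here is hypothetical: `SCHEMA`, `add_fact`, `match`, `big_spenders`, and the attribute names are made up for the sketch, and real Datomic schema attributes carry more than a type.

```python
# Toy schema: a flat, global mapping of attribute name -> value type.
SCHEMA = {
    "person/name": str,
    "order/buyer": int,   # entity id of the buyer
    "order/total": int,
}


def add_fact(db, entity, attr, value):
    """Store one (entity, attribute, value) fact after checking the schema."""
    if attr not in SCHEMA:
        raise KeyError(f"unknown attribute: {attr}")
    if not isinstance(value, SCHEMA[attr]):
        raise TypeError(f"{attr} expects {SCHEMA[attr].__name__}")
    db.add((entity, attr, value))


def match(db, e=None, a=None, v=None):
    """Return facts matching a pattern; None acts as a wildcard."""
    return [
        (fe, fa, fv)
        for (fe, fa, fv) in db
        if (e is None or fe == e)
        and (a is None or fa == a)
        and (v is None or fv == v)
    ]


def big_spenders(db, threshold):
    """Names of buyers with at least one order above threshold (a small join)."""
    names = set()
    for order, _, total in match(db, a="order/total"):
        if total > threshold:
            for _, _, buyer in match(db, e=order, a="order/buyer"):
                for _, _, name in match(db, e=buyer, a="person/name"):
                    names.add(name)
    return sorted(names)


db = set()
add_fact(db, 1, "person/name", "Ana")
add_fact(db, 2, "person/name", "Bo")
add_fact(db, 10, "order/buyer", 1)
add_fact(db, 10, "order/total", 150)
add_fact(db, 11, "order/buyer", 2)
add_fact(db, 11, "order/total", 40)

print(big_spenders(db, 100))

# Adding an attribute later is just one more entry in the global list.
SCHEMA["person/email"] = str
add_fact(db, 1, "person/email", "ana@example.com")
```

Nothing about how the facts were inserted anticipates the `big_spenders` question; the query is assembled afterwards from attribute patterns, which is the kind of freedom described above.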
FYI, Datalevin has faster queries than Datascript, because Datalevin has given up the "database as a value" doctrine that both Datomic and Datascript share, so Datalevin can cache aggressively to achieve better performance.
Datomic's learning curve is relatively steep, like that of many higher-level and more abstract things in the Clojure ecosystem in general, and you should know how to cook it for sure.
After figuring out all the whys and hows, it works like a charm.
However, I do find the Datomic Cloud version unnecessarily complex for most applications. It is probably still a good corporate sales product for Cognitect. The Datomic On-premise version is much friendlier for small, medium, and somewhat larger use cases. The Cloud version is also an AWS thing, so it locks you in there, which is also not good.
I have heard multiple times that it's rather slow, but haven't seen any benchmarks. It would make sense: as a dynamically typed, garbage-collected language, Clojure is not the greatest fit to implement a database in.
The question is, are the things you gain worth it?
I would be surprised if Datomic's core code was written in Clojure rather than Java (and these days Java's performance can get you pretty far in implementing a database, see e.g. Cassandra).
Most highly performance-sensitive code in the Clojure ecosystem is a Clojure wrapper around a Java core.
But yes as I said elsewhere, it would be great if Cognitect allowed people to post benchmark results.
I've used Cassandra, and it's not that impressive. Much slower than the C++ rewrite (ScyllaDB?), latency issues due to GC, and it can't hold a candle to ClickHouse. And they've been optimizing it for a long time now.
Cassandra and ClickHouse are designed to do different things. To flip things around, have you compared the latency of a single-row update or delete in Cassandra vs ClickHouse?
If you care about the latency of a single-row update or delete, ClickHouse is definitely the wrong tool for the job. First, it doesn't really have deletes (afaik). Second, you need to batch updates aggressively to get good throughput.
But you're right, C* and CH are designed to do different things. I just found the difference in general performance across everything (startup, schema changes, throughput, query performance, optimization opportunities) to be quite pronounced. One feels like a race car, the other not so much.
Idiomatic Clojure is slower than JS, but you can get Clojure somewhat close to Java by writing Java with parentheses (lots of interop from Clojure). One of the devs of Datomic bragged about how it was only 200 KLOC of Clojure, but if you extract the Datomic tar, the lib dir probably has more than 2 million LOC of open source Java libs.