I went to a Clojure meetup one time and they all went on about how using Datomic...

Scarbutt · on July 24, 2020

Yes, and that's exactly why Nubank acquired Cognitect. They are too deep into the tech to migrate to something else; it was cheaper to just buy the authors.

So you have a company with deep technical debt, serious scaling issues, and bugs everywhere (Datomic/Nubank) and a burnout company (Datomic/Cognitect) getting together. Makes sense.

Burnout because their "Datomic Cloud" product didn't work out; it was just a horribly complex AWS CloudFormation template that forced you to click through tens of AWS web pages. It was more complex to manage and to develop for than on-premise, but you still had all the same issues and bugs.

Nubank got into Datomic not because of Clojure, but the other way around: they got into Clojure because of Datomic. If you watch their videos, the reason they picked Datomic was that they thought it had "time travel", which is quite different from having a "history" of transactions, used mostly for auditing and troubleshooting, not for real time-travel queries.

In the end, I guess things did work out for Cognitect, and Hickey is now laughing all the way to the bank.

I have been following Datomic for a year because of a system I inherited.

tomconnors · on July 24, 2020

This seems like an excessively uncharitable read of the situation. I've never used Nubank's software, but I have used (on-prem) Datomic and I certainly wouldn't say it has bugs everywhere. In fact, in my (admittedly low-volume and simple) usage of the system I haven't come across any bugs I can remember. Calling Cognitect a "burnout" company is inaccurate and rude.

I agree with you that the Datomic cloud stuff comes across as being frighteningly complex. I think they probably just need to work on the documentation, like making it more obvious what the differences and tradeoffs are between the deployment scenarios.

Did you inherit a Datomic system that was previously developed by a small team or a small company? Because inheriting a system that's hard to understand and change transcends languages and databases. It is the tie that binds us all as software developers.

Scarbutt · on July 24, 2020

I hit this bug: https://docs.datomic.com/on-prem/changes.html#0.9.6021

Is not being able to perform writes to your database not scary enough? It's funny how they phrased that bug.

I also hit 4-5 more that are there in the change log, less serious but still pretty bad and frustrating.

This was an internal application, the DB was not being stressed, and it was a readable 4 KLoC Clojure codebase.

Don't get me wrong, I really like Datomic and its features, but the implementation still has a long way to go.

tomconnors · on July 24, 2020

Fair enough, hitting that bug would have pissed me off too.

On your last point, I agree that it still has a way to go. It's good for some (many?) production use cases now, as Nubank's success demonstrates, and hopefully with Nubank's resources it'll start to live up more to its promise.

dwohnitmok · on July 24, 2020

Anecdotally, I know of one company which is in the same boat, generally regrets its usage of Datomic, and was trying to move away from it last I talked with them. However, there are also people on HN like dustingetz who have had a great time with Datomic and use it as a core component of their product.

I just wish Cognitect would allow people to run public benchmarks of Datomic to make it easier to evaluate its tradeoffs.

lukashrb · on July 24, 2020

Do you have more detailed info on this? I guess this could really help with making a decision and understanding the tradeoffs of using Datomic.

dwohnitmok · on July 24, 2020

What the company ran into? Unfortunately not :/. It was a quick chat in an informal setting with their VP of engineering (I think?) that really was just a "huh, interesting" moment for me (although I've coded in Clojure for a full-time job before, I have essentially no personal experience with Datomic).

As for the positive side, I think dustingetz monitors Clojure and Datomic threads pretty closely so maybe they can chime in here.

zeroDivisible · on July 24, 2020

What is the policy of Cognitect re: public benchmarks? I did not know that.

dwohnitmok · on July 24, 2020

> The Licensee hereby agrees, without the prior written consent of Cognitect, which may be withheld or conditioned at Cognitect’s sole discretion, it will not... publicly display or communicate the results of internal performance testing or other benchmarking or performance evaluation of the Software

From the Datomic EULA here: https://www.datomic.com/on-prem-eula.html

mercer · on July 24, 2020

That's just vile. Is there any /good/ defense of this kind of agreement other than a 'think of the children' argument that people might make a mistake in their performance reviews?

fiddlerwoaroof · on July 24, 2020

It's annoying, but it's pretty standard in commercial databases: if your competitors refuse to allow public benchmarks, all it can do is hurt you.

dwohnitmok · on July 25, 2020

How standard is it? As far as I know, among databases MS SQL and Oracle do this, but do other commercial databases do this as well?

fiddlerwoaroof · on July 25, 2020

https://danluu.com/anon-benchmark/

It’s common enough to have a name: “DeWitt clause”. It sounds like IBM is the only major commercial RDBMS vendor to allow benchmarks?

dwohnitmok · on July 25, 2020

That article only lists MS and Oracle though. Apart from IBM, I don't think CockroachDB Enterprise has such a prohibition, nor does Google Spanner (I think?), nor does Amazon Aurora (again I think?). And of course all the open source competitors don't have this clause.

Basically my impression is that DeWitt clauses are common enough to be well-known, but still in the distinct minority. That's just an impression though.

auganov · on July 24, 2020

Never had any strict trouble with it. Maybe it's just that I've used it for a long time but I enjoy the simplicity of using it.

My biggest complaint is performance for certain use-cases. Say if you're trying to pull a lot of attributes on hundreds of thousands of datoms it's going to be rather slow (even though it's supposed to be in-memory already). But again for these kinds of use-cases I'd probably go with a completely different kind of a database either way.

The story around deletions/excisions isn't that great either. Honestly, the whole log/history aspect of Datomic sounds nice, but I never really used it other than for reverting stupid mistakes.

The #1 thing I love is the freedom of querying you get with Datomic. You insert your data in a way that makes sense for your data, and querying is pretty much a completely separate concern. For the most part you don't need to structure your schema around the querying capabilities of your database which I love. Say back in the day I liked Mongo because you could just insert whatever you wanted [0] but eventually you'd hit problems where you couldn't easily query your data (maybe it has changed over the years, no idea).

And the syntax is just a pleasure to work with. I'd love a version of Datomic that kept the same interface but dropped some of the more esoteric features in favor of performance.

Also I noticed some of the people reporting issues used the cloud version. Never used that so can't speak to that. On-prem is free and has all the features. As long as you don't redistribute it there's no problem.

[0] Yes, in Datomic you do have to have a schema. But it's pretty much a simple global list of possible attributes. If you need to add something later or make a change, it's pretty straightforward.
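To make the "insert the data as it is, query it however you like" point concrete, here is a toy sketch in Python. It is not Datomic's API or storage format: the attribute names, the `add_fact` and `query` helpers, and the sample data are all made up for illustration. The assumption is simply that facts are (entity, attribute, value) tuples, the schema is a flat set of allowed attributes, and a query is a list of patterns joined on shared `?variables`.

```python
# Toy model only: NOT Datomic's API. All names and data here are hypothetical.

SCHEMA = {"person/name", "person/city", "order/buyer", "order/total"}
FACTS = []  # list of (entity, attribute, value) tuples


def add_fact(entity, attribute, value):
    """Store one fact; the only schema check is that the attribute is known."""
    if attribute not in SCHEMA:
        raise ValueError(f"unknown attribute: {attribute}")
    FACTS.append((entity, attribute, value))


def is_var(term):
    """Query variables are strings starting with '?'."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, fact, binding):
    """Unify one pattern with one fact; return the extended binding or None."""
    extended = dict(binding)
    for term, value in zip(pattern, fact):
        if is_var(term):
            if term in extended and extended[term] != value:
                return None
            extended[term] = value
        elif term != value:
            return None
    return extended


def query(patterns):
    """Join all patterns by brute force; fine for a toy, slow for real data."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for fact in FACTS:
                extended = match(pattern, fact, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


for fact in [
    (1, "person/name", "Ana"), (1, "person/city", "Lisbon"),
    (2, "person/name", "Bo"), (2, "person/city", "Oslo"),
    (10, "order/buyer", 1), (10, "order/total", 40),
    (11, "order/buyer", 2), (11, "order/total", 15),
    (12, "order/buyer", 1), (12, "order/total", 5),
]:
    add_fact(*fact)

# Names of people with at least one order over 10, joined across entities.
rows = query([
    ("?o", "order/total", "?t"),
    ("?o", "order/buyer", "?p"),
    ("?p", "person/name", "?n"),
])
big_spenders = sorted({row["?n"] for row in rows if row["?t"] > 10})

# Adding an attribute later is just growing the set; old facts are untouched.
SCHEMA.add("person/email")
add_fact(1, "person/email", "ana@example.com")
```

Nothing about how the facts were inserted anticipated the join in `rows`, which is the property described above. A real engine would use indexes rather than scanning every fact per pattern.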

huahaiy · on July 25, 2020

Datalevin may have what you like: https://github.com/juji-io/datalevin. It has no history, no "database as a value", etc., and focuses on performance instead.

auganov · on July 25, 2020

Thanks, will definitely check it out later. Though I did play around with Datascript before and found it to suffer from similar performance issues.

huahaiy · on July 26, 2020

FYI, Datalevin has faster queries than Datascript, because Datalevin has given up the "database as a value" doctrine that both Datomic and Datascript share, so it can cache aggressively to achieve better performance.

invisiblerobot · on July 26, 2020

Also look into open crux

juskrey · on July 24, 2020

Datomic's learning curve is relatively steep, like many higher-level and more abstract things in the Clojure ecosystem in general, and you really do need to know how to cook it. After figuring out all the whys and hows, it works like a charm.

However, I do indeed find the Datomic Cloud version unnecessarily complex for most applications. It is probably still a good corporate sales product for Cognitect. The Datomic On-premise version is much friendlier for small-medium-somewhat-larger use cases. The Cloud version is also an AWS thing, so it locks you in there, which is also not good.

MrBuddyCasino · on July 24, 2020

I have heard multiple times that it’s rather slow, but haven’t seen any benchmarks. That would make sense: as a dynamically typed, garbage-collected language, Clojure is not the greatest fit to implement a database in. The question is, are the things you gain worth it?

dwohnitmok · on July 24, 2020

I would be surprised if Datomic's core code was written in Clojure rather than Java (and these days Java's performance can get you pretty far in implementing a database, see e.g. Cassandra).

Most highly performance-sensitive code in the Clojure ecosystem is a Clojure wrapper around a Java core.

But yes, as I said elsewhere, it would be great if Cognitect allowed people to post benchmark results.

MrBuddyCasino · on July 24, 2020

I’ve used Cassandra; it’s not that impressive. Much slower than the C++ rewrite (ScyllaDB?), latency issues due to GC, can’t hold a candle to ClickHouse. And they’ve been optimizing it for a long time now.

peferron · on July 24, 2020

Cassandra and ClickHouse are designed to do different things. To flip things around, have you compared the latency of a single-row update or delete in Cassandra vs ClickHouse?

hodgesrm · on July 24, 2020

Or the fact that Cassandra uses consistent hashing to distribute data automatically across hosts.

My company supports ClickHouse, but there are many use cases where it's simply not the right solution.
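For anyone unfamiliar with the consistent hashing mentioned above, here is a minimal Python sketch of the general technique. It is a toy ring, not Cassandra's actual partitioner or token allocation: the `HashRing` class, the `token` helper, the MD5-based hash, and the virtual-node count are all assumptions made for illustration.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a position on the ring (MD5 chosen only for stability)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (hypothetical, for illustration)."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (token, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each host claims several positions so load spreads more evenly.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (token(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(t, n) for t, n in self._ring if n != node]

    def owner(self, key):
        """Return the node at the first ring position clockwise from the key."""
        if not self._ring:
            raise ValueError("ring is empty")
        idx = bisect.bisect_right(self._ring, (token(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["host-a", "host-b", "host-c"])
keys = [f"user-{i}" for i in range(1000)]
before = {k: ring.owner(k) for k in keys}

# Adding a host only moves keys onto the new host; nothing shuffles
# between the existing hosts.
ring.add_node("host-d")
after = {k: ring.owner(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

The point of the ring is that membership changes relocate only the keys adjacent to the new or removed host's positions, which is what lets data be redistributed automatically without rehashing everything.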

MrBuddyCasino · on July 25, 2020

If you care about the latency of a single row update or delete, ClickHouse is definitely the wrong tool for the job. First, it doesn’t really have deletes (afaik). Second, you need to batch updates aggressively to get good throughput.

But you’re right, C* and CH are designed to do different things. I just found the difference in general performance across everything (startup, schema changes, throughput, query performance, optimization opportunities) to be quite pronounced. One feels like a race car, the other not so much.

Scarbutt · on July 24, 2020

Idiomatic Clojure is slower than JS, but you can make Clojure somewhat close to Java by writing Java with parentheses (lots of interop from Clojure). One of the Datomic devs bragged about how it was only 200 KLOC of Clojure, but if you extract the Datomic tar, the lib dir probably has more than 2 million LOC of open source Java libs.

dragonne · on July 24, 2020

I wouldn't call it over-engineered, but it certainly is an operational disaster. It's slow, memory hungry, and full of catastrophic bugs.

We are currently replacing it with PostgreSQL to improve performance and scalability.

nojito · on July 24, 2020

It's ridiculously expensive, and many large-scale deployments have consulting arrangements for the very reason you shared.