The author makes no effort to explain why AI *isn't* a commodity as Apple and Amazon say. I was looking forward to that. I think the article is weak for not defending its premise. Everything else is fluff.
I agree - and if the article is correct and Apple and Amazon are the losers, I fail to glean who the winners will be or how their business model will be different.
That's fair, but it wasn't the point of the article because it's messy. Many would argue that core LLMs are 'trending' toward commodity, and I'd agree.
But it's complicated because commodities don't carry brand weight, yet there's obviously a brand power law. I (like most other people) use ChatGPT. But for coding I use Claude and a bit of Gemini, etc. depending on the problem. If they were complete commodities, it wouldn't matter much what I used.
Part of the issue here is that while LLMs may be trending toward commodity, "AI" isn't. As more people use AI, they get locked into their habits, memory (customization), ecosystem, etc. And as AI improves, if everything I do has less and less to do with the hardware and I care more about everything else, then the hardware (e.g. the iPhone) becomes the commodity.
Similar with AWS: if data/workflow/memory/lock-in becomes the moat, I'll want everything where the rest of my infra is.
I think you are conflating the Closure Library with the Closure Compiler. They are related but not identical. The Compiler, I think, is what makes externs a pain; its “advanced optimizations” mode can and often does break libraries that weren’t written with the Compiler’s quirks in mind. But advanced optimizations is only an option; if you don’t need aggressive minification, function-body inlining, etc., you can opt out.
Shadow CLJS has made working with external libraries quite easy and IIRC it lets you set the compilation options for your libraries declaratively.
Yes and yes; in the past, prior to ECMAScript providing first-class inheritance, module exports/imports, etc., the Library supplied methods to achieve these in development, and the Compiler would identify those cases and perform the appropriate prototype chaining, bundling, etc. See, e.g., goog.provide.
For the most part, I would guess people still use the Closure Compiler because of its aggressive minification or for legacy reasons. I think both are probably true for ClojureScript, as well as the fact that the Compiler is Java-based so it has a Java API that (I am guessing here) made it easier to bootstrap on top of the JVM Clojure tooling / prior art.
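For reference, driving the Compiler from its Java API looks roughly like this (a sketch from memory of the com.google.javascript.jscomp classes, so treat names and details as approximate):

    import com.google.javascript.jscomp.CompilationLevel;
    import com.google.javascript.jscomp.Compiler;
    import com.google.javascript.jscomp.CompilerOptions;
    import com.google.javascript.jscomp.SourceFile;
    import java.util.List;

    public class ClosureFromJava {
        public static void main(String[] args) {
            Compiler compiler = new Compiler();
            CompilerOptions options = new CompilerOptions();
            // Opt in to the aggressive minification discussed above; a SIMPLE level
            // is the safer choice if your dependencies don't ship externs.
            CompilationLevel.ADVANCED_OPTIMIZATIONS.setOptionsForCompilationLevel(options);

            // Externs declare symbols the compiler must not rename; empty here.
            List<SourceFile> externs = List.of();
            List<SourceFile> inputs = List.of(SourceFile.fromCode(
                    "input.js",
                    "function greet(name) { return 'hi ' + name; } console.log(greet('hn'));"));

            compiler.compile(externs, inputs, options);
            System.out.println(compiler.toSource());
        }
    }

Having that kind of programmatic entry point on the JVM is presumably part of what made it easy to call from the ClojureScript compiler.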
I've been doing frontend development for over 10 years, and obviously this is anecdotal, but I've never heard of anyone using the Closure Compiler outside of ClojureScript, and I imagine that in practice most people doing frontend development are using Webpack, Vite, Parcel, etc. The idea of really small bundles sounds nice, but in practice the advanced optimizations often require manual tweaking (externs) to get working, which is something few people want to deal with, and the bundle-size improvement isn't worth it compared to standard tools like UglifyJS/Terser.
There may be other reasons, but I assume the main reason the Closure Compiler was chosen for ClojureScript is that it's Java-based, so it was straightforward to get working. Moving away from it now would be a huge breaking change, so it's unlikely to happen in the official compiler anytime soon, or ever. I think the only way it would actually happen is if an alternative like Cherry got enough traction and people moved to using mainly the alternative.
Yeah nowadays I think non-ClojureScript people use it mostly for legacy reasons or the aggressive minification. Back in the day, aside from the pre-ES5 conveniences I mentioned surrounding inheritance and module bundling, it was also a way for developers to do some basic type enforcement (via JSDoc annotations that the Compiler would check). TypeScript essentially rendered that obsolete.
* Express is free and will take you a very long way.
* SSMS is great.
* T-SQL is great.
* Integration with .Net is great.
* It’s cross platform (I’ve only ever done Windows though).
* Windows auth is pretty sweet, no passwords in your configs/repos (rough connection example after this list).
* It Just Works™ for real. You can have multiple instances on the same system, different versions and editions, and never worry about anything. Backup and restore are a breeze. Installation, uninstall, updates and upgrades are a breeze. Everything is a breeze. It’s unbelievable how little you need to worry about MSSQL instances.
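For the Windows auth bullet, the connection ends up looking roughly like this with the Microsoft JDBC driver (a sketch; server/database names are placeholders, and the driver's native auth DLL has to be available on the Windows host):

    import java.sql.Connection;
    import java.sql.DriverManager;

    public class IntegratedAuthExample {
        public static void main(String[] args) throws Exception {
            // integratedSecurity=true authenticates as the current Windows user,
            // so no username/password ever lands in config files or the repo.
            String url = "jdbc:sqlserver://dbhost;databaseName=MyAppDb;"
                       + "integratedSecurity=true;encrypt=true;trustServerCertificate=true";
            try (Connection conn = DriverManager.getConnection(url)) {
                System.out.println("Connected as the current Windows user");
            }
        }
    }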
> It’s cross platform (I’ve only ever done Windows though).
I've tried it on Linux and simply couldn't get it working. The Microsoft package manager repos are out of date or contain buggier versions of the software. I wanted all the other benefits you've listed, but ultimately Postgres has been easier for me.
The tooling, the JIT compiler, having the CLR available in the DB engine, enterprise features like OLAP, failover, cluster management, distributed transactions, packaged DB apps, integration with Active Directory, for starters.
Similar feature offerings to Oracle, DB2, and co.
I'd like to answer for myself (I'm the one that opened the issue and reposted here for some show-and-shame in case MS reconsiders and starts supporting the project):
We have a 10+ year old desktop project (.exe) in C# that uses MS SQL Server as a database, and we need to turn it into a proper web app. We are heavy Django users and have now hit a wall. Unfortunately, because of the complexity of the project, it's not feasible to change the DB.
That is basically never true any more, even in large government and large enterprise.
Microsoft has dialled up the pricing to match Oracle, which means that now everyone has to be so frugal with cores assigned to their DB servers that any software performance benefits are simply lost. Cheaper or open source database engines can be assigned 10x or even 100x the compute capacity at the same cost.
One “trick” Microsoft pulled was to quietly change per-core licensing to per-vCPU (hyper-thread) if you run SQL in the cloud. Since each physical core shows up as two hyper-threads, this means it costs 2x as much as it used to on-prem.
Then they have the nerve to publish marketing about how you can “save money” by migrating to Azure.
In Microsoft Azure the HT-off feature has had a bunch of previews that all quietly disappeared without ever becoming generally available. I'm guessing management noticed that this capability would eat into Microsoft SQL Server (and Windows Server) licensing revenue.
Similarly, I've noticed that all of the managed Azure SQL products lag behind on the latest CPU generations by many years. "You can just scale up at your expense and our profit!" is the response when you read about this in the forums.
In those cases I tell them that I store everything in a file (SQLite) and IT can easily back up that file. If IT needs data access, it's available in the application with CSV/spreadsheet export.
I promise you, they will be super happy with that!
But you are not supposed to tell them that you use another SQL DB; you tell them you use a file, as it simplifies things and saves money. For example, you do not need to expose anything over the network, you do not need to set up a service account and password, and data access is embedded in your application, which improves latency.
And backup is a lot easier: you just create a daily dump from your application that writes to a backup folder and tell IT to back up that folder. People have been saving things to files for decades, and IT shouldn't worry about the data structure in that file.
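The whole "daily dump" can be as small as this (a sketch assuming the org.xerial sqlite-jdbc driver and an SQLite build new enough for VACUUM INTO, 3.27+; paths are made up):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;
    import java.time.LocalDate;

    public class DailyBackup {
        public static void main(String[] args) throws Exception {
            // Date-stamped target in the folder IT already backs up; VACUUM INTO
            // refuses to overwrite, so each day's file must be new.
            String backupPath = "/srv/app/backups/app-" + LocalDate.now() + ".db";
            try (Connection conn = DriverManager.getConnection("jdbc:sqlite:/srv/app/data/app.db");
                 Statement stmt = conn.createStatement()) {
                // VACUUM INTO writes a consistent snapshot even while the app keeps the DB open.
                stmt.execute("VACUUM INTO '" + backupPath + "'");
            }
        }
    }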
This is not a lie, it's about avoiding politics and fights. If they ask you to use MSSQL instead of a file, you politely ask them why they want to overengineer and delay application development.
If you’re on a Mac or iOS you could try creating a Shortcut where you input Markdown, convert to rich text, then output as a PDF. I use Shortcuts regularly. It’s pretty easy to set up. I haven’t tried it on something as large as 500 pages, though. YMMV
Anthropic uses a ton of TPUs in addition to GPUs, so presumably it has the expertise to use both and shift workloads as needed. Note that large-scale TPU use pretty much means Jax, and not just a "platform independent" flavor of Jax but Jax with TPU-specific optimizations.
Anthropic are the only (?) heavy users of Amazon's chips. Or maybe they aren't heavy users. It's hard to say, they use NVIDIA too. Amazon is a big investor.
Amazon's chips at this point are marketing for Amazon. I've seen the benchmarks, they're not quite ready for serious use yet. I suspect Anthropic got a good discount on GPUs in return for using Amazon's own chips in any possible capacity (or maybe just for the press release claiming such use).

The only real alternative to NVIDIA on the inference side that you can actually buy hardware for is Intel Gaudi, which costs less and performs rather well, but everyone seems to have written it off, along with Intel itself, and it's not available in any cloud last I checked.

On the training side there's really no alternative at all - PyTorch is the de-facto standard, and while there is PyTorch XLA, it's even less popular than Jax, which is already like 20x less popular than PyTorch. Bottom line: capable Jax engineers able to optimize distributed Jax programs on TPUs are unobtainable unicorns for anyone but the top labs and Google itself. Note that the training side has significantly different requirements than the inference side. Inference is much simpler to optimize and wring the performance out of.
Yes I've been expecting AMD to eventually get inference working because it's so much simpler. Supposedly Meta do use some AMD for inference. It's sad that you can implement llama inference on the CPU in a few thousand lines of Java yet somehow AMD isn't cleaning up there.
“Decline I” is an instruction for the student to provide the first person pronoun in all cases: I (nominative), me (accusative/dative/ablative), my (genitive), mine (genitive substantive). (I have borrowed the case names from Latin, with which I am more familiar. I think the English cases are nominative, objective, possessive.)
I believe the misspellings in the spelling section are intentional so that the student will identify them—I am guessing that’s the point.
This case is about whose interpretation gets to fill in the gaps.
The statute (APA) requires courts to form an independent judgment about the gaps.
The Chevron doctrine required courts, in certain cases, to set this judgment aside in favor of an agency’s judgment, basically on the grounds that the agencies are closer to the problems and know better.
This setting aside may be the better outcome; however, it is not explicitly specified in the statute (APA).
Ultimately, if Congress wants this to be the case, they /can/ amend the statute (APA), effectively enshrining the Chevron doctrine.
At the end of the day, the court’s decision here rests on statutory interpretation (not constitutional doctrine) so Congress could change the outcome by amending the statute (APA) to explicitly codify Chevron. This would be achieved with its ordinary legislative power (Article 1 Section 7 of the Constitution).
The court’s decision does effectively put the ball back in Congress’ court.
It struck me that Jepsen has identified clear situations leading to invariant violations, but Datomic’s approach seems to have been purely to clarify their documentation. Does this essentially mean the Datomic team accepts that the violations will happen but doesn’t care?
From the article:
> From Datomic’s point of view, the grant workload’s invariant violation is a matter of user error. Transaction functions do not execute atomically in sequence. Checking that a precondition holds in a transaction function is unsafe when some other operation in the transaction could invalidate that precondition!
As Jepsen confirmed, Datomic’s mechanisms for enforcing invariants work as designed. What does this mean practically for users? Consider the following transactional pseudo-data:
[
  [Stu favorite-number 41]
  ;; maybe more stuff
  [Stu favorite-number 42]
]
An operational reading of this data would be that early in the transaction I liked 41, and that later in the transaction I liked 42. Observers after the end of the transaction would hopefully see only that I liked 42, and we would have to worry about the conditions under which observers might see that 41.
This operational reading of intra-transaction semantics is typical of many databases, but it presumes the existence of multiple time points inside a transaction, which Datomic neither has nor wants — we quite like not worrying about what happened “in the middle of” a transaction. All facts in a transaction take place at the same point in time, so in Datomic this transaction states that I started liking both numbers simultaneously.
If you incorrectly read Datomic transactions as composed of multiple operations, you can of course find all kinds of “invariant anomalies”. Conversely, you can find “invariant anomalies” in SQL by incorrectly imposing Datomic’s model on SQL transactions. Such potential misreadings emphasize the need for good documentation. To that end, we have worked with Jepsen to enhance our documentation [1], tightening up casual language in the hopes of preventing misconceptions. We also added a tech note [2] addressing this particular misconception directly.
To build on this, Datomic includes a pre-commit conflict check that would prevent this particular example from committing at all: it detects that there are two incompatible assertions for the same entity/attribute pair, and rejects the transaction. We think this conflict check likely prevents many users from actually hitting this issue in production.
The issue we discuss in the report only occurs when the transaction expands to non-conflicting datoms--for instance:
[Stu favorite-number 41]
[Stu hates-all-numbers-and-has-no-favorite true]
These entity/attribute pairs are disjoint, so the conflict checker allows the transaction to commit, producing a record which is in a logically inconsistent state!
On the documentation front--Datomic users could be forgiven for thinking of the elements of transactions as "operations", since Datomic's docs called them both "operations" and "statements". ;-)
In order for user code to impose invariants over the entire transaction, it must have access to the entire transaction. Entity predicates have such access (they are passed the after db, which includes the pending transaction and all other transactions to boot). Transaction functions are unsuitable, as they have access only to the before db. [2]
Use entity predicates for arbitrary functional validations of the entire transaction.
Datomic transactions are not “operations to perform”; they are a set of novel facts to incorporate at a point in time.
Just as a git commit describes a set of modifications: do you, or should you, care about the order in which the adds, updates, and deletes occur within a single git commit? OMG no, that sounds awful.
The really unusual thing is that developers have come to expect, and accept, intra-transaction ordering from every other database. OMG, that sounds awful; how do you live like that?
Yeah, this basically boils down to "a potential pitfall, but consistent with documentation, and working as designed". Whether this actually matters depends on whether users are writing transaction functions which are intended to preserve some invariant, but would only do so if executed sequentially, rather than concurrently.
Datomic's position (and Datomic, please chime in here!) is that users simply do not write transaction functions like this very often. This is defensible: the docs did explicitly state that transaction functions observe the start-of-transaction state, not one another! On the other hand, there was also language in the docs that suggested transaction functions could be used to preserve invariants: "[txn fns] can atomically analyze and transform database values. You can use them to ensure atomic read-modify-update processing, and integrity constraints...". That language, combined with the fact that basically every other Serializable DB uses sequential intra-transaction semantics, is why I devoted so much attention to this issue in the report.
It's a complex question and I don't have a clear-cut answer! I'd love to hear what the general DB community and Datomic users in particular make of these semantics.
As a proponent of just such tools I would say also that "enough rope to shoot(?) yourself" is inherent in tools powerful enough to get anything done, and is not a tradeoff encountered only when reaching for high power or low ceremony.
It is worth noting here that Datomic's intra-transaction semantics are not a decision made in isolation; they emerge naturally from the information model.
Everything in a Datomic transaction happens atomically at a single point in time. Datomic transactions are totally ordered, and this ordering is visible via the time t shared by every datom in the transaction. These properties vastly simplify reasoning about time.
With this information model, intermediate database states are inexpressible. Intermediate states cannot all have the same t, because they did not happen at the same time. And they cannot have different ts, as they are part of the same transaction.
When we designed Datomic (circa 2010), we were concerned that many languages had better support for lists than for sets: in particular, list literals but no set literals.
Clojure of course had set literals from the beginning...
An advantage of using lists is that tx data tends to be built up serially in code. Having to look at your tx data in a different (set) order would make proofreading alongside the code more difficult.
Yes. Perhaps this is a performance choice for DataScript since DataScript does not keep a complete transaction history the way Datomic does? I would guess this helps DataScript process transactions faster. There is a github issue about it here: https://github.com/tonsky/datascript/issues/366
I think the article answers your question at the end of section 3.1:
> "This behavior may be surprising, but it is generally consistent with Datomic’s documentation. Nubank does not intend to alter this behavior, and we do not consider it a bug."
When you say "situations leading to invariant violations" -- that sounds like some kind of bug in Datomic, which this is not. One just has to understand how Datomic processes transactions, and code accordingly.
I am unaffiliated with Nubank, but in my experience using Datomic as a general-purpose database, I have not encountered a situation where this was a problem.
This is good to hear! Nubank has also argued that in their extensive use of Datomic, this kind of issue doesn't really show up. They suggest custom transaction functions are infrequently written, not often composed, and don't usually perform the kind of precondition validation that would lead to this sort of mistake.
Yeah, I've used transaction functions a few times but never had a case where two transaction functions within the same d/transact call interacted with each other. If I did encounter that case, I would probably just write one new transaction function to handle it.
Sounds similar to the need to know that in some relational databases, you need to SELECT ... FOR UPDATE if you intend to perform an update that depends on the values you just selected.
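Roughly the pattern in plain JDBC, against a hypothetical accounts table (a sketch using Postgres-style FOR UPDATE; connection details and schema are made up):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class ReadModifyUpdate {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/demo", "demo", "demo")) {
                conn.setAutoCommit(false);
                try (PreparedStatement select = conn.prepareStatement(
                             "SELECT balance FROM accounts WHERE id = ? FOR UPDATE");
                     PreparedStatement update = conn.prepareStatement(
                             "UPDATE accounts SET balance = ? WHERE id = ?")) {
                    select.setLong(1, 42L);
                    try (ResultSet rs = select.executeQuery()) {
                        if (rs.next()) {
                            // The row stays locked until commit, so nobody else can change
                            // the balance between this read and the write below.
                            long balance = rs.getLong(1);
                            update.setLong(1, balance - 100);
                            update.setLong(2, 42L);
                            update.executeUpdate();
                        }
                    }
                }
                conn.commit();
            }
        }
    }

Without FOR UPDATE (or an equivalent isolation level), two concurrent transactions can both read the old balance and one update silently clobbers the other.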
> things get complicated with virtual threads, they shouldn't be pooled, as they aren't a scarce resource
Why not pool virtual threads, though? I get that they’re not scarce, but if you’re looking to limit throughput anyway wouldn’t that be easier to achieve using a thread pool than semaphores?
(author here) From what I've read, beyond the documentation saying they shouldn't be pooled, it's that by design they are meant to run and then get garbage collected. There's also some overhead in managing the pool. If someone has a deeper understanding of virtual threads, I'd love to know why in more detail.
As to why use a semaphore over a thread pool for this implementation? A thread pool couples throughput to the number of running threads. A semaphore lets me couple throughput to started tasks per second. I don't care how many threads are currently running, I care about how many requests I'm making per second. Does that make more sense?
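Roughly the shape of it (a simplified sketch, not the exact code from my post; the rate, task body, and refill strategy here are made up for illustration):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.Semaphore;
    import java.util.concurrent.TimeUnit;

    public class RateLimitedStarts {
        public static void main(String[] args) throws InterruptedException {
            int startsPerSecond = 10;
            Semaphore permits = new Semaphore(startsPerSecond);

            // Reset the budget once per second: throughput is tied to started tasks
            // per second, not to how many threads happen to be running.
            ScheduledExecutorService refiller = Executors.newSingleThreadScheduledExecutor();
            refiller.scheduleAtFixedRate(() -> {
                permits.drainPermits();
                permits.release(startsPerSecond);
            }, 1, 1, TimeUnit.SECONDS);

            // One virtual thread per task; no pool, threads just run and get GC'd.
            try (ExecutorService vt = Executors.newVirtualThreadPerTaskExecutor()) {
                for (int i = 0; i < 100; i++) {
                    permits.acquire(); // blocks once this second's budget is spent
                    int id = i;
                    vt.submit(() -> System.out.println("request " + id));
                }
            }
            refiller.shutdown();
        }
    }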
Pooling virtual threads has no upside and potentially a bit of downside: 1. You hang on to unused objects for longer instead of returning them to the more general pool that is the GC; 2. You risk leaking context between multiple tasks sharing the thread, which may have security implications. Because of these and similar downsides, you should only ever pool objects that give you a benefit when they're shared -- e.g. they're expensive to create -- and shouldn't pool objects otherwise.
Thank you! You incur this risk when pooling any kind of thread, too, but with platform threads at least pooling makes sense because they're costly, so you just need to be careful with thread locals on a shared thread pool. Not needing to share threads and potentially leak context is a security advantage of virtual threads.
Aren't "virtual threads" built on a thread pool themselves? I suppose there would be no advantage in pooling an already pooled resource since presumably the runtime would manage pooling better than user code.