
There are absolutely some (very) weird cultures/behaviours in the Japanese workplace that set it apart from every other first-world country I've experienced (working in global organisations).

"Never nuke a country twice" for anyone who wants to know more.

It should probably be "break free from Google and Apple"?

You are right! I will change the title :)

> Mass-market newspapers may be in trouble, but that's because it turns out most people were buying those for the classifieds, not really for the news.

Genuinely interested in some sort of data on this.

My working assumption was that print news media was dying through a combination of free news availability on the internet, shifting advertising spending as a result, shifting ‘channels’ to social media, and shifting attention spans between generations.


> So how much internal memory does the latest Cerebras chip have? 44GB. This puts OpenAI in kind of an awkward position. 44GB is enough to fit a small model (~20B params at fp16, ~40B params at int8 quantization), but clearly not enough to fit GPT-5.3-Codex. That’s why they’re offering a brand new model, and why the Spark model has a bit of “small model smell” to it: it’s a smaller distil of the much larger GPT-5.3-Codex model.

This doesn't make sense.

1. Nvidia already sells e.g. the H100 with 80GB memory, so having 44GB isn't an advance, let alone a differentiator.

2. As I suspect anyone who's played with open-weights models will attest, there's no way that 5.3-Codex-Spark is getting close to top-level performance, and being sold in this way, while being <44GB. Yes, it's weaker, and it's very probably a smaller distil, but not by ~two orders of magnitude as suggested.
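For reference, the quoted parameter budgets are just simple arithmetic (a rough sketch, weights only, ignoring KV-cache and activations):

  # rough parameter budget for a given memory size (weights only)
  def params_billions(memory_gb, bytes_per_param):
      return memory_gb * 1e9 / bytes_per_param / 1e9

  print(params_billions(44, 2))  # fp16 -> ~22B params (the quoted ~20B)
  print(params_billions(44, 1))  # int8 -> ~44B params (the quoted ~40B)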


You’re mixing up HBM and SRAM - which is an understandable confusion.

NVIDIA chips use HBM (High Bandwidth Memory) which is a form of DRAM - each bit is stored using a capacitor that has to be read and refreshed.

Most chips have caches on them built out of SRAM - a feedback loop of transistors that store each bit.

The big differences are in access time, power and density: SRAM is ~100 times faster than DRAM but DRAM uses much less power per gigabyte, and DRAM chips are much smaller per gigabyte of stored data.

Most processors have a few MB of SRAM as caches. Cerebras is kind of insane in that they’ve built one massive wafer-scale chip with a comparative ocean of SRAM (44GB).

In theory that gives them a big performance advantage over HBM-based chips.

As with any chip design though, it really isn’t that simple.


So what you’re saying is that Cerebras chips offer 44GB of what is comparable to L1 caches, while NVidia is offering 80GB of what is comparable to “fast DRAM”?

Sort of. But SRAM is not all made equal - L1 caches are small because they’re fast, and vice-versa L3 SRAM caches are slow because they’re big.

Addressing a large amount of SRAM requires roughly log(N) levels of logic just to do the decoding (a gross approximation). That extra logic takes time for a lookup operation to travel through, hence large = slow.
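To put the log(N) point in concrete terms (purely illustrative sizes, not any particular chip): the number of address bits, and hence roughly the depth of the decode path a lookup has to traverse, grows with log2 of the array size.

  import math

  # address bits needed (~ decode depth) for different SRAM sizes
  for label, size_bytes in [("64 KB (L1-ish)", 64 * 1024),
                            ("4 MB (L2-ish)", 4 * 1024**2),
                            ("44 GB (wafer-scale)", 44 * 1024**3)]:
      print(label, math.ceil(math.log2(size_bytes)), "address bits")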

It’s also not one pool of SRAM. It’s thousands of small SRAM groups spread across the chip, with communication pathways in between.

So having 44GB of SRAM is a very different architecture from 80GB of (unified) HBM (although even “unified” isn’t quite accurate, as most chips use multiple external memory interfaces).

HBM is high bandwidth. Whether that’s “fast” or not depends on the trade-off between bandwidth and latency.

So, what I’m saying is that this is way more complicated than it seems. But overall, yeah, Cerebras’ technical strategy is “big SRAM means more fast”, and they’ve not yet proven whether that’s technically true or whether it makes economic sense.


Right. L3 caches, i.e. SRAMs of tens of MB or more, have a latency only 2 to 3 times better than DRAM’s. SRAMs of only a few MB, like most L2 caches, may have a latency 10 times lower than DRAM’s. L1 caches, of around 64 kB, may have a latency 3 to 5 times better than L2 caches.

The throughput of caches becomes much greater than that of DRAM only when they are split up, i.e. each core has its own private L1+L2 cache, so transfers between cores and their private caches can happen concurrently, without interfering with each other.

When an SRAM cache memory is shared, the throughput remains similar to that of external DRAM.

If the Cerebras memory is partitioned in many small blocks, then it would have low latency and high aggregate throughput for data that can be found in the local memory block, but high latency and low throughput for data that must be fetched from far away.

On the other hand, if there are fewer bigger memory blocks, the best case latency and throughput would be worse, but the worst case would not be so bad.
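To put very rough (and entirely hypothetical) numbers on that trade-off: the aggregate bandwidth of many small private blocks dwarfs a single shared pool, but only for accesses that stay local.

  # hypothetical figures, just to illustrate the shape of the trade-off
  blocks = 800_000          # many small SRAM blocks spread across a wafer (made up)
  per_block_bw = 25e9       # ~25 GB/s into each local block (made up)
  shared_bw = 3e12          # ~3 TB/s for a single shared external pool (made up)

  print(f"local hits:  ~{blocks * per_block_bw / 1e15:.0f} PB/s aggregate")
  print(f"shared pool: ~{shared_bw / 1e12:.0f} TB/s")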


> L1 caches are small because they’re fast

I guess you meant to say they are fast because they are small?


Thanks, TIL.

It does make sense. Nvidia chips do not promise 1,000+ tokens/s. The 80GB is external HBM, unlike Cerebras’ 44GB internal SRAM.

The whole reason Cerebras can inference a model thousands of tokens per second is because it hosts the entire model in SRAM.
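Rough intuition for why that matters: at small batch sizes, every generated token has to stream essentially all of the weights once, so the token rate is roughly memory bandwidth divided by model size. (Illustrative numbers below, not vendor specs.)

  model_bytes = 44e9    # a model that just fills 44 GB
  hbm_bw = 3e12         # ~3 TB/s, a plausible HBM figure (assumption)
  sram_bw = 1e15        # on-chip SRAM aggregate bandwidth in the PB/s range (assumption)

  print(f"HBM-bound:  ~{hbm_bw / model_bytes:.0f} tokens/s")   # ~68
  print(f"SRAM-bound: ~{sram_bw / model_bytes:.0f} tokens/s")  # ~23,000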

There are two possible scenarios for Codex Spark:

1. OpenAI designed a model to fit exactly 44GB.

2. OpenAI designed a model that requires Cerebras to chain multiple wafer chips together; i.e., an 88GB or 132GB or 176GB model or more.

Both options require the entire model to fit inside SRAM.


Let's not forget the KV-cache, which needs a lot of RAM too (although not as much as the model weights) and scales linearly with sequence length.
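For a sense of scale, a rough estimate is 2 (K and V) x layers x kv_heads x head_dim x seq_len x bytes per value; the model shape below is hypothetical:

  layers, kv_heads, head_dim = 48, 8, 128   # made-up model shape
  bytes_per_value = 2                       # fp16

  def kv_cache_gb(seq_len):
      return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value / 1e9

  for seq_len in (8_192, 131_072):
      print(seq_len, f"{kv_cache_gb(seq_len):.1f} GB")   # ~1.6 GB vs ~25.8 GB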

Did you upgrade your iOS? I’m stubbornly sticking to v18 as long as I can and have noticed no such change. Cooking timers, alarms, and light settings are basically all I use Siri for.

Calorie restriction consistently extends the lifespan of laboratory animals, so…

As a European, I’ve been gently looking forward to Rivian’s R3 for years now. I like the design and it looks much more like a machine that will suit Europe.

Related, Colin Furze experimented with using wood gas to run an IC engine, somewhat successfully: https://youtu.be/FK2qK-NCQH8

This is always the piece that disappoints me when seeing this and other similar tools.

Surely it is an obvious next step to offer export to e.g. React, React Native, SwiftUI…?

Otherwise you spend days, weeks, months crafting your perfect design down to the pixel, and then someone else has to start again from scratch with a totally different approach. Maybe I’m missing something, but that feels incredibly inefficient and regressive.


Wouldn't that be more of a RAD tool, like Lazarus[0]? Or are you suggesting you could do both in the same tool? I'm not doubting it's possible, but those are two very different (and large!) products from a functional standpoint. Combining them is going to be quite the undertaking.

[0]: https://www.lazarus-ide.org/


> your perfect design down to the pixel

> Surely it is an obvious next step to offer export to e.g. React, React Native, SwiftUI…?

These UI frameworks do not really operate in a "down to the pixel" way, and so getting correspondence between a bitmap design and a representation in the UI framework is far from an "obvious next step" (if it were trivial to add such a feature, then of course the developers of these tools would add it).

Various concerns that aren't captured in the bitmap design - like how your screens transition from one to another, etc - can dramatically affect how the UI is implemented in a target framework. This is the job of UI engineers.

Well, used to be. Now it's vibe code all the way down.


Sure, I understand that going from a raster design image to a working prototype in one of the frameworks is not easily automated.

However, I would’ve thought it would be feasible to create a design tool with this in mind, so the same fundamental design structure could be output either to the internal preview, or (any?) one of the target frameworks.
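As a (purely hypothetical) sketch of what I mean: keep one framework-agnostic node tree, and emit it to whichever target you need. All names here are made up, not any real tool's API.

  from dataclasses import dataclass, field

  @dataclass
  class Node:
      kind: str                      # e.g. "VStack", "Text"
      label: str = ""
      children: list = field(default_factory=list)

  # one of several possible back-ends; another could emit React/JSX or a live preview
  def to_swiftui(node, indent=0):
      pad = "    " * indent
      if not node.children:
          return f'{pad}{node.kind}("{node.label}")'
      body = "\n".join(to_swiftui(c, indent + 1) for c in node.children)
      return f"{pad}{node.kind} {{\n{body}\n{pad}}}"

  design = Node("VStack", children=[Node("Text", label="Hello"),
                                    Node("Text", label="World")])
  print(to_swiftui(design))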


I have no reason to defend BMGF and probably enjoy a good comeuppance more than the next person, but the article you linked to about the issues in India is far from the smoking gun in the hands of BMGF that you seem to think it is.

From the article: an already-approved vaccine (by FDA and others) was given to children via a trial run by an NGO (PATH) and was funded by BMGF. The trial was apparently run unethically, and in addition a year or so later it was found that girls administered the vaccine had possibly experienced adverse events, some very serious.

(Based on the article alone) it’s very likely that BMGF would have been totally hands off in overseeing the trial, and would certainly have had strict agreements with PATH. If there were indeed ethical breaches, I’m sure BMGF was very unhappy about this. Moreover, while we of course shouldn’t ignore the safety findings, attributing events causally to the vaccination against the standard background rate of events in a particular population is rife with uncertainty.

And of course, the trial potentially being unethically run doesn’t make the (already- and still-approved) vaccine more dangerous… but does make it easier to whip up sensation and clicks for articles, especially if there’s a big rich US Foundation also tangentially involved.

