
H100s are expensive NVIDIA GPUs, each costing about $30,000. 8XH100 means you have 8 of those wired together in a big server in a data center somewhere, so around a quarter of a million dollars worth of hardware in a single box.

You need that much hardware because each H100 provides 80GB of GPU-accessible RAM, but to train this model you need to hold a LOT of model weights and training data in memory at once. 8 × 80GB = 640GB total.

~$24/hour is how much it costs to rent that machine from various providers.
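The arithmetic above can be sketched out quickly; the prices are the rough figures from this comment, not vendor quotes.

```python
# Back-of-envelope numbers from the comment above; all prices are
# rough estimates, not quotes from any vendor.
H100_PRICE_USD = 30_000      # approximate cost of one H100
GPUS_PER_NODE = 8            # an 8xH100 server
VRAM_PER_GPU_GB = 80         # GPU memory per H100
RENTAL_USD_PER_HOUR = 24     # approximate rental rate for the whole node

hardware_cost = H100_PRICE_USD * GPUS_PER_NODE   # $240,000 in a single box
total_vram_gb = VRAM_PER_GPU_GB * GPUS_PER_NODE  # 640 GB of GPU memory

# Hours of rental that equal the purchase price of the GPUs alone:
breakeven_hours = hardware_cost / RENTAL_USD_PER_HOUR  # 10,000 hours
breakeven_days = breakeven_hours / 24                  # ~417 days

print(hardware_cost, total_vram_gb, round(breakeven_days))
```

By this rough math, renting only beats buying if you need the machine for less than about a year of continuous use (ignoring power, cooling, and depreciation).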



Perfectly explained, thanks!


Thank you.



