
What in the flying F have we gotten ourselves into

Lots of what you say is simply not true. Maybe educate yourself before sharing opinions.


unpopular opinion -> AI slop killed open source


first the junior, then the senior, and finally the CEO...


Really can't stand the image slop suffocating the internet.


they will likely die first when society collapses


here we go again


The language hurts the eyes


How so?


Well done. I enjoyed this a lot!


What's the hardware needed to run the trillion-parameter model?


It's an MoE model, so it might not be that bad. The deployment guide at https://huggingface.co/moonshotai/Kimi-K2-Thinking/blob/main... suggests that the full, unquantized model can be run at ~46 tps on a dual-CPU machine with 8× NVIDIA L20 boards.

Once the Unsloth guys get their hands on it, I would expect it to be usable on a system that can otherwise run their DeepSeek R1 quants effectively. You could keep an eye on https://old.reddit.com/r/LocalLlama for user reports.
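
For intuition on why MoE helps: at decode time you only stream the active experts' weights per token, so throughput is roughly bounded by memory bandwidth divided by active bytes per token. A back-of-the-envelope sketch in Python, where all the numbers (~32B active params, 8-bit weights, ~2 TB/s aggregate bandwidth) are assumptions for illustration, not figures from the deployment guide:

    # Rough decode-throughput bound for an MoE model.
    # All numbers below are assumed, purely for illustration.
    active_params = 32e9        # params touched per generated token (assumed)
    bytes_per_param = 1.0       # 8-bit weights (assumed)
    bandwidth_bytes_s = 2e12    # ~2 TB/s aggregate memory bandwidth (assumed)

    tokens_per_s = bandwidth_bytes_s / (active_params * bytes_per_param)
    print(f"~{tokens_per_s:.0f} tokens/s upper bound")   # ~62 tokens/s

Which is in the same ballpark as the ~46 tps claim. If all 1T params were dense and active, the same math would give ~2 tokens/s, which is why a trillion-parameter MoE is even on the table for this class of hardware.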


Are such machines available in the major clouds such as Azure/AWS/Google?


To start with, an Epyc server or Mac Studio with 512GB RAM.


I looked up the price of the Mac Studio: $9500. That's actually a lot less than I was expecting...

I'm guessing an Epyc machine is even less.


How does the Mac Studio load the trillion-parameter model?


By using a ~3-bit quantized model with llama.cpp. Unsloth makes good quants:

https://docs.unsloth.ai/models/tutorials-how-to-fine-tune-an...

Note that llama.cpp doesn't try to be a production-grade engine; it's more focused on local usage.
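
The memory math for why this fits in 512GB, as a minimal sketch. The ~3.2 bits/weight average is an assumption; real dynamic quants mix precisions per layer, and the KV cache and activations add overhead on top:

    # Why a ~3-bit quant of a ~1T-param model fits in 512 GB of
    # unified memory. bits_per_weight is an assumed average, and
    # KV cache / activation memory is ignored here.
    params = 1e12
    bits_per_weight = 3.2
    weight_gb = params * bits_per_weight / 8 / 1e9
    print(f"~{weight_gb:.0f} GB of weights")   # ~400 GB, under 512 GB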

