
It takes ~17-20GB at Q4, depending on context length and settings (I'm running it as we speak).

Sure, it's ~30GB at Q8, but that's a minimal gain for roughly double the VRAM usage.
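
For anyone who wants to sanity-check numbers like these, here's a rough back-of-the-envelope sketch. The parameter count (30B) and the effective bits per weight (4.5 for a Q4-style format, 8.5 for a Q8-style format) are my assumptions, not figures from this thread, and the function name is made up for illustration. It only counts the weights; KV cache and runtime overhead come on top, which is why real usage moves with context length and settings.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Hypothetical inputs, chosen for illustration only (not stated in the thread).
N_PARAMS = 30e9   # assumed parameter count
Q4_BITS = 4.5     # assumed effective bits per weight for a 4-bit format
Q8_BITS = 8.5     # assumed effective bits per weight for an 8-bit format

q4 = weight_memory_gb(N_PARAMS, Q4_BITS)
q8 = weight_memory_gb(N_PARAMS, Q8_BITS)

# Weights only: add KV cache and runtime overhead for a real VRAM figure.
print(f"Q4 weights: {q4:.1f} GB")
print(f"Q8 weights: {q8:.1f} GB")
print(f"Q8 / Q4 ratio: {q8 / q4:.2f}x")
```

Under those assumptions the weights-only ratio comes out a bit under 2x, and the gap to the total VRAM figure is whatever the context and runtime need on top.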


