
Hi, I work on the Gemma team (same as Alek; opinions are my own).

Essentially, instead of training on the tokens that are "already there" in the text, distillation lets us simulate training data from a larger model.
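
A minimal sketch of that contrast, assuming a toy 4-token vocabulary and made-up logits. The comment does not say which loss is actually used, so the soft-target cross-entropy here is an illustrative assumption, and every name and number is hypothetical.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, student_logits):
    """Cross-entropy between a target distribution and the student's prediction."""
    student_probs = softmax(student_logits)
    return -sum(t * math.log(s) for t, s in zip(target_probs, student_probs) if t > 0)

# Hypothetical logits over a 4-token vocabulary at one position.
student_logits = [2.0, 1.0, 0.5, -1.0]
teacher_logits = [3.0, 2.5, 0.0, -2.0]

# Ordinary next-token training: the target is the single token
# that is "already there" in the text (index 0 here), as a one-hot vector.
observed_token = 0
one_hot = [1.0 if i == observed_token else 0.0 for i in range(len(student_logits))]
hard_loss = cross_entropy(one_hot, student_logits)

# Distillation (assumed form): the target is the larger model's full
# distribution over the vocabulary, so every token carries some signal.
soft_targets = softmax(teacher_logits)
distill_loss = cross_entropy(soft_targets, student_logits)

print(f"hard-label loss:   {hard_loss:.4f}")
print(f"distillation loss: {distill_loss:.4f}")
```

With the one-hot target, the student learns only from the single observed token. With the soft target it also sees how much probability the larger model puts on each alternative.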


