
LLMs are really not good at following specific processes like math. They operate off vibes.

Ask Claude to multiply two ten-digit numbers. It gets the first one or two digits correct, and then makes up the rest.

ChatGPT used to have the same problem, but now it writes a program to perform the math for it.
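For illustration, a minimal sketch of that delegation pattern in Python (the two ten-digit operands here are made up; the point is that the interpreter's arbitrary-precision integers return the exact product that token-by-token generation botches):

    # Exact arithmetic delegated to the runtime instead of the model's weights.
    def multiply_exact(a: int, b: int) -> int:
        return a * b

    a = 7_394_815_062  # arbitrary ten-digit operands
    b = 9_182_736_450
    print(f"{a} * {b} = {multiply_exact(a, b)}")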



This was true until they started training them with reinforcement learning from verifier feedback (starting with o1). By sticking a calculator in the training loop, they seem to have gotten out of the arithmetic-error regime. That said, the ChatGPT default is 4o, which is still susceptible to these issues.
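To make that concrete, a toy sketch of verifier feedback, where sample_answer is a hypothetical stand-in for the model; a real RL pipeline is vastly more involved, but the reward signal reduces to "does the sampled answer match the calculator?":

    import random

    def calculator(a: int, b: int) -> int:
        # Ground truth computed outside the model.
        return a * b

    def sample_answer(a: int, b: int) -> int:
        # Hypothetical stub: a model's answer, occasionally off by one.
        return a * b + random.choice([0, 0, 0, 1, -1])

    def reward(a: int, b: int) -> float:
        # Verifier feedback: reward exact matches only.
        return 1.0 if sample_answer(a, b) == calculator(a, b) else 0.0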



