Because we are told that they can solve IMO problems. Yet they fail at basic math problems, not only at factorization but also when probed with relatively basic symbolic math that would not require invoking an external program.
Also, when they fail they could say so instead of giving a hallucinated answer. First the models lie and say that factoring a 20-digit number takes vast amounts of computing. Then, if pointed to a factorization program, they pretend to execute it and lie about the output.
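For context on why the "vast amounts of computing" excuse is wrong: a 20-digit number is easily factored on a laptop with standard textbook algorithms. Here is a minimal sketch using trial division plus Pollard's rho and a deterministic Miller-Rabin primality test (standard algorithms, not anything from the thread; the example number is my own choice):

```python
import math
import random

def is_prime(n: int) -> bool:
    """Deterministic Miller-Rabin; this witness set is valid for all n < 3.3e24."""
    if n < 2:
        return False
    small = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in small:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in small:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

def pollard_rho(n: int) -> int:
    """Return a nontrivial factor of an odd composite n."""
    while True:
        c = random.randrange(1, n)
        f = lambda x: (x * x + c) % n
        x = y = random.randrange(2, n)
        d = 1
        while d == 1:
            x = f(x)        # tortoise: one step
            y = f(f(y))     # hare: two steps
            d = math.gcd(abs(x - y), n)
        if d != n:          # d == n means this c failed; retry with a new one
            return d

def factorize(n: int) -> list[int]:
    """Full prime factorization, smallest factors first."""
    factors = []
    for p in (2, 3, 5, 7, 11, 13):  # strip tiny primes by trial division
        while n % p == 0:
            factors.append(p)
            n //= p
    stack = [n] if n > 1 else []
    while stack:
        m = stack.pop()
        if is_prime(m):
            factors.append(m)
        else:
            d = pollard_rho(m)
            stack += [d, m // d]
    return sorted(factors)

# A 20-digit number factors in milliseconds on an ordinary laptop:
print(factorize(12345678901234567890))
# → [2, 3, 3, 5, 101, 3541, 3607, 3803, 27961]
```

Pollard's rho finds a factor in roughly O(p^(1/4)) steps for the smallest prime factor p, so even a hard 20-digit semiprime is well within reach; only numbers hundreds of digits long genuinely require serious computing.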
There is no intelligence or flexibility apart from stealing other people's open source code.
That's why the IMO results were so notable: that was one of those moments where new models demonstrably did something they had previously been unable to do.
I can't fathom why more people aren't talking about the IMO story. Apparently the model they used is not just an LLM but involves some RL too. If a model wins gold at the IMO, is it still merely a "statistical parrot"?
The same thing has also been achieved by a Google DeepMind team and at least one group of independent researchers using publicly available models and careful prompting tricks.