> As a result, once o1 becomes generally available, we will likely notice the persistent hallucinations and faulty reasoning, especially when the problem is sufficiently new or complex, beyond the “reasoning programs” or “reasoning patterns” the model learned during the reinforcement learning phase.
I had been using 4o as a rubber duck for some recent projects. Since I appeared to have access to o1-preview, I went back and redid some of those conversations with it.
I think your comment is spot on. It's definitely an advance, but it still makes some pretty clear mistakes and engages in some fairly faulty reasoning. It especially seems to have a hard time with causal ordering and with reasoning about dependencies in a distributed system. It frequently gets the relationships backwards, leading to hilarious code examples.
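To make concrete the kind of inversion I mean, here's a minimal, made-up sketch (not actual model output; all names are hypothetical): the backwards version acknowledges the client before the write has been replicated to followers, which is exactly the sort of dependency direction the model would flip.

```python
import asyncio

# Hypothetical stubs for illustration only.
class Follower:
    async def replicate(self, record: str) -> None:
        print(f"replicated {record}")

class Client:
    async def ack(self, record: str) -> None:
        print(f"acked {record}")

# Backwards: the client is acked before the write is durable on the
# followers, so the ack no longer implies durability.
async def handle_write_backwards(record, client, followers):
    await client.ack(record)  # dependency inverted: ack happens first
    await asyncio.gather(*(f.replicate(record) for f in followers))

# Correct: replicate first, then ack, so the ack implies durability.
async def handle_write_correct(record, client, followers):
    await asyncio.gather(*(f.replicate(record) for f in followers))
    await client.ack(record)

asyncio.run(handle_write_correct("x=1", Client(), [Follower(), Follower()]))
```

Both versions run without error, which is part of what makes the mistake easy to miss: the bug only shows up as a correctness violation under failure, not as a crash.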