
I notice that on the "Agentic Coding" benchmark cited in the article, Sonnet 4 outperforms Opus 4 (by 0.2%) and underperforms Opus 4.1 (by 1.8%).

So this release might change that consensus? If you believe the benchmarks are reflective of reality anyways.



> If you believe the benchmarks are reflective of reality anyways.

That's a big "if." But yeah, I can't tell a difference subjectively between Opus and Sonnet, other than maybe a sort of placebo effect. I'm more careful to write quality prompts when using Opus, because I don't want to waste the 5x more expensive tokens.



