Claude Opus 4.7
Posted by AlphaWeaver 1 day ago
Comments
Comment by ChrisArchitect 1 day ago
Comment by tomhow 1 day ago
Comment by AlphaWeaver 1 day ago
Comment by jameson 1 day ago
For example, SWE-bench Pro improved ~11% compared with Opus 4.6. Should one interpret that as 4.7 being able to solve more difficult problems, or as 11% fewer hallucinations?
Comment by constantius 1 day ago
Comment by rvz 1 day ago
Given that no one is talking about DeepSeek, I assume it is coming this month.
They are still releasing research papers, and that is what really matters, not the .1-increment releases of AI models that exist to massage benchmarks or create hype.
Comment by cmrdporcupine 1 day ago
They're either stuck/dead or they're sitting on something really fantastic that they only want to release once they've perfected it.
My realistic side thinks the former; my optimistic side hopes for the latter.
In the meantime, GLM 5.1 is actually really good.
Comment by bsaul 1 day ago
Comment by cmrdporcupine 1 day ago
Comment by vomayank 1 day ago
Are you seeing meaningful improvements in reasoning reliability, or mostly incremental quality changes compared to previous releases?
Comment by grandinquistor 1 day ago
Comment by hansmayer 1 day ago
Comment by pukaworks 23 hours ago