Opus 4.5 is the first model that makes me fear for my job

Posted by nomilk 1 day ago

Comments

Comment by jchw 1 day ago

Remember when GPT-3 came out and everybody collectively freaked the hell out? That's how I've felt watching the reaction to any of the new model releases lately that make any progress.

I'm honestly not complaining about the model releases, though. Despite their shortcomings, they are extremely useful. I've found Gemini 3 to be an extremely useful learning aid, as long as I don't blindly trust its output; if you're trying to learn, you really ought not do that anyway. (Despite what people and benchmarks say, I've already caught some random hallucinations; it still feels like you're likely to run into them on a regular basis. Not a huge problem, but, you know.)

Comment by techblueberry 1 day ago

It feels like every model release has its own little hype cycle. Apparently Claude 4.5 is still climbing to its peak of inflated expectations.

Comment by krackers 1 day ago

There's lots of overlap between the cryptocurrency space and AI grifter hypeman space. And the economic incentives at play throw fuel on the fire.

Comment by Forgeties79 1 day ago

They behave like televangelists

Comment by pogue 1 day ago

I hear that a lot, but I think this is becoming very different than the crypto grift.

Crypto was just that, a pure grift where they were creating something out of nothing and rugpulling when the hype was highest.

AI is actually creating something: it's generating replacements for artists, for creatives, for musicians, for writers, for programmers. It's literally capable of generating something from _almost_ nothing. Of course, you have to factor in energy usage etc., but the end user sees none of that. They type a request and it generates an output.

It may be easily identifiable slop today, but it's getting better and better at a RAPID rate. We all need to recognize this.

I don't know what to do with the knowledge that it's coming for our jobs. Adapt or die? I don't know...

Comment by krackers 1 day ago

I don't disagree that there is value behind LLMs. But I was referring to the grifter style of AI evangelism (of which the strawberry man might be the epitome): people deliberately pumping up and riding the bubble. They're probably profiting off the generated social media engagement, or are part of some social media influencing campaign indirectly paid for by companies who do benefit from the bubble.

The common thread is that there's no nuanced discussion to be found, technical or otherwise. It's all topics optimized for viral engagement.

Comment by pogue 1 day ago

The strawberry man... ¯\_(ツ)_/¯

I see what you're saying; that's a bit of a different aspect entirely. I don't know how much people are making from viral posts on Twitter (or FB?) with that kind of thing.

But outside of those specific platforms, there's quite a bit of discussion of it on Reddit, and this place has had some of the best. The good tech sites like Ars, The Verge, Wired, and The Register all have excellent, realistic coverage of what's going on.

I think if you're only seeing hype, I'd ask where you're looking. And on the flip side, there's the very anti-AI crowd, who I'm sure might be getting that same kind of reach with their target audience, preaching the evils & immorality of it.

Comment by techblueberry 1 day ago

Very meta post.

Comment by pogue 21 hours ago

I was feeling very Zuckerbergian

Comment by prymitive 1 day ago

There are still a few things missing from all the models: taste, shame, and ambition. Yes, they can write code, but they have no idea what needs that code solves, what a good UX looks like, or what not to ship. Not to mention that they all eventually go down rabbit holes of imaginary problems that cannot be solved (because they're not real), and that's where they will spend eternity unless a human says "stop it right now".

Comment by odla 1 day ago

While I agree, this is also true of many engineers I’ve met.

Comment by heavyset_go 1 day ago

They have a severe lack of wisdom, as well.

Comment by channel_t 1 day ago

Almost every single post on the ClaudeAI subreddit is like this. I use Opus 4.5 in my day to day work life and it has quickly become my main axe for agentic stuff but its output is not a world-shattering divergence from Anthropic's previous, also great iterations. The religious zealotry I see with these things is something else.

Comment by epolanski 1 day ago

I suspect that recurring visitors of that subreddit may not be the greatest IT professionals, but a mixture of juniors (even those with 20 years of experience but still junior) and vibe coders.

Otherwise, with all due respect, there's very little of value to learn in that subreddit.

Comment by channel_t 1 day ago

100%. I would also say that this broadly applies to pretty much all of the AI subreddits, and much of AI Twitter as well. Very little nuanced or thoughtful discussion to be found. Looks more like a bunch of people arguing about their favorite sports teams.

Comment by unsupp0rted 22 hours ago

This exactly. The /r/codex subreddit is equally full of juniors and vibe-coders. In fact, it's surprisingly a ghost-town, given how useful Codex CLI is.

Comment by quantumHazer 1 day ago

Why are we commenting on the Claude subreddit?

1) it’s not impartial

2) it’s useless hype commentary

3) it’s literally astroturfing at this point

Comment by heavyset_go 1 day ago

This is the new goalpost now that the "this model is so intelligent that it's sentient and dangerous" AGI hype has died down.

Comment by hecanjog 1 day ago

I used claude code for a while in the summer, took a vacation from LLMs and I'm trying it out again now. I've heard the same thing about Opus 4.5, but my experience with claude code so far is the same as it was this summer... I guess if you're a casual user don't get too excited?

Comment by yellow_lead 1 day ago

I tried it and I'm not impressed.

In threads where I see an example of what the author is impressed by, I'm usually not impressed. So when I see something like this, where the author doesn't give any examples, I also assume Claude did something unimpressive.

Comment by nharada 1 day ago

It definitely feels like a jump in capability. I've found that the long-term quality of the codebase doesn't take a nosedive nearly as quickly as with earlier agentic models. If anything it stays about steady, or maybe even increases, if you prompt it correctly and ask for "cleanup PRs".

Comment by markus_zhang 1 day ago

Ironically, AI may replace SWEs way faster than it replaces jobs in any other business, even ones still in the Stone Age.

Pick anything else and you have a far better chance of falling back on a manual process, a legal wall, or whatever else AI cannot easily replace.

Good job boys and girls. You will be remembered.

Comment by pton_xd 1 day ago

I have to say, it was fun while it lasted! Couldn't really have asked for a more rewarding hobby and career.

Prompting an AI just doesn't have the same feeling, unfortunately.

Comment by sunshowers 1 day ago

It depends. I've been working on a series of large, gnarly refactors at work, and the process has involved writing a fairly long, hand-crafted spec/policy document. The big advantage of Opus has been that the spec is now machine-executable: I repeatedly fed it into the LLM and watched what it did on some test cases. That sped up experimentation and prototyping tremendously, and it also surfaced a lot of ambiguities in the policy document that were helpful to address.

The document is human-crafted and human-reviewed, and it primarily targets humans. The fact that it works for machines is a (pretty neat) secondary effect, but not really the point. And the document sped up the act of doing the refactors by around 5x.

The whole process was really fun! It's not vibe coding at that point (I continue to be relatively unimpressed by vibe coding beyond a few hundred lines of code). It's closer to old-school waterfall-style development, though with much quicker iteration cycles.

Comment by cmarschner 1 day ago

For me it's the opposite. I have a good feeling for what I want to achieve, but translating that into program code, and testing it, has always caused me outright physical pain (and in the case of C++ I really hate it). I've been programming since age 10. Almost 40 years. And it feels like liberation.

It brings the "what to build" question front and center, while "how to build it" has become much, much easier and more productive.

Comment by markus_zhang 1 day ago

Indeed. I still use AI for my side projects, but strictly limited to discussion, no code. Otherwise what is the point? The good thing about programming is that, unlike playing chess, there is no real "win/lose" scenario, so I won't feel discouraged even if AI can do all the work by itself.

Same thing for science. I don't mind AI solving all those problems, as long as it can teach me. Those problems are already "solved" by the universe anyway.

Comment by Hamuko 1 day ago

Even the discussion side has been pretty meh in my mind. I was looking into a bug in a codebase filled with Claude output and for funsies decided to ask Claude about it. It basically generated a "This thing here could be a problem, but there is manual validation for it" response, and when I looked, that manual validation was nowhere to be found.

There's so much half-working AI-generated code everywhere that I'd feel ashamed if I had to ever meet our customers.

I think the thing that gives me the most value is code review. So basically I first review my code myself, then have Claude review it and then submit for someone else to approve.

Comment by markus_zhang 1 day ago

I don't discuss actual code with ChatGPT, just concepts. Like "if I have an issue and my algo looks like this, how can I debug it effectively in gdb?", or "how do I reduce lock contention if I have to satisfy A/B/...".

Maybe it's just because my side projects are fairly elementary.

And I agree that AI is pretty good at code review, especially if the code contains complex business logic.

Comment by skybrian 1 day ago

Already happened for copywriters, translators, and others in the tech industry:

https://www.bloodinthemachine.com/s/ai-killed-my-job

Comment by agumonkey 1 day ago

something in the back of my head tells me that automating (partial) intelligence feels different than automating a small to medium scope task, maybe i'm wrong though

Comment by harrall 1 day ago

I don’t think it’s ironic.

The commonality of people working on AI is that they ALL know software. They make a product that solves the thing that they know how to solve best.

If all lawyers knew how to write code, we'd see more legal AI startups. But lawyers and coders are not a common overlap, nowhere near as common as SWEs and coders.

Comment by agumonkey 1 day ago

time to become a solar installer

Comment by neoromantique 1 day ago

For the most part, code monkeys haven't been a thing for quite some time now. I'm sure talented people will adapt and find other avenues to flourish.

Comment by Lionga 1 day ago

Dario Amodei claimed "AI will replace 90% of developers within 6 months" about a year ago. Still they are just losing money, and probably will be forever, while producing more slop code that needs even more devs to fix it.

Good job, AI fanboys and girls. You will be remembered when this fake hype is over.

Comment by markus_zhang 1 day ago

I'm more of a doomsayer than a fanboy. But I think it's more like "AI will replace 50% of your juniors, 25% of your seniors, and perhaps 50% of your do-nothing middle managers". And that's a fairly large number anyway.

Comment by sidibe 1 day ago

100% in the doomer camp now. I wish I could be as optimistic as the people who think AI is all hype, but over the last few weeks it's finally become more productive to use these tools, and I feel like this will be a short little window where the stuff I'm typing in, and the review of what's coming out, is still worth my salary.

I don't really see why anywhere near the number of great jobs this industry has had will be justifiable in a year. The only comfort is that all the other industries will be facing the same issue, so accommodations will have to be made.

Comment by markus_zhang 1 day ago

The other industries are shielded by legislation, unions, and more. Those that aren't, and that don't involve physical work, are the first to fall.

Damn it, I'm only 40+, so I still need to work more or less 15 years even if we live frugally.

Comment by heckintime 1 day ago

I used Claude Code to write a relatively complicated watchOS app. I know how to program (FAANG L5), but didn't really know Swift. I achieved a pretty good result for about $600, while a contractor would've cost much more.

Comment by agumonkey 1 day ago

so how long until our salaries match those of an llm?

Comment by heckintime 1 day ago

Good question. I left my job to start something on my own, so AI help is really nice. I should note that AI does make many boneheaded mistakes, and I have to solve some of the harder problems on my own.

Comment by fragmede 1 day ago

Isn't Claude Max only $200? How come you paid $600 for that?

Comment by abrichr 1 day ago

You can reach much higher spend through the API (which you can configure the `claude` CLI to use).
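
Roughly like this, as a minimal sketch; this assumes Claude Code picks up the standard `ANTHROPIC_API_KEY` environment variable, and the key value below is a placeholder:

```sh
# With an API key in the environment, usage is billed per token against the
# Anthropic API rather than the flat-rate Max subscription, so monthly spend
# can climb well past $200.
export ANTHROPIC_API_KEY="sk-ant-..."  # placeholder key
claude                                 # launch Claude Code with API billing
```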

Comment by exabrial 1 day ago

This just looks like an advertisement?

Comment by quantumHazer 1 day ago

I wouldn’t be surprised if this is undisclosed PR from Anthropic

Comment by bachmeier 1 day ago

I'd be very surprised if it wasn't. Everything about that company turns me off. I've run across countless YouTube videos that are clearly Anthropic PR pretending to be real videos by regular people just trying it out and discovering how good Claude is. I'll stick with Gemini.

Comment by crystal_revenge 1 day ago

I've mainly been using Sonnet 4.5, so I decided to give Opus 4.5 a whirl to see if it could solve an annoying task I've been working on that Sonnet 4.5 absolutely fails on. I just started with "Are you familiar with <task> and can you help me?" and so far the response has been a resounding:

> Taking longer than usual. Trying again shortly (attempt 1 of 10)

> ...

> Taking longer than usual. Trying again shortly (attempt 10 of 10)

> Due to unexpected capacity constraints, Claude is unable to respond to your message. Please try again soon.

I guess I'll have to wait until later to feel the fear...

Comment by wdb 1 day ago

There are currently issues with the models. Claude Code doesn't work at all for me

Comment by diavelguru 1 day ago

I agree with this guy: https://obie.medium.com/what-happens-when-the-coding-becomes.... It's a game changer when used to pair program (mob programming) with you as the navigator and the LLM as the driver. It needs guidance and rework, but shit gets done, with the human as the ultimate gatekeeper.

Comment by giancarlostoro 1 day ago

I've been using Claude Code + Opus for side projects. The only thing that's changed for me dev wise is that I QA more, and think more about how to solve my problems.

Comment by uniclaude 1 day ago

That's only tangentially related, but I have a very hard time using Opus for anything serious. Sonnet is still much more useful to me thanks to the context window size. By the time Opus actually understands what's needed, I'm n compactions deep and pretty much hoping for the best.

That's a reason why I can't believe the benchmarks, and why I also believe open source models (claiming 200k but realistically struggling past 40k) aren't just a bit behind SOTA in actual software dev, but very far behind.

This is not true for all software, but there are types of systems and environments where it's abundantly clear that Opus (or anything with a sub-1M window) won't cut it, unless it has a very efficient agentic system to help.

I’m not talking about dumping an entire code base in the context, I’m talking about clear specs, some code, library guidelines, and a few elements to allow the LLM to be better than a glorified autocomplete that lives in an electron fork.

Sonnet still wins easily.

Comment by orwin 1 day ago

Not me, but it's the first one that (so far) hasn't failed catastrophically on a moderately difficult task. Not that the other models can't manage difficult tasks, but they sometimes generate stuff so wrong it's almost funny. In a week of usage, it has never truly failed (the one time it couldn't find a working solution, it simply proposed solutions I could use and stopped, without generating bullshit code), and sometimes I don't have to change anything in the generated output.

It's definitely more useful than I was in my first 5 years of my professional career, though, so for people who don't improve fast, or for average new grads, this can be a problem.

Comment by Aperocky 1 day ago

It's almost vindication that where I work, an SDE needs to do everything: infra, development, deployment, launch, operations. There's no dedicated QA, test, or ops at the product level, and while AI has helped a great deal, it's pretty clear it cannot replace me, at least within the next 2 to 3 iterations.

If I was only writing code, the fear would be completely justified.

Comment by themafia 1 day ago

Reads like astroturf to me.

> do not know what's coming for us in the next 2-3 years, hell, even next year might be the final turning point already.

What is this based on? Research? Data? Gut feeling?

> but how long will it be until even that is not needed anymore?

You just answered that. 2 to 3 years, hell, even next year, maybe.

> it also saddens me knowing where all of this is heading.

If you know where this is heading why are you not investing everything you have in these companies? Isn't that the obvious conclusion instead of wringing your hands over the loss of a coding job?

It invents a problem, provides a timeline, immediately questions itself, and then confidently prognosticates without any effort to explain the information used to arrive at this conclusion.

What am I supposed to take from this? Other than that people are generally irrational when contemplating the future?

Comment by gtowey 1 day ago

We have reached the "singularity of marketing". It's what happens when an AI model has surpassed human marketers and traditional bot farms and can be used to do its own astroturfing. And then with the investment frenzy it generates, we can build the next generation of advertising intelligence and achieve infinite valuation!

Comment by kami8845 1 day ago

Same here. Using it this week, on Thursday I began to understand why Lee Sedol retired not long after being defeated by AlphaGo. For the stuff I'm good at, 3 months ago I was better than the models. Today, I'm not sure.

Comment by bgwalter 1 day ago

It's a Misanthropic propaganda forum. They even have Claude agree in the summary of the bot comments:

"The overwhelming consensus in this thread is that OP's fear is justified and Opus represents a terrifying leap in capability. The discussion isn't about if disruption is coming, but how severe it will be and who will survive."

My fellow Romans, I come here not to discuss disruption, but to survive!

Comment by Aayush28260 1 day ago

Honestly, I have a lot of friends who are studying SWE and they're saying the same thing. Do you guys think that if they do get replaced, they'll still be needed to maintain the AIs?

Comment by iSloth 1 day ago

Not sure I’d be worried for my job, but it’s legitimately a significant jump in capabilities, even if other models attempt to fudge higher bench results

Comment by int32_64 1 day ago

I wonder if these coding models will be sustainable/profitable long term if the local models continue to improve.

qwen3-coder blew me away.

Comment by AndyKelley 1 day ago

flagged for astroturfing

Comment by simonw 1 day ago

> Sure, I can watch Opus do my work all day long and make sure to intervene if it fucks up here and there, but how long will it be until even that is not needed anymore?

Right: if you expect your job as a software developer to be effectively the same shape in a year or two, you're in for a bad time.

But humans can adapt! Your goal should be to evolve with the tools that are available. In a couple of years' time you should be able to produce significantly more, better code, solving more ambitious problems and making you more valuable as a software professional.

That's how careers have always progressed: I'm a better, faster developer today than I was two years ago.

I'll worry for my career when I meet a company that has a software roadmap that they can feasibly complete.

Comment by th0ma5 1 day ago

I just wanted to say that I think advocating for these specific kinds of tools does people a great disservice, but the last paragraph is a universally correct statement, seemingly permanently.

Comment by terabytest 1 day ago

How does Opus 4.5 compare to gpt-5.1-codex-max?

Comment by scosman 1 day ago

roughly, much better: https://www.swebench.com

Comment by outside1234 1 day ago

OpenAI is burning through $60B a year in losses.

Something doesn't square about this picture: either this is the best thing since sliced bread and it should be wildly profitable, or ... it's not, and it's losing a lot of money because they know there isn't a market at a breakeven price.

Comment by simonw 1 day ago

They're losing money because they are in a training arms race. If other companies weren't training competitive models OpenAI would be making a ton of money by now.

They have several billion dollars of annual revenue already.

Comment by throw310822 1 day ago

I think it's also a cultural thing... I mean, it takes time for companies and professionals to get used to the idea that it makes sense to pay hundreds of dollars per month to use an AI, and that that expense (which for some is relatively affordable and for others can be a serious one) actually converts into much higher productivity or quality.

Comment by outside1234 1 day ago

Google is always going to be training a new model and are doing so while profitable.

If OpenAI is only going to be profitable (aka has an actual business model) if other companies aren't training a competitive model, then they are toast. Which is my point. They are toast.

Comment by ben_w 1 day ago

Are Google (and Meta) funding AI training from the profits of their ad businesses, eating the losses in order to prevent pure-AI companies from making a profit? And is that legal?

In principle, I mean. Obviously there's a sense in which it doesn't matter if they only get fined for cross-subsidising/predatory pricing/whatever *after* OpenAI et al run out of money.

I do think this is a bubble and I do expect most or all the players to fail, but that's because I think they're in an all-pay auction and may be incentivised to keep spending way past the break-even point just for a chance to cut their losses.

Comment by outside1234 1 day ago

Do we know Google is operating at a loss? It seems most likely to me that they are paying for model development straight out of search, where it is employed on almost every search.

Comment by ben_w 23 hours ago

Fair question. That's the kind of "who knows?" which might make it hard to defeat them in litigation, unless Google staff have been silly enough to write it down in easy-to-find-during-discovery emails.

But as a gut-check, even if all the people not complaining about it are getting use out of any given model, does this justify the ongoing cost of training new models?

If you could delete the ongoing training costs of new models from all the model providers, all of them look a lot healthier.

I guess I have a question about your earlier comment:

> Google is always going to be training a new model and are doing so while profitable.

While Google is profitable, or while the training of new models is profitable?

Comment by paulddraper 1 day ago

Opus 4.5 is like a couple points higher than Sonnet 4.5 on the SWE benchmark.

Comment by _wire_ 1 day ago

"This model is so alive I want to donate a kidney to it!"