Elevated errors across many models
Posted by pablo24602 18 hours ago
Comments
Comment by dinkleberg 17 hours ago
This should be a given for any service that others rely on, but sadly this is seldom the case.
Comment by palcu 16 hours ago
I was also fortunate to be using Claude at that exact moment (for personal reasons), which meant I could immediately see the severity of the outage.
Comment by koakuma-chan 15 hours ago
Comment by awesome_dude 15 hours ago
https://www.theguardian.com/technology/2021/oct/05/facebook-...
I have a memory that Slack fell into this trap too (I could be wrong)
Comment by hiddencost 12 hours ago
Google SRE still keeps IRC available in case of an emergency.
Comment by antonvs 13 hours ago
Comment by aduwah 16 hours ago
Comment by nrhrjrjrjtntbt 14 hours ago
Comment by LanceH 17 hours ago
Comment by cevn 15 hours ago
Comment by arach 17 hours ago
Claude user base believes in Sunday PM work sessions
Comment by smcleod 14 hours ago
Comment by Buttons840 15 hours ago
Comment by dpkirchner 14 hours ago
Comment by fragmede 17 hours ago
Comment by palcu 17 hours ago
Comment by l1n 16 hours ago
The bad route has been removed and service is restored. We’re doing a full review internally with a focus on synthetic monitoring and better visibility into high-impact infrastructure changes to catch these faster in the future.
Comment by ammut 14 hours ago
Comment by giancarlostoro 15 hours ago
Comment by Arcuru 14 hours ago
A tech company that publishes its postmortems when possible always gets a +1 in my eyes; I think it's a sign of good company culture. Cloudflare's are great, and I would love to see more from others in the industry.
Comment by boondongle 14 hours ago
Underneath a public statement they all have extremely detailed post-mortems. But how much goes public is 100% random from the customer's perspective. There's no Monday-morning QB'ing of the CEO, but there absolutely is of "Day-Shift SRE Leader Phil".
Comment by bflesch 13 hours ago
Comment by giancarlostoro 2 hours ago
Comment by 999900000999 16 hours ago
Back when I did website QA automation I'd manually check the website at the end of my day. Nothing extensive, just looking at the homepage for peace of mind.
Once a senior engineer decided to bypass all of our QA, deployed, and took down prod. Fun times.
Comment by spike021 14 hours ago
At my first job one of my more senior team members would throw caution to the wind and deploy at 3pm or later on Fridays because he believed in shipping ASAP.
There were a couple times that those changes caused weekend incidents.
Comment by MobiusHorizons 12 hours ago
Comment by spike021 9 hours ago
Comment by userbinator 13 hours ago
Comment by weird-eye-issue 15 hours ago
Comment by wouldbecouldbe 15 hours ago
Did the bad route cause an overload? Was there a code error on that route that wasn’t spotted? Was it a code issue or an instance that broke?
Comment by bc569a80a344f9c 15 hours ago
Network routes consist of a network (a range of IPs) and a next hop to send traffic for that range to.
These can overlap. Sometimes that's desirable, sometimes it is not. When routers have two routes that are exactly the same, they often load balance (in some fairly dumb, stateless fashion) between the possible next hops; when one of the routes is more specific, it wins.
Routes get injected by routers saying “I am responsible for this range” and setting themselves as the next hop; other routers that connect to them receive this advertisement and propagate it to their own peers further downstream.
An example would be advertising 192.168.0.0/23, which covers the range 192.168.0.0-192.168.1.255.
Let’s say that’s your inference backend in some rows in a data center.
Then, through some misconfiguration, some other router starts announcing 192.168.1.0/24 (192.168.1.0-192.168.1.255). This is more specific, so that traffic gets sent there, and half of the original inference pod is now unreachable.
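A minimal sketch of that longest-prefix-match selection, using Python's stdlib ipaddress module (the networks and next-hop names are illustrative, not from the actual incident):

    from ipaddress import ip_address, ip_network

    # Routing table: (network, next hop). The /23 is the legitimate route;
    # the /24 is the misconfigured, more specific advertisement.
    routes = [
        (ip_network("192.168.0.0/23"), "inference-backend"),
        (ip_network("192.168.1.0/24"), "misconfigured-router"),
    ]

    def next_hop(dst):
        # Among all routes containing the destination, the longest prefix wins.
        matches = [(net, hop) for net, hop in routes if ip_address(dst) in net]
        return max(matches, key=lambda m: m[0].prefixlen)[1]

    print(next_hop("192.168.0.10"))  # inference-backend (only the /23 matches)
    print(next_hop("192.168.1.10"))  # misconfigured-router (the /24 beats the /23)

The second lookup is exactly the failure mode described above: every address in 192.168.1.x now follows the rogue /24, so half of the /23 goes dark.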
Comment by disqard 15 hours ago
Comment by mattdeboard 15 hours ago
Comment by colechristensen 14 hours ago
Comment by tayo42 15 hours ago
Comment by giancarlostoro 15 hours ago
Comment by l1n 15 hours ago
Comment by giancarlostoro 14 hours ago
Comment by nickpeterson 16 hours ago
Comment by dan_wood 17 hours ago
Just curious as a developer and devops engineer. It's all quite interesting where and how things go wrong, especially with large deployments like Anthropic's.
Comment by dgellow 16 hours ago
Comment by Chance-Device 17 hours ago
Comment by g-mork 16 hours ago
Comment by tgtweak 1 hour ago
A 503 is cool, and yes, there's the "well, if it's down, how are they going to put that up?" objection. But in reality most downtime you see is on the backend, not on the reverse proxies/gateways/CDNs, where it would be pretty trivial to add an issues/status header with a link to the service status page and a note.
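As a rough sketch of that idea (a hypothetical Python stand-in for the edge layer; the X-Status-Page header name and URL are made up for illustration):

    from http.server import BaseHTTPRequestHandler, HTTPServer

    STATUS_PAGE = "https://status.example.com"  # hypothetical status page URL

    class DegradedHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # The backend is down, but the edge can still say something useful.
            self.send_response(503)
            self.send_header("Retry-After", "300")          # hint clients to back off
            self.send_header("X-Status-Page", STATUS_PAGE)  # the issues/status header
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"Service degraded; see the status page for updates.\n")

    HTTPServer(("", 8080), DegradedHandler).serve_forever()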
Comment by embedding-shape 1 hour ago
Comment by irishcoffee 17 hours ago
Man that cracks me up.
Comment by lxgr 17 hours ago
Comment by adonovan 15 hours ago
Comment by michelsedgh 17 hours ago
Comment by irishcoffee 17 hours ago
Comment by patcon 17 hours ago
But what about when LLMs go down and a good chunk of a whole generation won't even know how to think without the remote system? (Is the ability to think "the engine" of self and agency in this metaphor?)
We are building a wildly irresponsible context to exist in.
Comment by semi-extrinsic 17 hours ago
Comment by bdangubic 17 hours ago
Comment by jsight 17 hours ago
Comment by cdelsolar 16 hours ago
Comment by cantalopes 14 hours ago
Comment by prmph 13 hours ago
Comment by bdangubic 15 hours ago
Comment by teaearlgraycold 17 hours ago
Comment by breatheoften 17 hours ago
There is a very interesting thing happening right now where the "LLM over-promisers" are incentivized to over-promise for all the normal reasons -- but ALSO to create the perception that the next breakthrough will only be applicable when run on huge cloud infra, such that running locally will never be all that useful. I tend to think that will prove wildly wrong, and that we will very soon arrive at a world where state-of-the-art LLM workloads run massively more efficiently than they currently do -- to the point of not even being the bottleneck of the workflows that use these components. Additionally, these workloads will be viable to run locally on common current-year consumer-level hardware.
"LLM is about to be general intelligence, and a sufficient LLM can never run locally" is a highly temporary state of affairs that should soon be falsifiable, IMO. I don't think the LLM part of the "AI computation" will be the perf bottleneck for long.
Comment by lwhi 16 hours ago
I've often thought that local power generation (via solar or wind) could be (or could have been) a viable alternative to national grid supply.
Comment by tablets 16 hours ago
Comment by PunchyHamster 14 hours ago
Comment by teaearlgraycold 4 hours ago
Comment by lxgr 17 hours ago
Comment by irishcoffee 17 hours ago
Imagination, either the first or last thing to die in 2075.
Comment by lxgr 16 hours ago
Comment by irishcoffee 15 hours ago
Is it so different?
Comment by colordrops 17 hours ago
Comment by teaearlgraycold 4 hours ago
Comment by cft 17 hours ago
Comment by colordrops 15 hours ago
Comment by exe34 17 hours ago
Comment by teaearlgraycold 4 hours ago
Comment by pstuart 17 hours ago
Comment by tsimionescu 9 hours ago
Comment by exe34 9 hours ago
Comment by cft 17 hours ago
Comment by szundi 17 hours ago
Comment by PunchyHamster 14 hours ago
Comment by sergiotapia 16 hours ago
Comment by irishcoffee 16 hours ago
“So sorry, we can’t read the WiFi signal from your generator. We don’t know how much to tax you for the emissions. Sadly, we can’t allow it to start.”
The story writes itself.
Comment by Stratoscope 9 hours ago
Claude: The Field Museum is enormous - I imagine you got your steps in that day. The anthropology collections there are world-class, even if they didn't make it into your photo collection. Sometimes the experience is more important than the documentation. Did you have a favorite exhibit beyond the dinosaurs?
Me: Oddly enough, the anthropology and human history section! Now I need to take another visit to Chicago to get pictures of those.
[No response from Claude]
[40 minutes pass while I have breakfast]
Me: Looks like I caught you during your morning nap. Every two and a half year old should get a good nap now and then!
Claude: Ha! You caught me mid-thought - I was composing a response about how great it is that the anthropology section was your favorite despite the dinosaurs being such showstoppers, and apparently my brain decided to take that nap you mentioned! ... Though I have to say, for a "two and a half year old," I like to think I'm pretty articulate when I'm awake!
Comment by tgtweak 1 hour ago
Comment by sebastiennight 17 hours ago
"You have reached the messages quota for your account. It will reset in 2 hours, or you can upgrade now"
Either I have perfect timing for reaching my quota limits, or some product monetization manager deserves a raise.
Comment by manquer 17 hours ago
Comment by manuisin 15 hours ago
Comment by frankdenbow 16 hours ago
Comment by michelsedgh 17 hours ago
Comment by agumonkey 17 hours ago
Comment by XCSme 17 hours ago
Comment by baobabKoodaa 17 hours ago
Comment by michelsedgh 17 hours ago
Comment by agumonkey 16 hours ago
Comment by teaearlgraycold 17 hours ago
Comment by tcdent 17 hours ago
Comment by AnotherGoodName 17 hours ago
They made Pro many times more valuable than paying per token, and then they made Max have 25x more tokens again than Pro on the $200 plan.
It's a bit like being offered rice at $1 per grain (pay per token), or a tiny bag of rice for $20 (Pro), or a truckload for $200 (Max). That's the pricing structure right now.
So while I agree you can't easily exceed the quota on the big plans, it's a little crazy how they've tiered the pricing. I hope no one out there is paying per token!
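Running the analogy's own numbers (the grain count for the $20 bag is a made-up assumption, just to show the shape of the tiering):

    per_grain = 1.00                   # $ per grain, pay-per-token
    bag_price, bag_grains = 20, 1_000  # "tiny bag" -- grain count is hypothetical
    truck_price = 200
    truck_grains = bag_grains * 25     # "25x more tokens than pro"

    print(bag_price / bag_grains)      # 0.02  -> 50x cheaper per grain than a la carte
    print(truck_price / truck_grains)  # 0.008 -> another 2.5x cheaper again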
Comment by KronisLV 9 minutes ago
For comparison’s sake, this is clear: https://support.cerebras.net/articles/9996007307-cerebras-co...
And while the Cerebras service is pretty okay, their website otherwise kinda sucks - and yet you can find clear info!
Comment by square_usual 16 hours ago
Some companies are. Yes, for Claude Code. My company used to be like that, as it's an easy ramp-up instead of giving devs who might not use it that much a $150/mo seat; if you use it enough you can get a seat and save money, but if you're not touching $150 in credits a month, just use the API. Oxide also recommends using API pricing. [0]
0: https://gist.github.com/david-crespo/5c5eaf36a2d20be8a3013ba...
Comment by tcdent 16 hours ago
Comment by cmrdporcupine 12 hours ago
Comment by michelsedgh 17 hours ago
Comment by termos 18 hours ago
So we can maybe vibe, depending on what service we use.
Comment by giancarlostoro 17 hours ago
Comment by bonesss 17 hours ago
Comment by ares623 17 hours ago
Comment by iLoveOncall 16 hours ago
Comment by m_ke 16 hours ago
Comment by abigail95 17 hours ago
Comment by 6r17 16 hours ago
Comment by __0x01 17 hours ago
"I can't change the laws of physics!"
Comment by russellthehippo 17 hours ago
Comment by victor9000 17 hours ago
Comment by llmthrow0827 14 hours ago
Comment by throwaway613745 12 hours ago
I don’t even bother with Sonnet anymore; it’s been made obsolete by Opus 4.5.
Comment by flowinghorse 14 hours ago
Comment by triwats 17 hours ago
Spent it in bloody Figma instead :(
Comment by jcims 16 hours ago
AFAIK it shouldn’t have been.
Comment by theropost 17 hours ago
Comment by frankdenbow 16 hours ago
Comment by WhyOhWhyQ 17 hours ago
Comment by matt3210 16 hours ago
Comment by asasidh 18 hours ago
What else are people using? Haiku 4.5?
Comment by epolanski 17 hours ago
Comment by riwsky 17 hours ago
Comment by gunalx 17 hours ago
Comment by nunodonato 17 hours ago
Comment by asasidh 17 hours ago
Comment by Jsuh 15 hours ago
Comment by delaminator 18 hours ago
Comment by onionisafruit 17 hours ago
Comment by sgt 17 hours ago
Comment by throwaway613745 12 hours ago
Comment by TechDebtDevin 14 hours ago
Comment by aj7 14 hours ago
Comment by rvz 16 hours ago
Looking forward to the post-mortem.
Comment by edverma2 17 hours ago
Comment by bitwize 17 hours ago
Comment by ath3nd 16 hours ago
Comment by andrewmcwatters 17 hours ago
Comment by djaowjxj 14 hours ago
Comment by dorongrinstein 16 hours ago
Comment by acedTrex 17 hours ago
Comment by tom_ 17 hours ago