GitHub Is Down

Posted by cosuhi 2 days ago

Comments

Comment by bob1029 2 days ago

This has gotten to a point where it doesn't really matter anymore. When a service crosses a certain reliability threshold it's like a phase change. The customer base eventually adapts to the situation. Anyone who still genuinely cares has moved to self hosted enterprise or something else by now. It was most tenuous for me when they almost met the SLA. Now that they've blown so far beyond it, the stress is mostly gone.

Comment by JsonDemWitOster 2 days ago

I care but I can't move out because it's Orders from Up High which migrated us to Github. We haven't been here a year and it's not worth my neck to mirror our code in some kind of skunkworks forge instance. Woe. Woe is me.

Comment by exhilaration 2 days ago

Our Big Giant Enterprise wants to move all our repos from ADO to GitHub for all the nifty AI features, but I'm told the frequent downtime is a major issue so we're slow-walking the migration.

Comment by kelseydh 2 days ago

We have so many automated workflows and pipelines moving through Github Actions + other Github integrations it would be a giant headache to migrate. Not clear where we would go either. Gitlab??

Comment by SOLAR_FIELDS 1 day ago

I’ve been running an SMB trending mid market out of gitlab for the greater part of a year with no issues. One of my favorite benefits is that ci runners are colocated with the self hosted instance on k8s so suddenly a whole huge slew of shit that you had to care about with GitHub, security, provenance and supply chain is just… not an issue.

Getting off the GitHub actions dependency is a feature, not a bug

Comment by PeterStuer 1 day ago

Of course. Every non trivial migration is a 'headache'. But so is being down too many times.

At the very least explore and prepare for alternatives. Map out dependencies that are not trivial to replace. There's probably fewer of those than you think.

Comment by darkwater 2 days ago

Am I hallucinating or did they do a "cleanup" of the GH status in their status page? https://www.githubstatus.com/

API Requests with 4 nines of availability??

Issues with 99.96% uptime?

PR with 99.61% uptime last 90 days??

https://mrshu.github.io/github-statuses/ marks PRs at 95.89% in the same period as an example.
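
To put the gap between those figures in perspective, here is a rough sketch that converts an uptime percentage over a 90-day window into hours of implied downtime. The helper name `downtime_hours` is made up for illustration, "4 nines" is read as 99.99%, and a 24-hour day is assumed; it says nothing about how either status page actually computes its numbers.

```python
def downtime_hours(uptime_percent: float, window_days: int = 90) -> float:
    """Hours of downtime implied by an uptime percentage over a window."""
    total_hours = window_days * 24  # 90 days -> 2160 hours
    return total_hours * (100.0 - uptime_percent) / 100.0


# Figures quoted in the comment above; labels are only for readability.
figures = [
    ("API Requests, official (4 nines)", 99.99),
    ("Issues, official", 99.96),
    ("PRs, official", 99.61),
    ("PRs, unofficial tracker", 95.89),
]

for label, pct in figures:
    print(f"{label}: {pct}% uptime ~ {downtime_hours(pct):.1f} h down in 90 days")
```

Under these assumptions the two PR numbers differ by roughly a factor of ten: about 8.4 hours of downtime versus about 88.8 hours over the same 90 days.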

Comment by kaelwd 2 days ago

https://www.githubstatus.com/incidents/x69zbgdyfzg0 took three days to resolve and isn't being counted as an outage on the official page.

Comment by craigmart 2 days ago

I had checked that page not long ago, and as far as I remember there were many "red" or "orange" days in the past 3 months. Now it's all green. That's concerning

Comment by tosti 2 days ago

If they've actually improved the service, it would be fine. Is there reasonable plausibility, reasonable doubt, or reasonable suspicion?

Comment by aweiher 1 day ago

it should not change any historical records

Comment by alfg 2 days ago

Seems like they changed the criteria for downtime. If you hover over the individual days you’ll see a lot of degraded messages, but still green.

Comment by hsbauauvhabzb 2 days ago

I clicked through wayback machine and couldn’t see any strong indicators that uptime had been rewritten, but there are a lot of snapshots. If you can prove it, I’m interested.

Comment by darkwater 2 days ago

As said, maybe I'm really hallucinating and mixed up the official and unofficial GH status pages, but there are definitely incidents published in the official one that are not counted as downtime or even partial degradation in the uptime counter.

Comment by coolgoose 2 days ago

They have an incident reported today, but the status page for actions shows green :D that's fun.

Comment by Zealotux 2 days ago

Soon we'll have to track when Github is up.

Comment by bernds74 2 days ago

The London Underground have been doing it in reverse for a long time. "There is good service on the Piccadilly line."

Comment by Xymist 1 day ago

The phrasing they chose is particularly amusing, since it's always false regardless of the day. They could have said "usual service", and left in the plausible acknowledgement of the fact that it's hot, cramped, miserable, grubby, loud, plague-ridden, mouse-ridden, and generally unpleasant.

Comment by JsonDemWitOster 2 days ago

The Google SRE book offers the following as one of the reasons to not gun for 100% reliability (emphasis added):

> users typically don’t notice the difference between high reliability and extreme reliability in a service, because the user experience is dominated by less reliable components like the cellular network or the device they are working with. Put simply, a user on a 99% reliable smartphone cannot tell the difference between 99.99% and 99.999% service reliability!

I've been on a shaky relationship with my ISP of late. What brought me to this thread today is that I couldn't push to Github. Notably this isn't covered by their downtime report so, going by the available facts, it's _probably_ not Github's fault I couldn't push; and I've just been on my daily stand-up call and I got disconnected so frequently.

But looking beyond today's available facts, odds are there's a bigger problem GH is not mentioning in their status page. They say the current incident has to do with "unauthorized users" and I wonder if pushing a commit from my IDE client counts as an operation from an "unauthorized user" as I still have to authorize with my SSH key.

It's just insane I can't decide which between Github or German o2 should be the more reliable service!
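
The SRE book's claim quoted above can be checked with a little arithmetic. This is only a sketch: it assumes the device and the service fail independently and sit in series, so the end-to-end availability is the product of the two. The function name `end_to_end` is hypothetical.

```python
def end_to_end(*availabilities: float) -> float:
    """Availability of components in series, assuming independent failures."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


device = 0.99  # the "99% reliable smartphone" from the quote

four_nines = end_to_end(device, 0.9999)
five_nines = end_to_end(device, 0.99999)

print(f"with a 99.99% service:  {four_nines:.5%}")
print(f"with a 99.999% service: {five_nines:.5%}")
print(f"difference:             {five_nines - four_nines:.5%}")
```

In this model the extra nine moves the end-to-end figure by less than 0.01 percentage points, because the 99% component dominates. The model only holds while the weakest link really is that weak; once the service itself drops well below the device, it becomes the dominant term instead.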

Comment by ownagefool 2 days ago

Github isn't having a debate over how many 9s they have, they're having a zero 9s problem.

I think there's 3 big themes with this, thought not

1. LLM tools have added considerable load.

2. LLMs used by developers to increase velocity seem to be leading to more outages. This calls into question the increased velocity.

3. Roadmaps focused on pushing features that aren't reliability problems. i.e. github moving to azure, or adding AI features.

All these same problems happen to orgs with other fads that aren't AI. Following fads is not good engineering.

Comment by Grombobulous 1 day ago

Your comment made me think: if GitHub was a Google product with similar popularity and scaling trajectory, would we see similar reliability issues?

Absolutely not. Google has reliability practices so deeply ingrained in their company they’re like an involuntary reflex.

This is a management issue.

Comment by PeterStuer 1 day ago

So they failed to manage growth. That is a business management problem first, and only a technical problem second. Yet Github management seems to constantly deflect to operations.

If you take on load (this is 100% by choice) beyond capacity, then obviously the system collapses.

Comment by rurban 1 day ago

Nope. It's entirely Azure management's fault. https://isolveproblems.substack.com/p/how-microsoft-vaporize...

Comment by kelseydh 2 days ago

Apparently Github is experiencing a huge increase in usage due to LLMs and this is the cause for a lot of their instability as of late.

Comment by PeterStuer 1 day ago

'Experiencing' makes it sound like they have no deliberate choice. No, they let this happen, by choice. They could have prevented this, contractually, by pricing, by governance, but chose not to.

Comment by IshKebab 2 days ago

> Put simply, a user on a 99% reliable smartphone cannot tell the difference between 99.99% and 99.999% service reliability!

Sure they can. If Google loads and Github doesn't, then it's clearly Github being down, not the mobile network.

Also not everyone uses a phone. My desktop & fibre internet has way better than 99% reliability.

Comment by spondyl 2 days ago

"unauthorized" is a bit different than "unauthenticated". The former suggests trying to access something you don't have permission for while the latter suggests you're just not logged in.

At a guess, I could imagine some sort of failure of cached pages, which can be cached for signed out users but probably not for signed in users (as the rendered HTML would need to have user context like their avatar etc)
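
The distinction maps onto HTTP status codes, with one naming trap: 401 is called "Unauthorized" but is the response for a request that is not authenticated, while 403 "Forbidden" is the one for an authenticated user who lacks permission. A minimal sketch follows; the `User` type and `check_access` function are hypothetical and are not a description of how GitHub handles requests.

```python
from dataclasses import dataclass, field
from http import HTTPStatus
from typing import Optional, Set


@dataclass
class User:
    """Hypothetical signed-in user with a set of repos they may read."""
    name: str
    readable_repos: Set[str] = field(default_factory=set)


def check_access(user: Optional[User], repo: str) -> HTTPStatus:
    """Return the status a server would typically send for a read request."""
    if user is None:
        # Not logged in at all: unauthenticated, which HTTP names 401.
        return HTTPStatus.UNAUTHORIZED
    if repo not in user.readable_repos:
        # Logged in, but no permission for this resource: 403.
        return HTTPStatus.FORBIDDEN
    return HTTPStatus.OK


alice = User("alice", {"alice/dotfiles"})
print(check_access(None, "alice/dotfiles"))   # anonymous visitor
print(check_access(alice, "bob/secret"))      # signed in, no permission
print(check_access(alice, "alice/dotfiles"))  # signed in, allowed
```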

Comment by DamonHD 2 days ago

First time that GitHub has ever been down for me (London, UK):

    504 Gateway Time-out

    The server didn't respond in time.

Comment by bonesss 2 days ago

I just got the same in Western EU, refreshed and got a site, and now it's gone again with intermittent hiccups of life.

Honestly it's pretty mad to see, especially without a crisp failover.

Comment by DamonHD 2 days ago

And it came back for me, at least for a refresh or two...

Comment by witx 2 days ago

Slophub is vibing strong

Comment by exploraz 2 days ago

> Following investigation, we are seeing that impact is limited to unauthenticated users when accessing Pull Requests or Issues. Our team continues to work towards mitigation with more updates to follow as we have them.

Comment by deviation 2 days ago

We should have a pinned post, just for GitHub outages.

Comment by Rohunyyy 2 days ago

GitHub is down for like the nth time this week? Time for some truly groundbreaking statements like "this is the result of LLMs" (yes it is, I know).

Comment by JsonDemWitOster 2 days ago

Well thankfully it's Monday so they've only been down for the _first_ time this week!

Comment by Keyframe 2 days ago

What's going on with GitHub lately? I've never seen them have so many issues over the years as they do now.

Comment by orwin 2 days ago

Microsoft management/engineering practices+AI slop.

Comment by phplovesong 2 days ago

That, but also AI tools have made github req/res go up by 100x. There is simply too much AI generated traffic.

I know for a fact that ANY other platform would fail faster than github if they had the same volume of http requests.

Comment by Keyframe 2 days ago

I get that, but all of that was in there for quite some time though, no?

Comment by netdevphoenix 2 days ago

Digital systems don't necessarily deteriorate immediately after the causal factors. Like technical debt, issues grow unnoticed and become visible gradually.

Comment by cyberjunkie 2 days ago

Waiting for someone to say this was always the case, not just post Microsoft's takeover.

Comment by ramon156 2 days ago

Welcome back everyone

Comment by phplovesong 2 days ago

People seem to miss entirely that it is not (only) some slop code that makes github go down, but the fact that they get 100x the number of requests since AI tools came into devs' daily workflow.

Comment by sscaryterry 2 days ago

Now self-hosting, left the dumpster fire.

Comment by yanhangyhy 2 days ago

shit, i thought it was because of my browser. sorry firefox..

why do i keep seeing github down news on HN?

Comment by exploraz 2 days ago

Simply because of being a dependency of many things.

https://www.githubstatus.com/history

https://isgithubcooked.com/

Comment by herrkanin 2 days ago

Because the majority of the audience spends large parts of their day on GitHub and it keeps going down.

Comment by forlorn 2 days ago

Ah, here we go again. A weekly (daily?) github is down thread

Comment by rvz 2 days ago

Once again, GitHub is down. Last time an incident happened was 2 days ago [0].

Could not have happened on a worse day (Monday) and you can see how unreliable GitHub has been.

Better off self-hosting.

[0] https://news.ycombinator.com/item?id=48418183

Comment by bigpeopleareold 2 days ago

In my jobs over the last 3 years, I have twice seen migrations from internal systems to GitHub. I would think that companies are doing this to cut costs. It's not something I am going to research, but it would be interesting if their recent issues were related to large migrations from in-house installations to github and, doubly so, if that were related to how large companies have been tightening up their spending in the past few years.

Comment by m-schuetz 2 days ago

Anything I'd self-host would be down way more often than github.

Comment by ta8903 2 days ago

Only at a scale approaching github's. Otherwise a gitea instance or whatever doesn't have any interdependent components other than the server you host it on, which won't have nearly as much downtime as github (though that's a low bar; a better way to phrase it would be to say it would pretty much never be down).

Comment by PeterStuer 1 day ago

My on prem forgejo is down way less than github. I use both. You don't have to choose.

Comment by bigstrat2003 1 day ago

I self host my own VCS repos. It's never down. It's not rocket surgery to self host, even for business stuff rather than personal.

Comment by lmm 2 days ago

Monday is probably the best day for an outage? Everyone's in the office, so you're not disturbing them, and hung over, so you're not hurting their productivity much.