September 21, 2024

Fastly outage: Why it just broke Amazon, Reddit, Twitch and much of the internet

Not Fastly’s proudest moment. Peter Dazeley/Getty

Today will be remembered as the day the internet broke, before swiftly being fixed again. On Tuesday morning, many of the websites we rely on daily, including Amazon, Reddit, Twitch, Pinterest and, unfortunately, CNET, went offline due to a major outage at a service called Fastly. Everywhere you looked, there were 503 errors and people complaining they couldn’t access key services and news outlets, demonstrating just how much of the internet relies on this largely unheard-of cloud computing service.

Well, the internet was good while it lasted. Peter Dazeley/Getty

At around 2:58 a.m. PT, Fastly’s status update page noted an error, saying “we’re currently investigating potential impact to performance with our CDN [content delivery network] services.” Shortly thereafter, reports emerged on Twitter of major news publications including the BBC, CNN and The New York Times being offline. Twitter itself was still running, although the server that hosted its emojis went down, leading to some odd-looking tweets.

Rather than isolated incidents affecting individual sites, it turned out this was a massive outage that had brought much of the internet to its knees. Across the world, people were receiving Error: 503 messages as they tried to access sites, including some vital services, such as the UK government’s gov.uk web properties.

Almost an hour later, at 3:44 a.m. PT (6:44 a.m. ET, on the cusp of the US East Coast workday, and coming up on noon in the UK), Fastly updated its status page again to say the issue had been identified and a fix was being implemented. At 4:10 a.m. PT, the company tweeted: “We identified a service configuration that triggered disruptions across our POPs globally and have disabled that configuration. Our global network is coming back online.”

The same message was sent to CNET as a comment by Fastly spokespeople.

What is Fastly?

Fastly is a cloud computing service provider, headquartered in San Francisco, that’s been around since 2011. In 2017, it launched an edge cloud platform designed to bring websites closer to the people who use them. Effectively this means that if you’re accessing a website hosted in another country, it will store some of that website closer to you so that there’s no need to waste bandwidth by going to fetch all of that website’s content from far away every time you need it.
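To make the idea of storing part of a website closer to you concrete, here is a minimal sketch of a single edge location in Python. It is a toy model, not Fastly's software: the class name `EdgeCache`, the `origin` function and the 60-second freshness window are all assumptions made up for illustration.

```python
import time


class EdgeCache:
    """Toy model of one edge location (hypothetical, not Fastly's code)."""

    def __init__(self, fetch_from_origin, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_from_origin  # slow, long-distance call to the origin
        self._ttl = ttl_seconds          # how long a stored copy counts as fresh
        self._clock = clock              # injectable so expiry can be simulated
        self._store = {}                 # path -> (expires_at, body)
        self.origin_fetches = 0          # how often we had to go the long way

    def get(self, path):
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]              # cache hit: served from nearby
        body = self._fetch(path)         # cache miss or stale copy: ask the origin
        self.origin_fetches += 1
        self._store[path] = (now + self._ttl, body)
        return body


def origin(path):
    # Stand-in for a web server hosted in another country (assumed).
    return f"<html>content of {path}</html>"


cache = EdgeCache(origin, ttl_seconds=60.0)
first = cache.get("/news")   # fetched from the origin
second = cache.get("/news")  # answered from the local copy
```

In this sketch the second request never leaves the edge location, which is where the bandwidth and load-time savings would come from. A real CDN layers much more on top, such as cache invalidation and rules about which content may be stored at all.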

This makes for faster website load times, and optimizes images, videos and other high-payload content to show up quickly and smoothly when you land on a web page. Among the boasts on the company’s website, it says it made loading pages on Buzzfeed 50% faster and allowed The New York Times to simultaneously handle 2 million readers on election night. Edge computing also performs vital cybersecurity functions, protecting sites from DDoS attacks and bots, as well as providing a web application firewall.

Because Fastly sits between the back-end web servers and the front-facing internet as we see it, any errors on its part can make whole websites unavailable. The localized nature of the edge cloud platform also means that errors don’t necessarily affect all regions in the same way at the same time (although people all across the world reported experiencing problems on Tuesday).
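The effect of that in-between position can be sketched in a few lines of Python. This is purely illustrative and does not model what happened inside Fastly: the region names, the `edge_healthy` table and both function names are hypothetical.

```python
def origin_server(path):
    # The site's own back-end server, healthy throughout this example.
    return 200, f"page for {path}"


def request_via_edge(region, path, edge_healthy):
    """Route a visitor's request through the edge location for their region."""
    if not edge_healthy.get(region, False):
        # The edge cannot serve the request, so it never reaches the origin.
        return 503, "Service Unavailable"
    return origin_server(path)


# Assumed state: one edge location is broken, another is fine.
edge_healthy = {"london": False, "tokyo": True}

london = request_via_edge("london", "/", edge_healthy)
tokyo = request_via_edge("tokyo", "/", edge_healthy)
```

Here the origin server works the whole time, yet the visitor routed through the broken edge location still gets a 503, while the visitor in the other region sees the page as normal.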

What is a 503 error?

When you see a website displaying a 503 error rather than showing you the page you were expecting, it means the server hosting the website isn’t ready to handle the request. It also indicates that the problem is temporary and that it will likely be resolved soon.

Commonly, it is caused when a server is down for maintenance, or when a website has been overloaded — for example, if too many people are trying to access it at once.
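Because a 503 signals a temporary condition, software that talks to a website often just waits and tries again. Below is a minimal sketch of that pattern in Python. The `fetch` callable is a hypothetical stand-in for a real HTTP client, assumed to return a `(status, headers, body)` tuple, and the delay values are arbitrary. `Retry-After` is a standard HTTP response header. It can also carry a date, but this sketch only handles the plain number-of-seconds form.

```python
import time


def fetch_with_retry(fetch, url, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url) -> (status, headers, body), retrying only on 503."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = fetch(url)
        if status != 503:
            return status, body          # success, or an error retrying won't fix
        if attempt == max_attempts - 1:
            break                        # out of attempts: give up
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)   # the server said how long to wait
        else:
            delay = base_delay * (2 ** attempt)  # otherwise back off: 1s, 2s, 4s...
        sleep(delay)
    return status, body                  # still 503 after the final attempt
```

Waiting longer between each attempt is a deliberate choice: if a site is overloaded, clients that retry immediately only add to the load.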

Why did Fastly fail on Tuesday?

Fastly issues service updates throughout the outage. Screenshot/CNET

We know that Tuesday’s internet outage was caused by a “service configuration,” but not much more than that right now. Until Fastly investigates fully, it’ll be hard to declare the root cause of the catastrophic failure. It’s important to note that it’s not necessarily a cybersecurity attack, as many people have speculated on Twitter. There are many technical reasons a CDN can fail, and cyberattacks are just one of them.

Why were so many websites affected by the Fastly outage?

Fastly is widely used by web publishers, and exactly how widely became apparent on Tuesday when vast swaths of the internet became unavailable.

The reason it’s so popular is that many web properties consider the services it provides essential, but not many companies provide these services. As such, a vast number of websites are reliant on a very small group of companies to keep running. Similar problems were seen when Cloudflare was hit with an outage last July, and when Amazon Web Services went down last November.

As Corinne Cath-Speth, a Ph.D. candidate at the Oxford Internet Institute and the Alan Turing Institute, pointed out on Twitter, this means “a technical hiccup in a single company can have huge ramifications.”

“This in turn — raises major questions about the dangers of (power) consolidation in the cloud market and the unquestioned influence these often invisible actors have over access to information,” she added.