Amazon Web Services (AWS) said late Monday that it had resolved a widespread outage that left thousands of websites and apps offline worldwide for much of the day.
More than 1,000 platforms — including Snapchat and banks such as Lloyds and Halifax — were affected by problems in Amazon’s US-based cloud network. Downdetector, a global outage tracker, recorded more than 11 million user reports during the disruption.
Experts said the incident highlighted the risks of concentrating critical digital infrastructure in the hands of a few cloud giants.
Single fault sparks massive disruption
Professor Alan Woodward from the University of Surrey said the outage exposed how fragile the global internet has become. Many services rely on external systems beyond their control. “Even a minor human error can create worldwide disruptions,” he said.
The outage began around 07:00 BST on Monday, when users reported problems accessing platforms such as Fortnite and Duolingo.
By midday, Downdetector had logged over four million reports across 500 websites — double the normal weekday total. That figure later climbed above 11 million as additional platforms, including Reddit and Lloyds Bank, went offline.
By 23:00 BST, Amazon confirmed that all AWS services had returned to normal after engineers throttled parts of the network to address the root issue.
Cascading failures worsen the impact
Mike Chapple, an IT professor at the University of Notre Dame, compared the outage to a regional power grid failure. He said initial restorations may have triggered new problems before the core fault was fixed. “It’s like restoring flickering lights without repairing the wiring,” he said.
Amazon has not yet given a full explanation. In a brief update, the company said the problem appeared to be linked to DNS resolution of the DynamoDB API endpoint in its US-EAST-1 region.
DNS, or Domain Name System, acts as the internet’s directory, converting website names into numerical addresses computers can read. When DNS fails, browsers cannot locate websites, leaving users cut off entirely.
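The lookup described above can be illustrated with a short Python sketch. It is only an illustration of a generic DNS lookup, not a reconstruction of what failed inside AWS. The helper name `resolve` and its `lookup` parameter are hypothetical, introduced here for the example; `socket.getaddrinfo` and `socket.gaierror` come from Python's standard library.

```python
import socket


def resolve(hostname, lookup=socket.getaddrinfo):
    """Return the unique IP addresses for hostname, or [] if the lookup fails.

    `lookup` is a hypothetical injection point so the function can be
    exercised without network access; by default it uses the system resolver.
    """
    try:
        results = lookup(hostname, None)
    except socket.gaierror:
        # The name could not be resolved: the caller has no address to
        # connect to, even if the server behind the name is running.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address as a string.
    return sorted({info[4][0] for info in results})
```

Calling `resolve("example.com")` on a machine with working DNS would return one or more addresses. When resolution fails, the function returns an empty list, which mirrors the situation the article describes: the service may still exist, but clients have no way to find it.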
Concentration of cloud power raises alarms
Cloudflare CEO Matthew Prince said the outage demonstrated the risks of relying on a few dominant providers. “Everyone has a bad day, and today it was Amazon’s,” he said. “The cloud allows rapid growth, but a single failure can affect millions of users.”
Cori Crider, head of the Future of Technology Institute, compared the disruption to “a bridge collapsing in the digital economy.” She said around 70% of global cloud services rely on Amazon, Microsoft, and Google — a concentration she called “structurally dangerous.”
“When one major provider fails, entire sectors grind to a halt,” Crider said. She urged governments and businesses to invest in local and diversified cloud services to reduce future risk.
Companies urged to improve digital resilience
Cornell University professor Ken Birman said businesses relying on AWS share part of the blame. “Many organisations fail to design strong backup systems for their applications,” he said. Outages occur frequently, but few reach this scale.
Birman added that the technology to build secure and resilient systems already exists. “We know how to prevent failures like this,” he said. “But many companies prioritise speed and convenience over reliability.”
Legal and financial fallout looms
Accountability could end up in the courts. After last year’s CrowdStrike outage, Delta Air Lines is still seeking over $500 million in damages. The airline had to manually restart 40,000 servers, causing several days of flight delays.
The AWS outage has reignited debate over whether the global internet relies too heavily on a handful of tech giants, and whether a single provider’s failure could again paralyse large parts of the digital economy.
