AWS outage impacts Ring, Netflix, and Amazon deliveries

Amazon Web Services (AWS) is suffering an outage in its US-EAST-1 Region that has affected numerous online services, including Ring, Netflix, Amazon Prime Video, and Roku.

The ongoing outage started at approximately 12 PM EST and is caused by impaired network devices in the US-EAST-1 AWS region, which provides connectivity for a good portion of users in the northeastern United States.

The outage disrupted streaming through Netflix, Amazon Prime Video, and Roku, and continues to affect Ring users, who are unable to connect to their cameras.

Outage affecting Ring services

CNBC also reports that Amazon employees posted to Reddit that they could not access their internal apps required to scan packages, access delivery routes, or see their upcoming schedule.

Amazon employee on Reddit #1

Amazon employee on Reddit #2

Amazon shared its latest update on the outage at 7:35 PM EST, stating that the network device issues have been resolved and that services are now recovering.

[12:34 PM PST] We continue to experience increased API error rates for multiple AWS Services in the US-EAST-1 Region. The root cause of this issue is an impairment of several network devices. We continue to work toward mitigation, and are actively working on a number of different mitigation and resolution actions. While we have observed some early signs of recovery, we do not have an ETA for full recovery. For customers experiencing issues signing-in to the AWS Management Console in US-EAST-1, we recommend retrying using a separate Management Console endpoint (such as https://us-west-2.console.aws.amazon.com/). Additionally, if you are attempting to login using root login credentials you may be unable to do so, even via console endpoints not in US-EAST-1. If you are impacted by this, we recommend using IAM Users or Roles for authentication. We will continue to provide updates here as we have more information to share.

[2:04 PM PST] We have executed a mitigation which is showing significant recovery in the US-EAST-1 Region. We are continuing to closely monitor the health of the network devices and we expect to continue to make progress towards full recovery. We still do not have an ETA for full recovery at this time.

[2:43 PM PST] We have mitigated the underlying issue that caused some network devices in the US-EAST-1 Region to be impaired. We are seeing improvement in availability across most AWS services. All services are now independently working through service-by-service recovery. We continue to work toward full recovery for all impacted AWS Services and API operations. In order to expedite overall recovery, we have temporarily disabled Event Deliveries for Amazon EventBridge in the US-EAST-1 Region. These events will still be received & accepted, and queued for later delivery.

[3:03 PM PST] Many services have already recovered, however we are working towards full recovery across services. Services like SSO, Connect, API Gateway, ECS/Fargate, and EventBridge are still experiencing impact. Engineers are actively working on resolving impact to these services.

[4:35 PM PST] With the network device issues resolved, we are now working towards recovery of any impaired services. We will provide additional updates for impaired services within the appropriate entry in the Service Health Dashboard.
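The 12:34 PM PST update recommends retrying through a Management Console endpoint in another region. That advice can be sketched roughly in Python as below. This is only an illustration, not AWS tooling: the function names, the region list, and the reachability probe are hypothetical, and the URL pattern is assumed to generalize from the single us-west-2 example AWS gave.

```python
import urllib.request
from typing import Callable, Iterable, Optional


def console_url(region: str) -> str:
    """Build a regional console URL, assuming the pattern from the AWS update."""
    return f"https://{region}.console.aws.amazon.com/"


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Hypothetical probe: True if the URL answers an HTTP request without error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


def first_reachable(
    regions: Iterable[str],
    is_reachable: Callable[[str], bool] = http_probe,
) -> Optional[str]:
    """Return the first regional console URL the probe accepts, or None."""
    for region in regions:
        url = console_url(region)
        if is_reachable(url):
            return url
    return None
```

Calling `first_reachable(["us-east-1", "us-west-2"])` would try each regional endpoint in order and return the first one that responds. As the update notes, root credential logins may fail even on endpoints outside US-EAST-1, so a reachable console does not guarantee a successful sign-in.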

Today's outage follows a long string of other events since 2011, including a large-scale incident that affected the US-EAST-1 Region in November 2020, bringing down a long list of high-profile sites and online services after Amazon's Kinesis service for real-time processing of streaming data experienced issues.

One year before, in September 2019, a power outage at the AWS US-EAST-1 data center in Northern Virginia caused data loss for Amazon customers who did not have working backups to restore their files.

In February 2017, a massive outage of Amazon's S3 (Simple Storage Service) took down millions of small and high-profile sites and app backends, including Adobe's apps and services, Docker, Giphy, Hacker News, IFTTT, Mailchimp, Medium, Quora, Signal, Slack, Trello, Twilio, and Twitch.

Update 12/7/21 6:25 PM EST - Added further updates from AWS status page. Added final update.