Within a few hours, the malfunctions began hitting customers of Amazon Web Services, the company’s cloud-computing unit. Customers of the Amazon-owned Ring security camera service couldn’t log in or watch video. Users struggled to operate their iRobot vacuum cleaners because the outage affected the iRobot Home App. And media companies, including The Washington Post (privately owned by Amazon chief executive Jeff Bezos), experienced publishing system outages.
Amazon acknowledged that the system failure was exacerbated by the dependencies its various services have on one another. The company had been trying to add capacity to its Amazon Kinesis service, which customers use to process real-time data, including video, audio and application logs. To resolve the issue, Amazon needed to restart a piece of its system it described as “many thousands of servers,” a lengthy process that had to be done gradually. But because other Amazon cloud services rely on Kinesis, including its