Amazon launched its Simple Storage Service (S3) in 2006 and uses it to track over 560 billion objects stored at more than 200 shipping centers worldwide. The service was developed to cope with big data problems in dynamic conditions and has since been offered to other businesses. Netflix now uses it to store all its movies. Dropbox uses it as the cloud backup behind its sharing and syncing service.
S3 was complemented in 2009 by the Elastic MapReduce (EMR) service, which uses Hadoop clusters to better manage big data situations. It was first applied within Amazon to identify and segregate pilferable items that must be stored in high-value cages. Defining these items and related aspects, such as bulkiness, is an oft-repeated chore. Changing theft patterns and item turnover also demand flexible management, which EMR provides. An Amazon team now spins up an EMR cluster twice an hour to screen fifty million items for high-value entries to add to a 1.5-billion-item catalog. What once took days to weeks now takes less than an hour and costs much less.
Hadoop clusters have been called the Swiss army knife of the 21st century because they can deal with varied big data problems. Yelp, a local review site, uses them to find ways to improve the customer experience for 50 million visitors a month. It pays for clusters only while they are being used, and in an average week its eighty engineers spin up some 250 clusters for site jobs. Cycle Computing, an Amazon partner, will spin up a cluster with thirty thousand cores that would cost $18 million the old way but costs only $1,300 an hour for a few hours of use, because you pay for the cores only while they are being used. What problems could your developers solve if every one of them had access to this type of supercomputer?
Alyssa Henry is VP, AWS Storage Services, Amazon.com. She leads Amazon Web Services' Storage Services: Amazon Simple Storage Service (S3), Amazon Glacier, Amazon Elastic Block Store (EBS), AWS Storage Gateway, and AWS Import/Export. This includes responsibility for software development, operations, inbound and outbound product management, and the P&L for each business. She has led most of these services from conception, rapidly scaling the teams, the software, and the businesses through hyper-growth. She helped AWS maintain its first-mover advantage by championing and building innovative new storage services like Amazon Glacier. Previously she was a Product Unit Manager at Microsoft Corporation from 1994 to 2006. She has a 1992 BS in Mathematics/Applied Science with a Specialization in Computing from the University of California at Los Angeles.
This free podcast is from our Web 2.0 Conference series.