AWS Cloud
Get Started with Big Data on AWS

Amazon Web Services provides a broad range of services to help you build and deploy big data analytics applications quickly and easily. AWS gives you fast access to flexible, low-cost IT resources, so you can rapidly scale virtually any big data application, including data warehousing, clickstream analytics, fraud detection, recommendation engines, event-driven ETL, serverless computing, and internet-of-things processing. With AWS, you don’t need to make large upfront investments of time and money to build and maintain infrastructure. Instead, you can provision exactly the type and size of resources you need to power big data analytics applications. You can access as many resources as you need, almost instantly, and pay only for what you use.

Big Data on Amazon Web Services

Build virtually any big data analytics application and support any workload, regardless of the volume, velocity, and variety of data. With 50+ services and hundreds of features added every year, AWS provides everything you need to collect, store, process, analyze, and visualize big data in the cloud.

Managed, distributed computing for big data

Amazon EMR

Easily provision a fully managed Hadoop framework in minutes. Scale your Hadoop cluster dynamically and pay only for what you use. Run popular frameworks such as Apache Spark, Apache Tez, and Presto.

Amazon Elasticsearch Service

Set up and deploy an Elasticsearch cluster in minutes using a web-based console. Seamlessly run your existing Elasticsearch applications using the Elasticsearch open-source API.

Amazon Athena

Easily analyze petabytes of data in Amazon S3 using ANSI SQL. With Amazon Athena, there are no clusters or data warehouses to manage, so you can start analyzing data immediately. You don’t even need to load your data into Athena; it works directly with data stored in S3.
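
To make the "query S3 directly with SQL" idea concrete, here is a minimal Python sketch of how a query might be submitted to Athena. The database, table, and bucket names are hypothetical. `build_athena_request` and `run_query` are illustrative helpers, not part of any AWS library. Only `run_query` talks to AWS, and it assumes the boto3 SDK is installed and credentials are configured.

```python
def build_athena_request(sql: str, database: str, output_s3_uri: str) -> dict:
    """Build keyword arguments for the Athena StartQueryExecution API call."""
    if not output_s3_uri.startswith("s3://"):
        raise ValueError("output_s3_uri must be an s3:// URI")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_s3_uri},
    }


def run_query(request: dict) -> str:
    """Submit the query and return its execution ID (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    client = boto3.client("athena")
    response = client.start_query_execution(**request)
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Hypothetical database, table, and bucket names, for illustration only.
    request = build_athena_request(
        sql="SELECT page, COUNT(*) AS hits FROM clickstream GROUP BY page LIMIT 10",
        database="example_db",
        output_s3_uri="s3://example-bucket/athena-results/",
    )
    print(request)
```

The request names only a database and an S3 location for results. No cluster or warehouse is provisioned first, which is the point the description above makes.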

 

Yelp runs hundreds of Amazon EMR jobs to process over 30 terabytes of data every day. Using Amazon EMR, Yelp was able to save $55,000 in upfront hardware costs and get up and running in a matter of days, not months.


Powerful services to load and analyze streaming data 

Amazon Kinesis Firehose

Easily load massive volumes of streaming data into AWS. Enable near real-time big data analytics with existing BI tools and dashboards you’re using today.

Amazon Kinesis Streams

Build your own custom applications that process or analyze streaming data. Continuously capture and store terabytes of data per hour.

Amazon Kinesis Analytics

Easily analyze streaming data with standard SQL. Kinesis Analytics takes care of everything required to run your queries continuously and scales automatically to match your requirements.

With over 250 digital properties worldwide, including television stations and popular publications such as Cosmopolitan and Car and Driver, Hearst Corporation uses Amazon Kinesis to deliver real-time insights to data scientists and business stakeholders.


Secure, durable, highly scalable storage with a broad set of engines 

Amazon S3

Amazon S3 provides developers and IT teams with highly reliable, secure, and scalable object storage for all their data, big or small.

Amazon DynamoDB

A fully managed, fast, and flexible NoSQL database service for all applications – mobile, web, gaming, ad tech, IoT, and more – that need consistent, single-digit millisecond latency at any scale.

Amazon DynamoDB for Titan

Easily manipulate graphs at massive scale in AWS. Build your graph database using Titan and let DynamoDB handle the performance, scalability, and operational management of storing big data.

Apache HBase is a petabyte-scale, strictly consistent, open-source NoSQL database. Tight integration with the Apache Hadoop ecosystem allows you to combine big data analytics with fast data access. Easily create managed HBase clusters with Amazon EMR.

Amazon Aurora is a relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. Deliver up to 5x the throughput of standard MySQL running on the same hardware.

Amazon RDS

Easily set up, operate, and scale a relational database in the cloud. Choose from six familiar database engines, including Oracle, Microsoft SQL Server, PostgreSQL, MySQL, and MariaDB.

Airbnb is a community marketplace that allows property owners and travelers to connect with each other for the purpose of renting unique vacation spaces. Airbnb uses Amazon S3 to house backups and static files, including 10 TB of pictures. Airbnb also moved its main MySQL database to Amazon RDS, minimizing the time spent on database administrative tasks.


Fully managed, petabyte-scale data warehouse

Amazon Redshift

Easily provision, configure, and deploy a data warehouse within minutes. Amazon Redshift handles all the work needed to manage, monitor, and scale it. Query and analyze big data for less than $1,000 per TB per year. With Redshift Spectrum, you can also run SQL queries directly against exabytes of unstructured data in Amazon S3.

Nasdaq achieved faster, richer analytics and data warehousing capabilities while reducing costs by 57% by shifting to Amazon Redshift.


Fast, cloud-powered BI for 1/10th the cost of traditional solutions

Amazon QuickSight

Deliver rich BI functionality to everyone in your organization. Enable employees to easily build visualizations, perform ad-hoc analysis, and quickly get business insights from big data. Perform advanced calculations and render visualizations rapidly.

Amazon QuickSight uses SPICE – a Super-fast, Parallel, In-memory optimized Calculation Engine

Get suggestions for the best possible visualizations, optimized for your data, to help you get quick, actionable business insights.


Cloud-native machine learning and deep learning technologies to address a broad set of use cases and needs

Amazon Lex

Amazon Lex provides the advanced deep learning functionalities of automatic speech recognition (ASR) and natural language understanding (NLU) to enable you to build applications with life-like conversational interactions.

Amazon Polly

Amazon Polly is a service that turns text into lifelike speech. Polly lets you create applications that talk, enabling you to build entirely new categories of speech-enabled products.

Amazon Rekognition

Amazon Rekognition is a service that makes it easy to add image analysis to your applications. With Rekognition, you can detect objects, scenes, and faces in images. You can also search and compare faces.

Amazon Machine Learning

Amazon Machine Learning is a managed service that provides visualization tools and wizards that guide you through the process of creating machine learning (ML) models without having to learn complex ML algorithms and technology. 

Apache MXNet

Running Apache MXNet on AWS provides a highly scalable, flexible, and fast model training experience for developers using deep learning.

"We are excited about utilizing evolving speech recognition and natural language processing technology to enhance the lives of our customers."

– Michael Krouse, Senior VP Operational Support and CIO, OhioHealth


Easily and securely connect devices to the cloud. Scale to billions of devices and trillions of messages.

AWS IoT

Easily and securely connect devices to the cloud. Enable applications to interact with devices even when they are offline. Use AWS services to gather, process and act on data, without having to manage any infrastructure.

AWS Greengrass

Run local compute, messaging & data caching for connected devices in a secure way. With AWS Greengrass, connected devices can run AWS Lambda functions, keep device data in sync, and communicate with other devices securely – even when not connected to the Internet.

How AWS IoT Works

Run code without thinking about servers. Pay only for the compute time you consume.

AWS Lambda

Run code without provisioning or managing servers. Pay only for the compute time you consume. Just upload your code and Lambda takes care of everything required to run and scale your code with high availability.

Zillow uses AWS Lambda and Amazon Kinesis to track a subset of mobile metrics in real time. With Kinesis and Lambda, Zillow was able to develop and deploy a cost-effective solution in two weeks.
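
As a rough illustration of the Lambda-plus-Kinesis pattern described above, the Python sketch below shows what a Lambda function that tallies metrics from a batch of Kinesis records could look like. It is not Zillow's implementation. The payload shape (`{"metric": "<name>"}`) and the metric names are assumptions made for the example. The only AWS-specific detail relied on is that Kinesis record payloads arrive base64-encoded under `Records[i].kinesis.data`.

```python
import base64
import json
from collections import Counter


def handler(event, context):
    """Count how often each metric name appears in a batch of Kinesis records."""
    counts = Counter()
    for record in event.get("Records", []):
        # Kinesis delivers each record's payload base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        # Assumed (hypothetical) payload shape: {"metric": "<name>"}
        metric = json.loads(payload)
        counts[metric["metric"]] += 1
    return dict(counts)


if __name__ == "__main__":
    def encode(obj):
        """Encode a dict the way it would appear inside a Kinesis event."""
        return base64.b64encode(json.dumps(obj).encode("utf-8")).decode("ascii")

    # Hand-built sample event for local testing; no AWS access is needed.
    sample_event = {
        "Records": [
            {"kinesis": {"data": encode({"metric": "app_open"})}},
            {"kinesis": {"data": encode({"metric": "search"})}},
            {"kinesis": {"data": encode({"metric": "app_open"})}},
        ]
    }
    print(handler(sample_event, None))
```

Because the handler is a plain function, it can be run locally with a hand-built event before being uploaded, after which Lambda invokes it for each batch the stream delivers.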


Powerful Compute Instances for Big Data Analytics

Compute-optimized instances, such as C4 instances, feature the highest-performing processors and the lowest price/compute performance in EC2. With support for clustering, C4 instances are ideal for batch processing, distributed analytics, high-performance science and engineering applications, ad serving, MMO gaming, and video encoding.

Featuring up to 48 TB of HDD-based local storage, dense-storage instances deliver high throughput and offer the lowest price per disk throughput performance on EC2. They are ideal for massively parallel processing (MPP), Hadoop, distributed computing, distributed file systems, network file systems, and big data processing applications.

Memory-optimized instances have the lowest cost per GiB of RAM among Amazon EC2 instance types. These instances are ideal for high-performance databases, distributed memory caches, in-memory analytics, genome assembly and analysis, and other large enterprise applications.

GPU instances are ideal for powering graphics-intensive applications such as 3D streaming, machine learning, and video encoding. Each instance features high-performance NVIDIA GPUs with an on-board hardware video encoder designed to support up to eight real-time HD video streams (720p@30fps) or up to four real-time full HD video streams (1080p@30fps).


Simple, fast, secure data migration services to and from AWS 

AWS Direct Connect

Reduce your bandwidth costs and transfer data directly to and from AWS. Establish private connectivity between AWS and your datacenter, office, or colocation environment.

AWS Snowball

Avoid high network costs, long transfer times, and security concerns with Snowball, a petabyte-scale data transport appliance that securely transfers large amounts of data at 1/5 the cost of high-speed internet.

AWS Database Migration Service

Migrate databases to AWS easily and securely. Start with just a few clicks in the AWS Management Console, then let AWS manage all the complexities of the migration process.

AWS Storage Gateway

Augment existing on-premises storage investments with the high scalability, extreme durability and low cost of AWS cloud storage.