AWS re:Invent 2014

(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR and Analyzing Results with Amazon Redshift | AWS re:Invent 2014


Listen Later

We will explore the strengths and limitations of Hadoop for analyzing large data sets and review the growing ecosystem of tools for augmenting, extending, or replacing Hadoop MapReduce. We will introduce the Amazon Elastic MapReduce (EMR) platform as the big data foundation for Hadoop and beyond by providing specific examples of running Machine Learning (Mahout), Graph Analytics (Giraph), and Statistical Analysis (R) on EMR. We will discuss also big data analytics and visualization of results with Amazon Redshift + third party business intelligence tools, as well as typical end-to-end Big Data workflow on AWS.
We will conclude with real-world examples from ICAO of Big Data analytics for aviation safety data on AWS. The integrated Safety Trend Analysis and Reporting System (iSTARS) is a web based system linking a collection of safety datasets and related web application to perform online safety and risk analysis. It uses AWS EC2, S3, EMR and related partner tools for continuous data aggregation and filtering.
...more
View all episodesView all episodes
Download on the App Store

AWS re:Invent 2014By Amazon Web Services