Amazon EMR is a managed cluster platform, with a serverless option, that simplifies running big data frameworks such as [[Apache Hadoop]] and [[Apache Spark]] on Amazon Web Services (AWS) to process and analyze vast amounts of data.

In this course, you will learn the benefits and technical concepts of Amazon EMR.

If you are new to the service, you will learn how to start using Amazon EMR through a demonstration that uses the AWS Management Console and the AWS Command Line Interface (AWS CLI).

You will also learn about the native architecture and how the built-in features can help you process data for analytics and business intelligence workloads.

What does Amazon EMR do?

With Amazon EMR, you can process vast amounts of data efficiently using Apache Hadoop and other services offered by AWS.

Use Amazon EMR to run large-scale distributed data processing jobs, interactive queries, and machine learning (ML) applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.
Amazon EMR automates time-consuming tasks like setup, tuning, monitoring, and capacity planning.
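
As a rough illustration of what running a distributed job on an Amazon EMR cluster can look like in code, the sketch below builds the keyword arguments for the boto3 EMR client's `run_job_flow` call, which launches a cluster and runs one Spark step. The bucket paths, release label, instance type, and the helper names `build_spark_cluster_request` and `launch_cluster` are assumptions made for this example, not values from the course. The default roles `EMR_DefaultRole` and `EMR_EC2_DefaultRole` are assumed to exist in the account already.

```python
import json
from typing import Any, Dict


def build_spark_cluster_request(
    name: str,
    log_uri: str,
    script_uri: str,
    release_label: str = "emr-6.15.0",  # assumed release label; use one available to you
    instance_type: str = "m5.xlarge",   # assumed instance type
    core_count: int = 2,
) -> Dict[str, Any]:
    """Build keyword arguments for boto3's EMR `run_job_flow` (hypothetical helper)."""
    return {
        "Name": name,
        "LogUri": log_uri,
        "ReleaseLabel": release_label,
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {
                    "Name": "Primary",
                    "InstanceRole": "MASTER",
                    "InstanceType": instance_type,
                    "InstanceCount": 1,
                },
                {
                    "Name": "Core",
                    "InstanceRole": "CORE",
                    "InstanceType": instance_type,
                    "InstanceCount": core_count,
                },
            ],
            # Terminate the cluster once the step below has finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": "Run Spark script",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri],
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumed default instance profile
        "ServiceRole": "EMR_DefaultRole",      # assumed default service role
    }


def launch_cluster(request: Dict[str, Any]) -> str:
    """Submit the request to AWS. Needs boto3 and credentials; not called here."""
    import boto3  # imported lazily so the request builder works without boto3

    emr = boto3.client("emr")
    return emr.run_job_flow(**request)["JobFlowId"]


if __name__ == "__main__":
    request = build_spark_cluster_request(
        name="demo-cluster",
        log_uri="s3://example-bucket/logs/",            # hypothetical bucket
        script_uri="s3://example-bucket/scripts/job.py",  # hypothetical script
    )
    print(json.dumps(request, indent=2))
```

Building the request separately from submitting it makes the configuration easy to inspect or review before any cluster is started, since `launch_cluster` is the only part that contacts AWS and incurs cost.
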
How Amazon EMR works
![[Product-Page-Diagram_Amazon-EMR.png]]
Amazon EMR is a cloud big data solution for petabyte-scale data processing, interactive analytics, and ML using open-source frameworks.
- Amazon EMR Serverless is a serverless option for Amazon EMR that lets data analysts and engineers run open-source big data analytics frameworks without configuring, managing, or scaling clusters or servers.
- You can use Amazon Simple Storage Service (Amazon S3) to store data in data lakes.
- You can then run big data processing, real-time analytics, and machine learning on that data, and use dashboards and visualizations to derive insights that guide better decisions.
![[Product-Page-Diagram_Amazon-EMR-Serverless.jpg]]
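
To make the serverless flow more concrete, here is a minimal sketch that builds the arguments for the boto3 `emr-serverless` client's `start_job_run` call, submitting a Spark script stored in Amazon S3 without managing any cluster. It assumes an EMR Serverless application and an execution role already exist; the application ID, role ARN, bucket paths, and the helper names `build_serverless_job_request` and `submit_job` are placeholders invented for this example.

```python
from typing import Any, Dict, List, Optional


def build_serverless_job_request(
    application_id: str,
    execution_role_arn: str,
    entry_point: str,
    arguments: Optional[List[str]] = None,
    log_uri: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for `start_job_run` (hypothetical helper)."""
    request: Dict[str, Any] = {
        "applicationId": application_id,
        "executionRoleArn": execution_role_arn,
        "jobDriver": {
            "sparkSubmit": {
                "entryPoint": entry_point,  # Spark script location in Amazon S3
                "entryPointArguments": list(arguments or []),
            }
        },
    }
    if log_uri:
        # Optionally send job logs to an S3 prefix.
        request["configurationOverrides"] = {
            "monitoringConfiguration": {
                "s3MonitoringConfiguration": {"logUri": log_uri}
            }
        }
    return request


def submit_job(request: Dict[str, Any]) -> str:
    """Submit the job run. Needs boto3 and credentials; not called here."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("emr-serverless")
    return client.start_job_run(**request)["jobRunId"]


if __name__ == "__main__":
    request = build_serverless_job_request(
        application_id="00example1234567",  # placeholder application ID
        execution_role_arn="arn:aws:iam::123456789012:role/example-emr-serverless-role",
        entry_point="s3://example-bucket/scripts/job.py",   # hypothetical script
        arguments=["s3://example-bucket/input/", "s3://example-bucket/output/"],
        log_uri="s3://example-bucket/logs/",
    )
    print(request)
```

Note that the request names no instance types or node counts: in this model the data lives in Amazon S3 and capacity is handled by the service, which is the main contrast with the cluster-based option.
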