data-analytics

  • Amazon EMR is a managed cluster platform, with a serverless option, that simplifies running big data frameworks such as [[Apache Hadoop]] and [[Apache Spark]] on Amazon Web Services (AWS) to process and analyze large volumes of data.

  • In this course, you will learn the benefits and technical concepts of Amazon EMR.

  • If you are new to the service, you will learn how to start using Amazon EMR through a demonstration using the AWS Management Console and AWS Command Line Interface (AWS CLI).

  • You will learn about the native architecture and how the built-in features can help you process data for analytics purposes and business intelligence workloads.

What does Amazon EMR do?

  • With Amazon EMR, you can process vast amounts of data efficiently using Apache Hadoop and services offered by Amazon Web Services (AWS).

  • Use Amazon EMR to run large-scale distributed data processing jobs, interactive queries, and machine learning (ML) applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.

Amazon EMR automates time-consuming tasks like setup, tuning, monitoring, and capacity planning.
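
To make the idea of running a distributed job on a cluster more concrete, here is a minimal sketch of how a Spark job might be submitted through the boto3 `run_job_flow` call. The release label, instance types, bucket paths, and the helper names `build_cluster_request` and `submit` are assumptions for illustration, not values from the course. The role names are the default EMR roles and may differ in your account.

```python
def build_cluster_request(name, log_uri, script_uri):
    """Build keyword arguments for the boto3 EMR run_job_flow call.

    Hypothetical helper: every value below is an illustrative assumption.
    """
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.15.0",          # assumed release label
        "Applications": [{"Name": "Spark"}],
        "LogUri": log_uri,
        "Instances": {
            "InstanceGroups": [
                {"Name": "Primary", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            # Terminate the cluster once all steps have finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": "Run Spark script",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri],
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",  # default instance profile name
        "ServiceRole": "EMR_DefaultRole",      # default service role name
    }


def submit(request, region="us-east-1"):
    """Send the request to Amazon EMR (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder works without boto3 installed

    emr = boto3.client("emr", region_name=region)
    return emr.run_job_flow(**request)["JobFlowId"]


if __name__ == "__main__":
    request = build_cluster_request(
        "notes-demo",
        "s3://example-bucket/logs/",       # hypothetical bucket
        "s3://example-bucket/jobs/job.py"  # hypothetical script
    )
    print(request["Steps"][0]["HadoopJarStep"]["Args"])
```

Keeping the request builder separate from the network call means the configuration can be inspected or reviewed before any cluster is launched.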

How Amazon EMR works

![[Product-Page-Diagram_Amazon-EMR.png]]

Amazon EMR is a cloud big data solution for petabyte-scale data processing, interactive analytics, and ML using open-source frameworks.

  • Amazon EMR Serverless is a serverless option for Amazon EMR that makes it easy for data analysts and engineers to run open-source big data analytics frameworks without configuring, managing, and scaling clusters or servers.
  • You can use Amazon Simple Storage Service (Amazon S3) to store data in data lakes.
  • Then, you can run big data processing, real-time analytics, and machine learning on that data, and use dashboards and visualizations to derive insights that guide better decisions.
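
As a rough illustration of the serverless flow described in the list above, the sketch below prepares a Spark job run for an existing EMR Serverless application using the boto3 `emr-serverless` client. The application ID, role ARN, S3 paths, and the helper names `build_job_run` and `start` are hypothetical. It assumes the application and execution role already exist.

```python
def build_job_run(application_id, role_arn, script_uri, args=()):
    """Build keyword arguments for the emr-serverless start_job_run call.

    Hypothetical helper: the caller supplies all identifiers and paths.
    """
    return {
        "applicationId": application_id,
        "executionRoleArn": role_arn,
        "jobDriver": {
            "sparkSubmit": {
                "entryPoint": script_uri,             # script stored in Amazon S3
                "entryPointArguments": list(args),    # e.g. input and output paths
            }
        },
    }


def start(job_run, region="us-east-1"):
    """Start the job run (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder works without boto3 installed

    client = boto3.client("emr-serverless", region_name=region)
    return client.start_job_run(**job_run)["jobRunId"]


if __name__ == "__main__":
    job_run = build_job_run(
        "00example123",                                    # hypothetical application ID
        "arn:aws:iam::123456789012:role/ExampleJobRole",   # hypothetical role ARN
        "s3://example-bucket/jobs/job.py",
        args=["s3://example-bucket/input/", "s3://example-bucket/output/"],
    )
    print(job_run["jobDriver"]["sparkSubmit"]["entryPoint"])
```

Note that no instance types or counts appear in the request, which reflects the point that there are no clusters or servers to configure in the serverless option.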

![[Product-Page-Diagram_Amazon-EMR-Serverless.jpg]]
