Introduction To Analytics On AWS

The document introduces analytics capabilities on AWS. It discusses how customers want more value from data that is growing exponentially from diverse new sources. AWS offers scalable data lakes, purpose-built analytics services for performance and cost, and serverless and easy to use options. The document also summarizes several AWS analytics services, including Amazon Redshift, Amazon EMR, Amazon OpenSearch Service, Amazon Kinesis, Amazon MSK, and AWS Lake Formation.

Introduction to Analytics on AWS

Lesly Reyes
Telco Solutions Architect

© 2022, Amazon Web Services, Inc. or its affiliates.
Customers want more value from their data

Growing exponentially
From new sources
Increasingly diverse
Used by many people
Analyzed by many applications


Modern data strategy in action

[Diagram labels: Data Sources; Data Lakes; Databases; Analytics; Machine Learning; Catalog; Governance; People, Apps, and Devices]


AWS Analytics Pillars

Scalable data lakes
Purpose-built for performance and cost
Serverless and easy to use
Unified data access, security, and governance
Built-in machine learning


Broadest portfolio of analytics tools

Amazon S3
Unmatched durability, availability, and scalability
Built to store and retrieve any amount of data


The benefits of data lakes

Store all your data in open formats
Cost-effectively scale storage to exabytes
Decouple storage from compute
Choice of analytical and ML engines
Process data in place

[Diagram labels: Catalog; Data lake]

Amazon Redshift: Data warehousing
Amazon Athena: Query all your data using SQL or Python
Amazon EMR: Big data processing
Amazon OpenSearch Service: Log and search analytics
Amazon Kinesis & MSK: Real-time analytics


AWS has the most serverless options for data analytics in the cloud

AWS Glue: Data integration, ETL, and Catalog
Amazon Redshift: Data warehousing
Amazon EMR: Big data processing
Amazon Athena: Interactive analytics
Amazon Kinesis: Real-time analytics
Amazon MSK: Real-time analytics
Amazon QuickSight: Visualization
AWS Lake Formation: Data lake setup, management, and governance


Challenges of building and securing modern data lakes

Support updates and deletes
Row-level fine-grained secure sharing
Automatic storage optimization


Break down data silos

Extract, transform, load
Visual data preparation
Data replication
Data warehouse to/from data lake
Federated query


aws.amazon.com/analytics


Amazon Redshift
THE BEST PRICE-PERFORMANCE FOR CLOUD DATA WAREHOUSING

Analyze all your data
Price-performance at any scale
Easy, secure, and reliable


Amazon EMR
RUN BIG DATA APPLICATIONS IN THE CLOUD

Fully managed and customizable
Latest open-source releases
Automatically scale up and down
Best price-performance


Amazon EMR
RUN BIG DATA APPLICATIONS IN THE CLOUD

Amazon EMR Studio for interactive data analytics
Multiple deployment models
Amazon S3 data lake integration


Amazon OpenSearch Service
SUCCESSOR TO AMAZON ELASTICSEARCH SERVICE

Fully managed
Log and search analytics
Cost effective


Amazon Kinesis
COLLECT, PROCESS, AND ANALYZE VIDEO AND DATA STREAMS IN REAL TIME

Kinesis Data Streams
Kinesis Data Analytics
Kinesis Video Streams
Kinesis Data Firehose
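
As an illustration of the "collect" step with Kinesis Data Streams, the following is a minimal Python sketch, not taken from the original deck. It assumes a client created with boto3.client("kinesis") is passed in; the stream name, the event fields, and the helper names build_put_record_request and send_event are hypothetical.

```python
import json


def build_put_record_request(stream_name, event, key_field):
    """Build the keyword arguments for a Kinesis Data Streams PutRecord call."""
    return {
        "StreamName": stream_name,
        # The record payload is sent as bytes; JSON is an assumed encoding.
        "Data": json.dumps(event, sort_keys=True).encode("utf-8"),
        # Records that share a partition key are routed to the same shard.
        "PartitionKey": str(event[key_field]),
    }


def send_event(kinesis_client, stream_name, event, key_field="device_id"):
    """Send one event. kinesis_client is assumed to be boto3.client("kinesis")."""
    request = build_put_record_request(stream_name, event, key_field)
    return kinesis_client.put_record(**request)
```

Keeping the request builder separate from the network call makes it possible to check the payload and partition key without an AWS account.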


Amazon MSK
FULLY MANAGED, HIGHLY AVAILABLE, AND SECURE

Compatible
Fully managed
Highly available
Secure


AWS Lake Formation
BUILD SECURE DATA LAKES

Portfolio of integrated analytics tools: Amazon Athena, Amazon QuickSight, Amazon Redshift, AWS Glue, Amazon SageMaker, Amazon EMR

Lake Formation
Simplified ingest and cleaning: AWS Glue, Blueprints, ML Transform
Reliable and optimized data lakes: ACID Transactions, Storage Optimization
Catalog, Permissions

Amazon S3
Cost effective, durable data lake storage with global replication capabilities


Simplify security management with AWS Lake Formation

[Diagram labels: Amazon Redshift; Amazon EMR; Amazon Athena; Data Lake Admin; Lake Formation; Access Control; Data Catalog; Amazon S3 data lake storage; Data Lake]
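
To make the access-control idea concrete, here is a hedged Python sketch, not part of the original slides, of granting column-level SELECT on a catalog table through Lake Formation. It assumes the caller passes a client from boto3.client("lakeformation"); the principal ARN, database, table, and column names are hypothetical, and grant_column_select is an illustrative helper name.

```python
def grant_column_select(lf_client, principal_arn, database, table, columns):
    """Grant SELECT on specific columns of a Data Catalog table.

    lf_client is assumed to be boto3.client("lakeformation").
    Returns the request that was sent, for logging or review.
    """
    request = {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            # TableWithColumns limits the grant to the listed columns.
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }
    lf_client.grant_permissions(**request)
    return request
```

Under this sketch, an engine such as Amazon Athena that queries the table on behalf of the principal would see only the granted columns, which is the kind of centralized control the diagram describes.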


Amazon QuickSight
CLOUD-NATIVE BI SOLUTION FOR ILLUMINATING ORGANIZATIONAL INSIGHTS

Auto-scaling and serverless
Internal and/or external users
Deeply integrated with AWS services
Augmented insights on-demand


AWS Glue
SIMPLE, SCALABLE, AND SERVERLESS

Integrate data faster
Automate at scale
No servers to manage
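
One way to picture "automate at scale" is triggering an existing Glue job from code. The sketch below is illustrative only and is not from the original deck: it assumes a client from boto3.client("glue") is passed in and that a job already exists; the job name and argument names are hypothetical, as is the helper name start_glue_job.

```python
def start_glue_job(glue_client, job_name, job_args):
    """Start a run of an existing AWS Glue job and return its run ID.

    glue_client is assumed to be boto3.client("glue").
    """
    # Glue job arguments are passed as strings keyed with a leading "--".
    arguments = {f"--{name}": str(value) for name, value in job_args.items()}
    response = glue_client.start_job_run(JobName=job_name, Arguments=arguments)
    return response["JobRunId"]
```

Because the service is serverless, the caller supplies only the job name and arguments; no cluster is provisioned in this code.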


AWS Glue: Key Capabilities
SERVERLESS DATA INTEGRATION SERVICE

Scalable Data Integration Engine: Built-in data transforms; Execution engine; Monitor
Centralized and Unified Data Governance: Glue data catalog; Glue crawlers; Lake Formation
Connect and Ingest Data: Glue connectors; Glue connector marketplace; Variety of interfaces
User Productivity and Data Ops: Persona specific tools; Productivity tools; Data ops tools

Amazon Athena
QUERY ALL YOUR DATA USING SQL OR PYTHON

Simple, instant start
Interactive, advanced analytics
Open and flexible
Cost-effective
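
As a rough illustration of querying with SQL through Athena, the following Python sketch (not from the original deck) starts a query and polls until it reaches a terminal state. It assumes a client from boto3.client("athena") is passed in; the database name, the S3 output location, and the helper name run_athena_query are hypothetical.

```python
import time


def run_athena_query(athena_client, sql, database, output_location,
                     poll_seconds=1.0, max_polls=60):
    """Start a query and wait for a terminal state; return (query_id, state).

    athena_client is assumed to be boto3.client("athena").
    """
    started = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        # Athena writes result files to this S3 location.
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    state = "QUEUED"
    for _ in range(max_polls):
        status = athena_client.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)
    return query_id, state
```

The function returns the last observed state even if polling runs out, so the caller decides how to handle a query that is still running.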

SIMPLE, INSTANT START
Serverless, no setup
Instant start, optimized runtimes for fast results
Point to S3 and start querying

INTERACTIVE, ADVANCED ANALYTICS
Federated queries across 35+ data stores
Use PySpark ecosystem
Simplified notebooks on console for PySpark

OPEN AND FLEXIBLE
ANSI SQL, Apache Spark
Multiple formats, compression types, and complex joins and data types

COST EFFECTIVE
Pay only for what you use
SQL: Save on per-query costs through compression
Python: minimize idle compute charges
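
The point about saving on per-query costs through compression can be sketched with simple arithmetic. The example below is an illustration under stated assumptions, not a pricing reference: it assumes SQL queries are billed by bytes scanned, uses a hypothetical price of 5 USD per decimal terabyte, and assumes a hypothetical 4x compression ratio. Check the current Athena pricing page for real figures.

```python
BYTES_PER_TB = 10 ** 12  # assumption: decimal terabytes


def estimate_scan_cost(bytes_scanned, usd_per_tb=5.0):
    """Estimate the cost of one query from the bytes it scans.

    usd_per_tb is an assumed price, not an authoritative figure.
    """
    return bytes_scanned / BYTES_PER_TB * usd_per_tb


raw_bytes = 1 * BYTES_PER_TB   # hypothetical table size before compression
compression_ratio = 4          # hypothetical ratio achieved by compression

uncompressed_cost = estimate_scan_cost(raw_bytes)
compressed_cost = estimate_scan_cost(raw_bytes / compression_ratio)
```

Under these assumptions, scanning a quarter of the bytes costs a quarter as much, which is why compression lowers per-query cost when billing follows data scanned.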

[Diagram labels: Amazon DynamoDB; Amazon EMR; Amazon OpenSearch Service; Amazon Aurora; Amazon S3; Amazon Redshift; Amazon SageMaker]

AWS Glue Studio
ETL developer

Rich visual interface
250+ built-in transformations
Profile data to understand data patterns and anomalies
Work on large datasets at scale

AWS Glue DataBrew
Business Analyst, Data Scientist

Run ETL jobs without writing code
Monitor thousands of jobs through a single pane of glass
Distributed processes
Advanced transforms through code snippets

AWS Glue Notebooks
Data engineer

Clean and normalize data with a rich visual interface
Choose from 250+ built-in transformations to automate tasks
Profile data to understand data patterns and anomalies

Where to start

Thank you!
Lesly Reyes
reylesl@amazon.com
