Data Engineer - Copie

The document outlines a training schedule for a Data Engineer from May to November 2024, covering essential skills such as programming in Python, SQL, cloud fundamentals, ETL processes, and big data engineering using tools like Spark and Apache Kafka. It includes learning objectives for each month, focusing on data transformation, data warehousing, and data visualization with tools like Tableau and Power BI. The curriculum emphasizes hands-on experience with various data processing libraries and cloud platforms.

DATA ENGINEER

May 2024
• Learn Programming
1. Basic Python
2. Working with Data/Files
• Learn Basics of Relational Database
1. SQL Server/MySQL/PostgreSQL

June 2024
• Learn SQL Programming in detail
• Rank Window Functions
• Aggregations
• Data wrangling and Analysis
• Data warehouse concepts
• Data Modeling for warehouse

June 2024
• Cloud Fundamentals (Azure / AWS / GCP)
• Hadoop Ecosystem (HDFS, MapReduce, YARN, Sqoop, Hive, etc.)

July 2024
• ETL using Python/Spark
• Data Processing Libraries / Constructs (NumPy, Pandas, RDD, Spark, Dataframe)

July 2024
• Big Data Engineering Using Spark
• Optimization in Spark
• Workflow Schedules (Airflow)

August 2024
• Data Engineering (AWS / GCP / Azure)
• Cloud Data Warehouse (Snowflake / Databricks / Redshift)

August 2024
• Handling Data streaming
• Processing Streaming Data
• Apache Kafka Links

September 2024
• Data Transformation Tools
• DBT

September 2024
• Dashboarding & Visualization
• Tableau / Power BI / Looker

October 2024
• Docker, DataOps, Azure DevOps

October 2024
• Data Lakehouse, Data Mesh, Data Fabrics

November 2024
• Data Quality, Data Observability, Data Governance
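
As a rough illustration of the June 2024 SQL topics (rank window functions and aggregations), the sketch below uses Python's built-in sqlite3 module. This is a stand-in chosen for portability: the roadmap itself names SQL Server, MySQL, and PostgreSQL. The `sales` table, its columns, and the sample rows are hypothetical, and the window function assumes the bundled SQLite is version 3.25 or newer.

```python
import sqlite3

# Hypothetical sample data: (region, rep, amount). Not from the roadmap.
SALES = [
    ("north", "ana", 300),
    ("north", "bo", 500),
    ("north", "cy", 500),
    ("south", "di", 200),
    ("south", "ed", 400),
]


def load(conn):
    """Create the illustrative sales table and insert the sample rows."""
    conn.execute("CREATE TABLE sales (region TEXT, rep TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", SALES)


def totals_by_region(conn):
    """Aggregation: one output row per region with the summed amount."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()


def ranked_reps(conn):
    """Window function: rank reps inside each region without collapsing rows.

    RANK() gives tied amounts the same rank and skips the following rank.
    """
    return conn.execute(
        """
        SELECT region, rep, amount,
               RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
        FROM sales
        ORDER BY region, rnk, rep
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
load(conn)
region_totals = totals_by_region(conn)
rep_ranks = ranked_reps(conn)
print(region_totals)
print(rep_ranks)
```

The contrast worth noticing is that GROUP BY returns one row per region, while the window function keeps every row and adds a per-region rank next to it.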
