Data Engineer (Azure & Fabric)


Data Engineer
JOB ORIENTED PROGRAM
Become a Data Engineer with Top Trending Tools
and Land High Paying Jobs at Top Companies

Azure, Kafka, Fabric, Snowflake, Apache Airflow, Databricks

+1 415-481-4467 (US) +91-7793068396 (IND)


Index
About Prepzee
About the Program
Program Features
Roadmap
Tools Covered
Labs to Perform
Unlock Bonuses worth $600
Data Engineer Curriculum
Projects
Placement Support
Learn from Industry Experts
What Our Learners Have To Say


About Prepzee
Our Mission & Impact

The More You LEARN, The More You EARN

100+ Training Programs
500+ Top Educators
9.1/10 Avg. Rating
100K+ Hours of Training Delivered

Mentors from brands like

We are featured In
Jio News

Indian News Network.net



About the Program

The Data Engineer Job Oriented Program is a comprehensive
training program that will teach you the skills you need for a
variety of Data Engineer roles.

This program is designed to help you prepare for your dream
job in the most in-demand Data industry by using top industry
tools such as Microsoft Fabric, Snowflake, Airflow, Kafka,
Spark, Eventstreams, OneLake, and T-SQL to create data
pipelines using Azure Cloud and PySpark.

We will cover the most in-demand industry projects and hands-on
labs, and we will help you with your CV, LinkedIn, and job
interview preparation. You will have the opportunity to gain
hands-on experience with Cloud-based Data Solutions by
working on projects under the one-on-one guidance of
experienced Data Engineer Professionals.

In this comprehensive program, you will acquire the skills
required for various Data Engineer Roles, including Azure Data
Engineer, Data Integration Specialist, Cloud Data Warehouse
Engineer, Data Consultant, Data Architect, and more.

Also, you can clear the Azure DP-700 certification, and we will
provide you with the necessary study materials and
exam preparation kit so you can pass it. This will
give you a significant advantage when you start
applying for jobs.

To learn more about the Data Engineer Job
Oriented Program and to enroll, schedule a free
call today.

Program Features

• 100+ Hours of Live Training
• 80+ Hours Hands-on & Exercises
• 8+ Projects & Case Studies
• 24*7 Technical Support
• Top 1% Industry Experts
• Lifetime Live Training Access

Need More Info? Chat with Us


ROADMAP

1. Python for Data Engineering: Used for building data pipelines,
processing data, and automating tasks.

2. All about Data Warehousing: Understand the Star Schema,
Snowflake Schema, Dimension tables and more.

3. Data Engineering with Azure & Fabric: Building scalable pipelines and
analytics workflows using cloud-based tools for data
processing and storage.

4. PySpark/Databricks: Used for distributed data
processing and analytics at scale with Spark, leveraging Databricks
for seamless collaboration and cloud integration.

5. Snowflake: A cloud data platform used for data warehousing,
with storage and compute managed separately.

6. Orchestration with Airflow: An orchestrator that schedules and
runs pipelines.

7. Kafka Streaming: A distributed event-streaming technology,
often used as a message queue, and used by 70% of Fortune 500
companies.

8. CV, Interview Prep & LinkedIn Profile Update

Tools Covered

• OneLake
• Fabric Eventhouse
• KQL
• Fabric Data Warehouse
• Fabric Data Factory
• Airflow
• Kafka
• Azure


Labs to Perform
LAB 1 - Data Exploration and Transformation in Microsoft Fabric.

LAB 2 - Use of Snowflake for data storage, querying, and managing scalable
cloud data warehousing solutions.

LAB 3 - Configuring a Single Node, Single Broker Cluster in Kafka

LAB 4 - Use of OneLake for unified data storage, management, and seamless
integration across Microsoft Fabric services.

LAB 5 - Use of Fabric Data Factory to design, automate, and manage data
pipelines for seamless data integration and transformation.

LAB 6 - Explore Transform and Load Data into the Data Warehouse using Spark

LAB 7 - Ingest and Load Data into the Data Warehouse

LAB 8 - Transform Data with Azure Data Factory

LAB 9 - Real Time Stream Processing with Stream Analytics

LAB 10 - Create a Stream Processing Solution with Event Hub and Databricks


Unlock Bonuses worth $600

Bonus 1 - AWS Cloud Practitioner Certification Course (FREE COURSE)
Bonus 2 - Linux Fundamentals (FREE COURSE)
Bonus 3 - Designing Data Intensive Applications (FREE PLAYBOOK)
Bonus 4 - Azure DP 203 Master Cheat Sheet (FREE CHEAT SHEET)
Bonus 5 - Playbook of 97 Things Every Data Engineer Should Know (FREE PLAYBOOK)

Data Engineer Curriculum

Module 1 – Python for Data Engineering

1.1: Overview of Python


1.2: Different Applications where Python is Used
1.3: Values, Types, Variables
1.4: Operands and Expressions
1.5: Conditional Statements
1.6: Loops
1.7: Command Line Arguments
1.8: Writing to the Screen
1.9: Python files I/O Functions
1.10: Numbers
1.11: Strings and related operations
1.12: Tuples and related operations
1.13: Lists and related operations
1.14: Dictionaries and related operations
1.15: Sets and related operations

Hands On:
• Creating “Hello World” code
• Demonstrating Conditional Statements
• Demonstrating Loops

• Tuple - properties, related operations, compared with list


• List - properties, related operations
• Dictionary - properties, related operations
• Set - properties, related operations
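
The four collection types listed in the hands-on can be previewed in a few lines. This is only a minimal sketch: the variable names and sample values are made up for illustration and are not taken from the course material.

```python
# Tuple: ordered and immutable. List: ordered and mutable.
point = (3, 4)
readings = [3, 4]
readings.append(5)            # a list can grow in place

tuple_is_immutable = False
try:
    point[0] = 10             # a tuple rejects item assignment
except TypeError:
    tuple_is_immutable = True

# Dictionary: maps unique keys to values.
row = {"id": 1, "city": "Pune"}
row["city"] = "Delhi"         # overwrite the value of an existing key

# Set: unordered collection of unique items.
tags = {"etl", "sql", "etl"}  # the duplicate "etl" is dropped
common = tags & {"sql", "spark"}  # intersection of two sets
```

The tuple/list contrast is the one the hands-on calls out: both keep order, but only the list accepts in-place changes.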

Module 2 – Functions, OOPS and Modules in Python

2.1: Functions
2.2: Function Parameters
2.3: Global Variables
2.4: Variable Scope and Returning Values
2.5: Lambda Functions
2.6: Object-Oriented Concepts
2.7: Standard Libraries
2.8: Modules Used in Python
2.9: The Import Statements
2.10: Module Search Path
2.11: Package Installation Ways

Hands On:
• Functions - Syntax, Arguments, Keyword Arguments, Return Values
• Lambda - Features, Syntax, Options, Compared with the Functions
• Sorting - Sequences, Dictionaries, Limitations of Sorting
• Errors and Exceptions - Types of Issues, Remediation
• Packages and Module - Modules, Import Options, sys Path
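
As a rough illustration of the hands-on topics above (functions with keyword arguments, lambdas, sorting, and exception handling), consider the sketch below. The function name `average` and the `sales` data are hypothetical examples, not part of the curriculum.

```python
def average(values, default=0.0):
    """Return the mean of values, or default when values is empty."""
    try:
        return sum(values) / len(values)
    except ZeroDivisionError:
        # An empty list has length 0, so the division fails.
        return default

# A lambda is an anonymous, single-expression function.
square = lambda x: x * x

sales = {"north": 120, "south": 95, "east": 140}
# Sort dictionary items by value, highest first.
ranked = sorted(sales.items(), key=lambda item: item[1], reverse=True)

# A limitation of sorting: values of unrelated types cannot be compared.
try:
    sorted([3, "two", 1])
    mixed_sort_failed = False
except TypeError:
    mixed_sort_failed = True
```

Calling `average([], default=-1.0)` shows the keyword-argument form: the fallback is named at the call site instead of being passed by position.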


Module 3 – Introduction to Cloud Computing and Microsoft Azure

3.1: Introduction to cloud computing


3.2: Types of Cloud Models
3.3: Types of Cloud Service Models
3.4: IAAS
3.5: SAAS
3.6: PAAS
3.7: Creation of Microsoft Azure Account
3.8: Microsoft Azure Portal Overview

Module 4 – Azure Data Factory

4.1: Introduction to Azure Data Factory (ADF)


4.2: Creating and Managing Pipelines
4.3: Configuring Linked Services
4.4: Configure integration between ADF and external services
4.5: Setting up Integration Runtime (IR)
4.6: Building and Deploying Mapping Data Flows

Hands On:
• Create an Azure Data Factory instance and navigate the portal interface.
• Create a simple pipeline with a copy activity and trigger it manually.
• Set up a scheduled pipeline to run daily.


Module 5 – Ingest data using Microsoft Fabric

5.1: Understand Dataflows Gen 2 in Microsoft Fabric


5.2: Explore and Integrate Dataflows Gen2 in Microsoft Fabric
5.3: Integrate Pipelines in Microsoft Fabric

Hands On:
Build pipelines and Dataflow solutions in Microsoft Fabric.

Module 6 – Orchestrate processes and data movement with Microsoft Fabric

6.1: Understand pipelines for data engineering


6.2: Use pipeline templates
6.3: Run and monitor Pipelines

Hands On:
Run and monitor pipelines


Module 7 – Real-Time Intelligence in Microsoft Fabric

7.1: Introduction to real-time data analytics in Microsoft Fabric


7.2: Ingest, Transform, Store and query real-time data
7.3: Visualise real-time data in Microsoft Fabric
7.4: Introduction to Microsoft Fabric eventhouse
7.5: Work with KQL effectively
7.6: Explore materialized views and stored functions for Microsoft Fabric Certification

Hands On:
Work with Real-Time Intelligence through a use case.
Work with data in a Microsoft Fabric eventhouse.

Module 8 – Introduction to Microsoft Fabric Analytics

8.1: Start learning Fabric from Basics


8.2: Understand different real-world approaches to working with Fabric for
data engineering
8.3: Explore the analytics capabilities of Microsoft Fabric.
8.4: Identify roles and steps to enable and utilize Fabric effectively.


Module 9 – Lakehouse and Data Management in Microsoft Fabric

9.1: Understand Real World lakehouse architecture for Data Engineering Roles
9.2: Use Microsoft Fabric for data ingestion, transformation, and analysis
9.3: Manage and utilize lakehouses for Microsoft Fabric Data Engineer
Certification

Hands On:
Set up and manage a lakehouse in Microsoft Fabric to understand
Data Engineering use cases.

Module 10 - Implement Apache Spark within Microsoft Fabric

10.1: Integrate Apache Spark with Microsoft Fabric.


10.2: Work on notebooks to ingest, transform, and load data into a lakehouse
with Spark.
10.3: Use PySpark for data analysis, transformation
10.4: Analyse real world data with Spark SQL, and structured streaming.

Hands On:
Conduct data analysis with Apache Spark.


Module 11 – Manage Delta Lake Tables in Microsoft Fabric

11.1: Comprehend Delta Lake and delta tables within Fabric.


11.2: Create and handle delta tables using Spark.
11.3: Enhance the performance of delta tables.
11.4: Work on delta tables with Spark’s structured streaming.

Hands On:
Operate with delta tables in Apache Spark through a use case.

Module 12 – Master Fabric lakehouse with medallion architecture design

12.1: Understand medallion architecture


12.2: Work on medallion architecture for Microsoft Fabric data engineer
certification
12.3: Query and report on data in the Fabric lakehouse

Hands On:
Implement end to end medallion architecture in Microsoft
Fabric
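
To make the medallion idea concrete, here is a minimal plain-Python sketch of the three layers: bronze (raw), silver (cleaned), and gold (aggregated). It is not Fabric or Spark code. The sample records and the helper names `to_silver` and `to_gold` are assumptions made for illustration; in the course the same flow would presumably run on lakehouse tables.

```python
from collections import defaultdict

# Bronze: raw records kept exactly as received (hypothetical sample data).
bronze = [
    {"order_id": "1", "region": "north", "amount": "100.50"},
    {"order_id": "2", "region": "NORTH", "amount": "49.50"},
    {"order_id": "2", "region": "NORTH", "amount": "49.50"},  # duplicate row
    {"order_id": "3", "region": "south", "amount": "n/a"},    # invalid amount
]

def to_silver(rows):
    """Silver: cast types, normalise values, drop bad and duplicate rows."""
    seen, clean = set(), []
    for r in rows:
        try:
            amount = float(r["amount"])
        except ValueError:
            continue                      # skip rows that fail validation
        if r["order_id"] in seen:
            continue                      # skip duplicates by order_id
        seen.add(r["order_id"])
        clean.append({
            "order_id": int(r["order_id"]),
            "region": r["region"].lower(),
            "amount": amount,
        })
    return clean

def to_gold(rows):
    """Gold: business-level aggregate, here total amount per region."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["region"]] += r["amount"]
    return dict(totals)

silver = to_silver(bronze)
gold = to_gold(silver)
```

Each layer reads only from the one before it, so the raw bronze data stays available for reprocessing if the cleaning rules change.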


Module 13 – Explore Data Warehouses in Microsoft Fabric

13.1: Define data warehouses within Fabric.


13.2: Differentiate between a data warehouse and a data lakehouse.
13.3: Work on data warehouses in Microsoft Fabric.
13.4: Create and manage fact tables and dimensions in a data warehouse.

Hands On:
Analyze data within a data warehouse.
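
The fact and dimension tables from 13.4 can be tried out locally before touching a real warehouse. The sketch below uses Python's built-in sqlite3 module as a stand-in for a Fabric data warehouse; the table names `dim_product` and `fact_sales` and all rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for a warehouse (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT,
    category    TEXT
);
CREATE TABLE fact_sales (
    sale_id     INTEGER PRIMARY KEY,
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity    INTEGER,
    amount      REAL
);
INSERT INTO dim_product VALUES (1, 'Laptop', 'Electronics'),
                               (2, 'Desk',   'Furniture');
INSERT INTO fact_sales VALUES (1, 1, 2, 2000.0),
                              (2, 1, 1, 1000.0),
                              (3, 2, 4,  800.0);
""")

# Star-schema query: join the fact table to a dimension, then aggregate.
sales_by_category = conn.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_key = f.product_key
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
```

The fact table holds the measures (`quantity`, `amount`) and a key per dimension, while descriptive attributes such as `category` live in the dimension table.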

Module 14 - Loading Data into a Fabric Data Warehouse

14.1: Explore strategies for loading data into a Fabric data warehouse.
14.2: Construct a data pipeline to populate a warehouse in Fabric.
14.3: Load data into a warehouse using T-SQL.
14.4: Load and transform data with Dataflows Gen 2

Hands On:
Populate data into a Fabric data warehouse.


Module 15 - Manage a Microsoft Fabric Data Warehouse

15.1: Track capacity unit usage with the Fabric Capacity Metrics app.
15.2: Monitor current activities in the data warehouse using dynamic
management views.
15.3: Observe querying trends with query insights views.

Hands On:
Supervise a data warehouse in Fabric.

Module 16 - Protect a Microsoft Fabric Data Warehouse

16.1: Learn the concepts of securing a data warehouse in Fabric.


16.2: Implement dynamic data masking, row-level security, and column-level
security.
16.3: Configure detailed permissions using T-SQL

Hands On:
Secure a warehouse in Fabric


Module 17 - Establish CI/CD Processes in Microsoft Fabric

17.1: Grasp the basics of CI/CD and their use in Microsoft Fabric.
17.2: Configure version control with Git repositories.
17.3: Leverage deployment pipelines to streamline the deployment workflow.
17.4: Automate CI/CD tasks using Fabric APIs.

Hands On:
Create and manage deployment pipelines in Microsoft Fabric.

Module 18 – Oversee Microsoft Fabric Operations

18.1: Apply monitoring techniques to manage activities in Microsoft Fabric.


18.2: Track performance and operations with the Monitoring Hub.
18.3: Trigger actions using the Activator feature.

Hands On:
Monitor Fabric activities through the Monitoring Hub.


Module 19 - Manage Data Access Security in Microsoft Fabric
19.1: Understand Microsoft Fabric’s security model for data engineering.
19.2: Configure permissions for workspaces and items.
19.3: Enforce granular controls to protect data.

Hands On:
Set up data access security in Microsoft Fabric.

Module 20 - Administer Microsoft Fabric

20.1: Outline administrative duties in Microsoft Fabric.


20.2: Use the Admin Center to manage settings.
20.3: Control and manage user access permissions.


Module 21 - PySpark for Data Engineering

21.1: Spark Session


21.2: Basics of RDD
21.3: DataFrames and their creation
21.4: Data sources (using CSV and Parquet) and DataFrame reader
21.5: Data targets and DataFrame writer
21.6: Spark SQL in PySpark
21.7: Spark UI

Hands On:
Create a SparkSession in PySpark.
Load a CSV file into a DataFrame using spark.read.csv() with schema
inference enabled.
Create a DataFrame from a CSV file using spark.read.csv().
Write data to an Azure Blob Storage container.

Module 22 - Databricks

22.1: Introduction to Databricks
22.2: Azure Databricks Architecture Overview
22.3: Create resources with Azure Databricks workspace
22.4: Introduction to Databricks clusters
22.5: Databricks cluster pools


Module 23 - Delta Lake on Databricks

23.1: Understand Delta Lake architecture
23.2: Work on Delta Lake tables on Databricks
23.3: Read and write data in Azure Databricks
23.4: Ingestion, Transformation in Databricks
23.5: Work with DataFrames in Azure Databricks
23.6: Work with DataFrames advanced methods in Azure Databricks

Module 24 – Introduction to Snowflake

24.1: What is Snowflake?


24.2: Snowflake’s use cases in data engineering
24.3: Setting up Snowflake
24.4: Creating a Snowflake account
24.5: Setting up the Snowflake environment
24.6: User roles and permissions
24.7: Navigating the Snowflake Web UI


Module 25 – Data Types and Structures in Snowflake

25.1: Supported data types (BOOLEAN, INTEGER, STRING, etc.)


25.2: VARIANT data type for semi-structured data (JSON, XML, Parquet)
25.3: Tables (Permanent, Temporary, Transient)
25.4: Snowflake Architecture Deep Dive
25.5: Cloud Services Layer, Compute Layer, Storage Layer
25.6: Micro-partitioning and its benefits
25.7: How data is stored and accessed in Snowflake

Module 26 - Data Storage and Performance

26.1: Time Travel and Fail-safe


26.2: Zero Copy Cloning
26.3: Snowflake’s automatic scaling and partitioning
26.4: Loading Data into Snowflake (Data Engineering)
26.5: File formats supported by Snowflake (CSV, JSON, Parquet, Avro)
26.6: Using Snowflake’s COPY command


Module 27 - Data Transformation in Snowflake

27.1: Using Snowflake’s SQL capabilities for ETL


27.2: Creating and managing stages
27.3: Data Transformation using Streams and Tasks
27.4: What are Streams and Tasks?
27.5: Implementing real-time ETL pipelines using Snowflake
27.6: Automation and scheduling tasks in Snowflake
27.7: Snowflake’s Integration with Data Lake and Data Science Tools
27.8: Connecting Snowflake to BI tools like Tableau, Looker, Power BI

Module 28 - Snowflake Performance Optimization

28.1: Understanding virtual warehouses in Snowflake


28.2: Optimizing virtual warehouse size and performance
28.3: Auto-suspend and auto-resume configurations
28.4: Clustering Keys
28.5: Query profiling and performance tuning
28.6: Caching in Snowflake
28.7: Star schema vs Snowflake schema


Module 29 - Snowflake Security and Governance

29.1: Authentication and Authorization


29.2: Role-based access control (RBAC)
29.3: Data encryption at rest and in transit
29.4: Auditing and monitoring usage
29.5: Setting up data sharing and data masking
29.6: Access controls for sensitive data
29.7: Sharing data securely with other Snowflake accounts
29.8: Using Snowflake’s secure data sharing feature
29.9: Data sharing best practices

Module 30 - Airflow

30.1: Introduction of Airflow


30.2: Different Components of Airflow
30.3: Installing Airflow
30.4: Understanding Airflow Web UI
30.5: DAG Operators & Tasks in Airflow Job
30.6: Create & Schedule Airflow Jobs For Data Processing
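
The core idea behind an Airflow DAG, running tasks in an order that respects their dependencies, can be sketched without installing Airflow. The example below uses Python's standard graphlib module rather than the Airflow API; the task names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (hypothetical ETL job).
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

def run_pipeline(deps, actions):
    """Run every task after all of its upstream tasks have finished."""
    order = []
    for task in TopologicalSorter(deps).static_order():
        actions[task]()          # execute the callable for this task
        order.append(task)
    return order

log = []
# One no-argument callable per task; each just records that it ran.
actions = {name: (lambda n=name: log.append(n)) for name in dependencies}
executed = run_pipeline(dependencies, actions)
```

An orchestrator such as Airflow adds scheduling, retries, and monitoring on top of this ordering step.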


Module 31 - Kafka

31.1: Need for Kafka


31.2: What is Kafka
31.3: Core Concepts of Kafka
31.4: Kafka Architecture
31.5: Where is Kafka Used
31.6: Understanding the Components of Kafka Cluster
31.7: Configuring Kafka Cluster

Hands On:
Configuring Single Node Single Broker Cluster
Configuring Single Node Multi-Broker Cluster
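
Before configuring a broker, it can help to see the core concepts (topic, partition, offset) in a toy model. The class below is an in-memory sketch, not the Kafka client API. The `Topic` class is hypothetical, and `zlib.crc32` is only a stand-in for the hash that Kafka's own partitioner uses.

```python
import zlib

class Topic:
    """Toy in-memory topic: a list of append-only partition logs."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key, value):
        # The same key always maps to the same partition,
        # which preserves ordering for that key.
        p = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[p].append((key, value))
        offset = len(self.partitions[p]) - 1   # position within the log
        return p, offset

    def consume(self, partition, offset):
        """Read records from offset onward without removing them."""
        return self.partitions[partition][offset:]

topic = Topic(num_partitions=3)
p1, o1 = topic.produce("user-1", "login")
p2, o2 = topic.produce("user-1", "click")
```

Unlike a classic queue, reading does not delete records: a consumer tracks its own offset and can re-read from an earlier position.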

Module 32 - CV, Interview Prep & LinkedIn Profile Update

32.1: CV Preparation
32.2: Interview Preparation
32.3: LinkedIn Profile Update
32.4: Expert Tips & Tricks


Projects
Project 1

Data Lake Integration and Optimization with PySpark

Ingest data into a data lake and apply PySpark for data
integration, transformation, and optimization. Create a
system that maintains a structured data repository
within a data lake to support analytics.

Project 2
Retail Sales Data Warehouse in Snowflake

Create a robust data warehousing solution in Snowflake for a
retail company. Ingest and transform sales data from various
sources, enabling advanced analytics for inventory management
and sales forecasting.

Project 3

ETL Orchestration and Automation with Apache Airflow

Build a comprehensive ETL (Extract, Transform, Load)
pipeline that automates the extraction, transformation, and
loading of data into a data warehouse. Implement
scheduling, error handling, and monitoring for a robust ETL
process.

Project 4
Data Exploration and Transformation in Azure Databricks

Use standard DataFrame methods to explore and
transform data. Key points: create a lab environment and an Azure
Databricks cluster.

Project 5
Customer Sentiment Analysis Dashboard with Microsoft Fabric

Develop a comprehensive dashboard that aggregates and
analyzes customer feedback from various channels to provide
real-time insights into customer sentiment, enabling businesses to
make informed decisions to enhance customer satisfaction.

Project 6
Data Integration and Storage with OneLake

Integrate data from various sources (CSV, JSON, SQL) into
OneLake, a unified data lake. You'll clean and transform the data
before storing it, making it ready for analysis. This project will
help you utilize OneLake for centralized, scalable data
management within Microsoft Fabric.
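
The integration step in Project 6 can be sketched with the standard library alone: read from CSV, JSON, and a SQL table, apply the same cleaning rule to every record, and collect the result in one place. This is only an illustrative outline, not OneLake code. The sample data, the `clean` helper, and the in-memory list standing in for the lake are all assumptions.

```python
import csv
import io
import json
import sqlite3

# Three hypothetical sources holding the same kind of record.
csv_text = "id,name\n1, Asha \n2,Ravi\n"
json_text = '[{"id": 3, "name": "Meera"}]'

source_db = sqlite3.connect(":memory:")
source_db.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
source_db.execute("INSERT INTO customers VALUES (4, 'john')")

def clean(record):
    """Apply one cleaning rule to records from any source."""
    return {
        "id": int(record["id"]),
        "name": str(record["name"]).strip().title(),
    }

# 'records' stands in for the unified store.
records = []
records += [clean(r) for r in csv.DictReader(io.StringIO(csv_text))]
records += [clean(r) for r in json.loads(json_text)]
records += [
    clean({"id": i, "name": n})
    for i, n in source_db.execute("SELECT id, name FROM customers")
]
```

Routing every source through the same `clean` function keeps the stored data consistent regardless of where a record came from.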

PLACEMENT SUPPORT
"The More You Learn, The More You Earn"

• Resume Building
• LinkedIn Profile Update
• Interview Preparation
• Sample Exam Papers


5,000+ Careers Advanced

Aman Yadav
Senior Associate Product Compliance at Amazon

Journey From Non-IT to IT

Abhishek Pareek
Technical Manager at Capgemini

Got 200% Salary Hike

Vishal Purohit
Product Manager at Icertis

Immense Job opportunities

Ranjith Kumar
Developer at Cisco

Got 2 job offers with 400% salary hike


Learn from Industry Experts
Nitesh Prajapat
Senior Cloud Architect at Capgemini
Microsoft, Adobe & AWS certified young and dynamic
professional with 14+ years of industry experience. Trained 8200+
professionals, and counting, on all cloud technologies.

Vaibhav
Cybersecurity Manager at Cisco
Always learning and always yearning; has delivered more than
200 mentoring/training sessions online on Cyber Security, CEH,
CCNA, Microsoft Azure, and Network Security to more than 20,000
students.

Neeraj
Python Senior Data Engineer at IBM
Skilled in Python, Microsoft Azure, ETL, Big Data, Herok technologies
and frameworks, with practical hands-on experience over the
last 12 years.

Venkatesh G
Engaging, understanding, and knowledgeable technical trainer with
over 18 years of experience in a variety of software programs and
technologies including Data Science, Artificial Intelligence, Data
Engineering, Business Intelligence, AWS, Azure and GCP, DevOps,
AEM, iOS, UI/UX.

Rahul Sachdeva
Network Manager - TCS
An information security enthusiast and a security researcher for the
past 10+ years. He has responsible and diversified exposure across
project & program management and System, Network & Internet
Security.

Manohar G
Senior Analyst at Accenture
15+ years of experience in Automation, DevOps, and Software Testing
with testing tools (Industry & Training). Experienced in
Manual Testing, Automation & Performance Testing.

Sidharth
Hewlett Packard DevOps Engineer
As a Data Engineer, I have 15+ years of experience working in different cloud
environments on various projects. I currently lead multi-workstream
projects as a Cloud Analyst and DevOps Engineer. I work with DevOps tools:
GitHub, Jenkins, Ansible, Docker, Kubernetes.

Manish Jaiswal
Technical Lead at Pfizer
Takes real pride and joy in his work; a cloud and web development
professional with 12+ years of experience in cloud cost
optimization, cloud solution architecture, and disaster recovery.

Pratik K
Group Head Analyst at JPMorgan
Expertise in Tableau and SQL over the last 12 years. As a Tableau
Developer, proficient in designing and developing data visualizations
using Tableau and working with Tableau Desktop and Tableau Server.

What Our Learners Have To Say

Manish Saini
Vice President, Fit Web
We wanted to train our team in data lake and Snowflake and shared the
requirement with Prepzee. We are thankful they provided the
curriculum we were looking for, and we went ahead with the training for a
team of 20 people, including me. The training was amazing.

Nishant Jain
Senior DevOps Engineer at Encora
I enrolled for the DevOps Azure certification. The course gave me the right
number of pointers and resources to handle my ongoing projects in
DevOps and to manage the DevOps and Azure team.

Amit Sharma
Senior Software Engineer at Visa
My overall experience was great, and Prepzee has helped me
switch my career from Java developer to cloud by providing training
from industry experts. I enrolled for the Azure Certification training, which
covered all pointers of Azure.
prepzee.com

For Further Details, Contact Us

hi@prepzee.com

+1 415-481-4467

+91-7793068396

Chat with Us

We are available 24*7
