
Ankitkumar Pranaykumar Sinha

Mobile: +91-8626-064676
Email-ID: sinhaankit.mca2014@gmail.com

Career Objective
To be involved in work where I can utilize my skills, gain self-satisfaction, and contribute
effectively to the growth of the organization.

Technical Skills
Backend: SQL, Teradata, PySpark, Spark, Sqoop, Pig, Python, Scala
Scripts: Shell scripts, JavaScript, Python
Web Technology: HTML, CSS, Bootstrap
Tools: Control-M, Airflow, Jira, Git, Talend
Google Cloud Platform: BigQuery, Google Cloud Storage, Dataproc, Dataflow, Airflow (Cloud Composer)

An Overview
• 4 years 10 months of professional experience on data engineering projects, working
with Big Data technologies such as Hadoop, Spark, Scala, Hive, Sqoop, Pig, and Python,
and with Google Cloud Platform (GCP) services such as BigQuery and Dataproc, along
with Talend tools.
• Experience with data transformation, ingestion, writing ETL logic, and warehousing
solutions across platforms including Teradata and Google Cloud Platform.
• Experience migrating a project from an on-premises Hadoop server to the cloud, with
scripts scheduled through Airflow on Cloud Composer.
• Experience loading data into Hive tables using Spark.
• Experience building end-to-end pipelines (file-based loads, Kafka loads).

• Experience scheduling DAGs on Airflow using Cloud Composer (see the sketch below).
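
A minimal sketch of what such a Composer-scheduled DAG can look like; the DAG id, schedule, and script paths are hypothetical, not taken from the projects described here:

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "data-engineering",  # hypothetical owner
    "retries": 2,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="file_ingestion_pipeline",  # hypothetical DAG id
    default_args=default_args,
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Each task shells out to an ingestion step; the script paths are illustrative
    # (Composer mounts the environment's GCS bucket under /home/airflow/gcs).
    extract = BashOperator(
        task_id="extract",
        bash_command="python /home/airflow/gcs/data/extract.py",
    )
    load = BashOperator(
        task_id="load_to_bq",
        bash_command="python /home/airflow/gcs/data/load_to_bq.py",
    )

    extract >> load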

Organizational Scan
• April 2021 to present: GSPANN Technologies as Senior Software Engineer
• March 2020 to April 2021: Deloitte as Consultant
• April 2018 to March 2020: Sears Holdings India as Associate Engineer
• August 2017 to April 2018: Sears Holdings India as Intern
• January 2017 to August 2017: Vision Media Entertainment as Intern
Project Overview

• Title: Senior Data Engineering (GSPANN), working with a pharma client
• Title: EAP Operate (Deloitte), working for a finance client
• Title: Data Engineering (Sears Holdings India), working for a retail client
• Title: Assessment Portal (Vision Media Entertainment)
Other Projects
• Assignment Submission Portal
• Zing Bikes
• Passport Facilitation System (Pune Police)
Significant Highlights

• At GSPANN
I am working for a pharma client. I was involved in POCs in my early days and created an
end-to-end generic pipeline for file-based ingestion and API ingestion. Later I worked on cost
optimisation for the project: we migrated 18 different Composer instances to a single common
Composer instance to reduce the cost of the project, moving almost 150 DAGs to the new
environment. A POC is in progress to set up a CI/CD pipeline for DAG deployment.
I also created an end-to-end pipeline for an ELN system. Our source is Oracle and the
destination data warehouse is BigQuery. We use Talend as the ETL tool for transforming the
data; all the jobs are created in Talend Studio, and we used the Windows scheduler to trigger them.
I created a pipeline to connect to the SAP and Salesforce data sources, pulling raw data
directly from the source and loading it into the BigQuery staging layer. All the business ETL is
written in BigQuery SQL, and we use Airflow to schedule the DAGs.
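
A minimal sketch of such a BigQuery SQL transformation step, shown here submitted directly with the BigQuery Python client rather than through an Airflow operator; the dataset, table, and column names are hypothetical:

from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

# Hypothetical ETL step: turn staged Salesforce rows into a curated table.
sql = """
CREATE OR REPLACE TABLE curated.sales_orders AS
SELECT order_id, account_id, DATE(created_ts) AS order_date, amount
FROM staging.salesforce_orders
WHERE amount IS NOT NULL
"""

job = client.query(sql)  # submit the query job
job.result()             # block until the transformation finishes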
Recently, in the current project, we created connections to PostgreSQL, SharePoint, and
Microsoft Project Online to integrate these sources and pull all the data into a single data
warehouse. On top of that data we are developing various models to get better insights into sales.
I also worked on a project that uses web scraping: we target 10 different websites, scrape
data from them on a weekly basis, and load it into BigQuery. All data cleaning is handled in the
BigQuery ETL. We used Python for the scraping and created a DAG to load the scraped data
into BigQuery on a regular schedule (see the sketch below).
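
A minimal sketch of one scrape-and-load step, assuming requests and BeautifulSoup for the scraping and a streaming insert into BigQuery; the URL, HTML selectors, and table id are hypothetical:

import requests
from bs4 import BeautifulSoup
from google.cloud import bigquery

# Hypothetical target site and destination table; the real project scraped 10 sites weekly.
URL = "https://example.com/products"
TABLE_ID = "my-project.raw.scraped_products"

def scrape(url):
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    # Assumes each product sits in a <div class="product"> with name/price children.
    return [
        {
            "name": item.select_one(".name").get_text(strip=True),
            "price": item.select_one(".price").get_text(strip=True),
        }
        for item in soup.select("div.product")
    ]

def load_to_bigquery(rows):
    client = bigquery.Client()
    errors = client.insert_rows_json(TABLE_ID, rows)  # streaming insert
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")

if __name__ == "__main__":
    load_to_bigquery(scrape(URL))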

• At Deloitte (Offices of the US)

I worked on the EAP Operate project, where we managed client data on a Hadoop system. We
looked mainly after data loads and data cleaning: we processed raw data from the source and
made it available in the consumption layer for reporting. The data was maintained mainly in
Hive and, in some rare cases, in Vertica too.
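
A minimal sketch of a raw-to-consumption Hive load of this kind in PySpark; the input path, columns, and table names are hypothetical:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("raw_to_consumption")
    .enableHiveSupport()  # required to read and write Hive tables
    .getOrCreate()
)

# Read raw files, apply light cleaning, and publish to the consumption layer.
raw = spark.read.option("header", True).csv("/data/raw/transactions/")
clean = raw.dropDuplicates().na.drop(subset=["txn_id"])
clean.write.mode("overwrite").saveAsTable("consumption.transactions")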

• At Sears Holdings India

At Sears I worked on a data engineering project. Sears ran its analytics on Teradata but was
moving towards Google Cloud Platform. The primary purpose of the project was to make the
data held on Teradata available on Google Cloud Platform. The project included migrating
existing tables from Teradata to GCP and setting up processes that emulate the existing
Teradata ETL for the BigQuery tables on GCP. It involved engineering new methods to load
data into BigQuery, i.e. working around the limitations of BigQuery (a load sketch follows below).
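
A minimal sketch of one such load path, batch-loading exported Teradata extracts from Cloud Storage into BigQuery; the bucket, file pattern, and table names are hypothetical:

from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,  # infer the schema from the extract files
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

load_job = client.load_table_from_uri(
    "gs://td-migration-extracts/sales/part-*.csv",  # hypothetical extract location
    "my-project.analytics.sales",                   # hypothetical destination table
    job_config=job_config,
)
load_job.result()  # wait for the load job to complete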

Academic Credentials
2017: Master of Computer Applications, Vishwakarma Institute of Technology, Pune
(secured 73.27%)
Personal Dossier
Date of birth: 23rd February, 1994
Residential address: R-1004, Jade Residency, Wagholi, Pune
