STUTI GUPTA - Hadoop Resume
TECHNICAL COMPETENCIES
Groovy, Google Cloud Platform (GCP), Java, Apache Storm, Apache Pig, Hive, R, Bash Scripting
Databases / Warehouses: Teradata, Netezza, Oracle, Hive, MySQL, Google BigQuery
Frameworks: Hadoop, Map Reduce, Sqoop, HiveRunner, Spock
Tools & IDEs: Git, Eclipse, IntelliJ, Jira

WORK EXPERIENCE
Organization - Datametica Solutions Pvt. Ltd., Pune
Current Designation - Senior Member of Technical Staff
Duration - June 2016 - Present

Raven Product
Description: A tool that enables enterprises to run their existing applications, written for a specific database, against a modern database without rewriting or reconfiguring them. It converts SQL from one dialect to another, producing not just an equivalent query but an optimized one.
Role: Developer, Tester
Responsibilities:
- Design and develop optimized code with attention to design patterns and design principles.
- Lead a team of 7 contributors and design the roadmap for the product.
- Use the Apache Calcite API to build expressions in relational algebra (see the sketch after this list).
- Contribute to open-source Apache Calcite.
- Write Spark and BigQuery UDFs to provide equivalent operations across dialects.
- Direct proofs of concept in parallel with fast-paced development of the ongoing project.
- Review code for clean coding standards and compliance with the requirements.
- Develop test cases and run them with the Spock testing framework (see the Spock sketch below).
Technologies: Groovy, Hadoop, SQL of various dialects (Teradata, Spark, Hive, Greenplum, Google BigQuery, Netezza), Apache Calcite.
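A minimal sketch of the kind of relational algebra construction Calcite's RelBuilder supports. This is not Raven's actual code; the field names and values are illustrative, and inline VALUES are used so no external schema is needed:

```groovy
// Build a relational expression over inline VALUES and dump the plan.
// All names and data here are illustrative.
import org.apache.calcite.plan.RelOptUtil
import org.apache.calcite.rel.RelNode
import org.apache.calcite.sql.fun.SqlStdOperatorTable
import org.apache.calcite.tools.FrameworkConfig
import org.apache.calcite.tools.Frameworks
import org.apache.calcite.tools.RelBuilder

FrameworkConfig config = Frameworks.newConfigBuilder()
        .defaultSchema(Frameworks.createRootSchema(true))
        .build()
RelBuilder b = RelBuilder.create(config)

// Roughly: SELECT id FROM (VALUES ...) WHERE amount > 150
RelNode rel = b.values(['id', 'amount'] as String[], 1, 100, 2, 250)
        .filter(b.call(SqlStdOperatorTable.GREATER_THAN,
                b.field('amount'), b.literal(150)))
        .project(b.field('id'))
        .build()

println RelOptUtil.toString(rel)    // prints the relational plan tree
```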
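And a minimal Spock specification sketch in the style used for such test cases. `SqlTranslator` and its `translate` method are hypothetical stand-ins, not Raven's real API; the dialect pairs shown are common Teradata-to-BigQuery rewrites:

```groovy
// Minimal Spock sketch. 'SqlTranslator' and 'translate' are hypothetical
// stand-ins for the unit under test.
import spock.lang.Specification
import spock.lang.Unroll

class SqlTranslatorSpec extends Specification {

    def translator = new SqlTranslator()   // hypothetical class

    @Unroll
    def "translates #source to the target dialect"() {
        expect:
        translator.translate(source) == expected

        where:
        source                   | expected
        'SEL * FROM t'           | 'SELECT * FROM t'
        'SELECT TOP 5 * FROM t'  | 'SELECT * FROM t LIMIT 5'
    }
}
```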
Eagle Product
Description: Provides data migration pipelines consisting of tables/views in the correct order for migration. Analyzes the database for access patterns and usage of tables/views and suggests database optimizations. This makes migration of data from the source database to the target database feasible.
Role: Developer, Tester
Responsibilities:
- Develop a unified model for the product that extracts data from various sources such as Teradata, Netezza, Oracle, DB2, and Greenplum and transforms the data from each source into a common input form for the product.
- Extract data with Sqoop for Hadoop environments, JDBC for non-Hadoop environments, and the Teradata Connector for Teradata, with scripted automation (see the sketches after this list).
- Write optimized Hive scripts to transform data.
- Write unit tests using the HiveRunner framework.
Technologies: Sqoop, TDCH, HQL, Java, HiveRunner, Bash/Groovy Scripting
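A sketch of the extraction automation, rendered in Groovy here rather than bash; the connection string, paths, and table name are assumed, not Eagle's real configuration:

```groovy
// Drive a Sqoop import for one table from a Groovy script.
// All connection details and paths are illustrative.
def table = 'sales.orders'                                  // hypothetical table
def cmd = [
    'sqoop', 'import',
    '--connect', 'jdbc:teradata://td-host/DATABASE=sales',  // assumed JDBC URL
    '--username', System.getenv('DB_USER'),
    '--password-file', '/user/etl/.db_password',            // keeps password off the CLI
    '--table', table,
    '--target-dir', "/data/landing/${table}",
    '--num-mappers', '4'
]
def proc = cmd.execute()
proc.waitForProcessOutput(System.out, System.err)
if (proc.exitValue() != 0) {
    throw new RuntimeException("Sqoop import failed for ${table}")
}
```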
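And a minimal HiveRunner unit-test sketch (JUnit 4 style); the table and dedupe query are illustrative, not Eagle's actual transformations:

```groovy
// Minimal HiveRunner sketch: create a table, run a windowed dedupe,
// assert on the result. Table and query are illustrative.
import com.klarna.hiverunner.HiveShell
import com.klarna.hiverunner.StandaloneHiveRunner
import com.klarna.hiverunner.annotations.HiveSQL
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(StandaloneHiveRunner)
class DedupeTransformTest {

    @HiveSQL(files = [])
    HiveShell shell

    @Test
    void keepsOnlyLatestRowPerId() {
        shell.execute('''
            CREATE TABLE orders (id INT, updated STRING);
            INSERT INTO orders VALUES (1, '2020-01-01'), (1, '2020-02-01');
        ''')
        def rows = shell.executeQuery('''
            SELECT id, updated FROM (
              SELECT id, updated,
                     ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated DESC) rn
              FROM orders) t
            WHERE rn = 1
        ''')
        assert rows == ['1\t2020-02-01']
    }
}
```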
Cloud-Based Self Service Analytics and BI
Description: Complete ETL plus visualization using GCP products.
Role: Developer
Responsibilities:
- Designed the architecture for transferring data from on-prem traditional systems to Google Cloud Platform (GCP).
- Extracted data with Informatica and cleansed data with Trifacta, Paxata, and Cloud Dataprep.
- Transformed data with Spark SQL and Google BigQuery (see the sketch after this list).
- Visualized with Google Cloud Datalab and Google Data Studio.
- Researched and prototyped models on tools such as Compute Engine, Cloud Storage, BigQuery, Cloud Dataprep, and Cloud Dataproc, and on third-party tools such as Trifacta and Paxata.
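A minimal sketch of a BigQuery transformation run with the Google Cloud Java client from Groovy; the project, dataset, and column names are assumed, not the client's real schema:

```groovy
// Run an aggregation query in BigQuery and print the rows.
// Project, dataset, and columns are illustrative.
import com.google.cloud.bigquery.BigQuery
import com.google.cloud.bigquery.BigQueryOptions
import com.google.cloud.bigquery.QueryJobConfiguration

BigQuery bigquery = BigQueryOptions.defaultInstance.service

def sql = '''
    SELECT customer_id, SUM(amount) AS total_spend
    FROM `my-project.sales.transactions`      -- hypothetical table
    GROUP BY customer_id
'''

def result = bigquery.query(QueryJobConfiguration.newBuilder(sql).build())
result.iterateAll().each { row ->
    println "${row.get('customer_id').stringValue}\t${row.get('total_spend').doubleValue}"
}
```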
EDW to DDH
Description: EDW offload strategy tool that migrates data from Teradata to Hive, taking workflows and data dependencies into consideration.
Role: Developer
Responsibilities:
- Write shell scripts to schedule the jobs (see the sketch below).
- Write Hive scripts for data transformation.
Technologies: Hive, Shell Scripting, Azkaban Scheduler.
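A minimal sketch of such a job step, the kind an Azkaban-scheduled job would invoke; it is rendered in Groovy rather than shell for consistency with the other examples here, and the script path and variable are assumed:

```groovy
// Run a Hive transformation script with a date parameter and propagate
// its exit code to the scheduler. Path and hivevar are illustrative.
def proc = ['hive', '-f', '/jobs/edw_offload/transform_orders.hql',   // assumed path
            '--hivevar', "run_date=${new Date().format('yyyy-MM-dd')}"].execute()
proc.waitForProcessOutput(System.out, System.err)
System.exit(proc.exitValue())
```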
NATIVE-APP
Description: Analyze the mobile app data of a US retail client and provide valuable insights for business decision making.
Role: Developer and Data Analyst
Responsibilities:
- Write Hive and Pig scripts for cleansing and transforming the data.
- Schedule Hadoop jobs.
- Write a script for loading and running the scripts.
- Prepare KPIs as per the business requirements.
- Write test cases for Hive and Pig scripts using the Wutz framework.
Technologies: Apache Pig, Hive, Bash Scripting, Wutz, Azkaban.

AWARDS AND RECOGNITIONS
- Awarded a Spot Award for ownership of my work and for commendably clean code.
- Appreciated for my hard work and dedication by the co-founder of Datametica.
- Silver medalist of my batch in B.Tech (CS).
- Won various team-building games: Nirman, Contraption.
- Won coding events organized by the college, such as Web-Dev, Geek-O-Website, and Code Jam.

EXTRA-CURRICULAR ACTIVITIES
- Organized the Quiz Competition in college twice.
- Participated every year in the Marathon organized by the college and district.
- Chief student coordinator for the zonal rounds at Invertis University conducted by NARC India, India's Biggest Android Championship.
ACADEMIC PROJECTS
Keystroke Dynamics Based Continuous Authentication System
A research-based project on authentication using keystroke dynamics.
Wrote a paper on keystroke dynamics authentication, comparing it with other authentication methods.
Developed a system that captures an individual's keystroke pattern and permits login to an examination portal based on the recorded pattern (see the sketch below).
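A minimal sketch of the kind of raw signal such a system builds its timing profile from: per-key dwell times (press to release) captured with a Swing KeyListener. This is illustrative only; the project's actual feature extraction and matching are not shown:

```groovy
// Capture per-key dwell times from a text field. Illustrative only.
import javax.swing.JFrame
import javax.swing.JTextField
import java.awt.event.KeyAdapter
import java.awt.event.KeyEvent

def pressTimes = [:]     // key code -> press timestamp (ns)
def dwellTimes = []      // collected dwell durations (ms)

def field = new JTextField(20)
field.addKeyListener(new KeyAdapter() {
    void keyPressed(KeyEvent e)  { pressTimes[e.keyCode] = System.nanoTime() }
    void keyReleased(KeyEvent e) {
        def t0 = pressTimes.remove(e.keyCode)
        if (t0 != null) dwellTimes << (System.nanoTime() - t0) / 1_000_000
    }
})

def frame = new JFrame('Keystroke capture')
frame.contentPane.add(field)
frame.pack()
frame.visible = true
```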