Data Analyst 3
● 13 years of IT experience in Data Engineering, Cloud, Big Data, Visualization, and Reporting, with analytical programming in Spark (Scala), PySpark, Kafka, AWS, SAS, SQL, Python, Azure Databricks, and Snowflake.
● Worked as Team Lead, Product Owner, Onsite Coordinator, Data Engineer, Sr. Developer, and Analyst on various Agile sprint projects.
● Certifications: AWS Certified Data Analytics – Specialty, Databricks Certified Spark Developer Associate, Snowflake SnowPro Core Certification, AWS Certified Developer – Associate, AWS Certified Solutions Architect – Associate.
● Experience in all phases of the Software Development Life Cycle (SDLC), under both Waterfall and Agile processes, across various workflows (requirement study, analysis, design, coding, testing, deployment, and maintenance) in Business Intelligence application development.
● Experience in Scala functional programming with Cats Effect IO using the Kestrel framework, Falcon configuration, Quality Guard, sidecars, and the Tallyho event signaling tool.
● Extensive experience developing ETL and BI applications using Spark (Scala), PySpark, AWS data analytics services (EMR, Glue, Lambda, EC2, Athena, S3), Azure Databricks, Snowflake, and SAS technologies.
● Experience with several Python libraries including pandas, Matplotlib, boto3, pytest, moto, and FastAPI (a pytest/moto sketch appears below).
● Hands-on experience with industry-standard IDEs such as IntelliJ, PyCharm, Jupyter Notebook, VS Code, Cloud9, and SAS EG.
● Strong experience in SAS: DATA step, PROC SQL, MERGE, and SAS DI transformations such as Table Loader, Loop, SCD, User Written Code, Extract, Lookup, Append, Rank, and Sort.
● Proficient in Spark libraries including Core, SQL, MLlib, Streaming, and the DataFrame API using Scala and Python.
● Experience with data formats including Apache Iceberg, Parquet, ORC, Avro, Delta, CSV, and JSON.
● Experience with build tools such as Gradle and Maven, GitHub repositories, the RIO CI/CD pipeline for deployment, and Docker images and artifacts for Kubernetes pods running Spark applications.
● Experience developing Spark applications using Scala, Python, and Spark SQL on Kubernetes and YARN clusters for data extraction, transformation, and aggregation across multiple file formats, analyzing and transforming data to uncover insights into customer usage patterns. Developed various UDFs, connected to external APIs, and parsed JSON responses within the job flow (a UDF sketch appears below); migrated applications and data from on-prem to cloud architecture.
● Experience building Apache Airflow DAG pipelines using a data-pipelines YAML file to schedule Spark Kubernetes jobs (a DAG sketch appears below).
● Scala functional programming style with the Kestrel framework, which includes Cats Effect libraries for pure, reliable, and concurrent functions.
● Experience with AWS CDK/CloudFormation, the AWS SAM CLI, and Terraform with GitHub and CodePipeline.
● Expertise in full life cycle application development, with good experience in unit testing, Test-Driven Development (TDD), and Behavior-Driven Development (BDD).
● Proficient in writing SQL queries, stored procedures, functions, packages, tables, views, and triggers in relational databases such as Teradata and PostgreSQL; used Cassandra with Kafka for a streaming projection project.
● Good experience in shell scripting, SQL Server, UNIX, and Linux, including performance tuning of Spark jobs.
● Experience with Agile methodologies, Scrum stories, and sprints in a Spark-based environment, along with data analytics, data wrangling, and data extracts.
● Experience connecting to various data sources such as AWS, Snowflake, Azure, HDFS, JDBC, and Postgres.
● Worked on enterprise data management systems such as HDFS, Kafka, FS2 streams, and MapReduce, with experience processing large data sets using the Spark library in Scala applications.
● Well versed in REST APIs using Python FastAPI, SQLAlchemy, and AWS API Gateway (a FastAPI sketch appears below); Snowflake Snowpipe, Snowpark, stored procedures, and UDFs; and Databricks PySpark applications with Delta Lake and Delta tables.
● Projects using AWS services such as Kinesis, Redshift, DynamoDB, MSK, EKS, SQS, SNS, API Gateway, CloudWatch, Step Functions, Secrets Manager, and DataSync via the AWS console, CDK, Python boto3, and the CLI (a boto3 sketch appears below).
● Highly motivated, dedicated quick learner with a proven ability to work individually and as part of a team.
● Good experience in the Apple billing and insurance domains, including Actuarial, Fraud Management, Auto & Property, Claims, and Policy Reporting teams.
● Experience in sidecar integration testing with Scala FunSuite for Spark jobs, the Quality Guard tool, and code coverage using SonarQube.
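Illustrative Code Sketches:
The sketches below are minimal, hypothetical illustrations of patterns referenced in the summary above, not production code from the projects listed. First, a pytest test that uses moto (the mock_aws decorator from moto 5; earlier versions used mock_s3) to mock S3 for a boto3 call; the bucket and key names are assumptions.

import boto3
from moto import mock_aws

@mock_aws
def test_put_and_get_object():
    # Create an in-memory mocked S3 bucket, write an object, and read it back.
    s3 = boto3.client("s3", region_name="us-east-1")
    s3.create_bucket(Bucket="demo-bucket")
    s3.put_object(Bucket="demo-bucket", Key="data.csv", Body=b"a,b\n1,2\n")
    body = s3.get_object(Bucket="demo-bucket", Key="data.csv")["Body"].read()
    assert body == b"a,b\n1,2\n"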
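A minimal PySpark sketch of a UDF that parses a JSON API response column, as in the Spark applications bullet; the column and field names are hypothetical.

import json
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("udf-json-sketch").getOrCreate()

@udf(returnType=StringType())
def extract_status(response_json):
    # Safely pull one field out of a raw JSON API response string.
    try:
        return json.loads(response_json).get("status", "unknown")
    except (TypeError, ValueError):
        return "invalid"

df = spark.createDataFrame(
    [('{"status": "ok"}',), ('{"status": "error"}',), (None,)],
    ["response"],
)
df.withColumn("status", extract_status("response")).show()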
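A minimal Airflow DAG sketch (assuming Airflow 2.4+) that schedules a Spark job on Kubernetes via spark-submit; the container image, master URL, and application path are placeholders, and the actual projects drove this through a data-pipelines YAML file rather than hand-written spark-submit commands.

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="spark_kube_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Submit a PySpark application to a Kubernetes cluster once per day.
    run_spark_job = BashOperator(
        task_id="spark_submit_k8s",
        bash_command=(
            "spark-submit "
            "--master k8s://https://kubernetes.default.svc "
            "--deploy-mode cluster "
            "--conf spark.kubernetes.container.image=registry.example.com/spark-app:latest "
            "local:///opt/spark/app/aggregate.py"
        ),
    )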
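A minimal FastAPI sketch of a REST endpoint backed by SQLAlchemy (1.4+ style), as in the REST API bullet; the table, columns, and SQLite URL are illustrative assumptions.

from fastapi import FastAPI, HTTPException
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

engine = create_engine("sqlite:///./demo.db")
Base = declarative_base()

class Policy(Base):
    # Hypothetical table standing in for a policy-reporting entity.
    __tablename__ = "policies"
    id = Column(Integer, primary_key=True)
    holder = Column(String, nullable=False)

Base.metadata.create_all(engine)
app = FastAPI()

@app.get("/policies/{policy_id}")
def get_policy(policy_id: int):
    # Fetch one policy row by primary key; return 404 if absent.
    with Session(engine) as session:
        policy = session.get(Policy, policy_id)
        if policy is None:
            raise HTTPException(status_code=404, detail="policy not found")
        return {"id": policy.id, "holder": policy.holder}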
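A minimal boto3 sketch touching two of the AWS services listed above, SQS and Secrets Manager; the queue URL and secret name are hypothetical placeholders.

import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
secrets = boto3.client("secretsmanager", region_name="us-east-1")

# Publish a message to an existing queue (placeholder URL).
sqs.send_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/demo-queue",
    MessageBody='{"event": "demo"}',
)

# Retrieve an application secret by name (placeholder SecretId).
secret = secrets.get_secret_value(SecretId="demo/app/credentials")
print(secret["SecretString"])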
Technical Skills:
Programming Languages: Python, Scala, Java, SAS
Big Data Environment: Spark, AWS EMR, EKS, Glue, Hive, HDFS
Query Languages: SQL, PL/SQL
Operating Systems: macOS, Linux, Windows
Build & Deployment Tools: Gradle, Maven, GitHub, Jenkins, Docker, RIO CI/CD, CDK, SAM
Scheduling Tools: Airflow, AWS Step Functions, AWS Glue Workflows
Databases: PostgreSQL, DynamoDB, Hive, Teradata
Cloud Computing: AWS, Azure Databricks, Snowflake
Methodologies: Agile Scrum, Waterfall
IDEs: IntelliJ, PyCharm, Jupyter Notebook, Cloud9, SAS EG, VS Code
Professional Experience:
Environment: Spark 3.2, Scala, Kafka, AWS, Snowflake, Teradata, Python, API, Kubernetes, Jenkins, Apache Airflow.
Environment: PySpark, Python, AWS Glue, Lambda, Azure Databricks, Snowflake, Terraform, Apache Airflow, PostgreSQL, Jupyter Notebook, Delta Lakehouse.
Environment: Python, API, Spark, Kafka, HDFS, Hive, SQL, SAS, Oracle, MS SQL Server.
Certifications: https://www.credly.com/users/dhivagar-mariappan.20f32700
Educational Qualification: B.E. in Computer Science and Engineering (2007 - 2011), MEPCO Schlenk Engineering College, Sivakasi, Tamil Nadu, India - 626005.