MJB Resume
divya@soleratek.com
404-474-3792
linkedin.com/in/jawahar-macha/
Professional Summary:
Over 10 years of combined experience as a Site Reliability Engineer & DevOps Engineer with expertise in
developing, building, deploying, automating, and releasing code across diverse environments.
Collaboration expert with extensive proficiency in using Git, GitLab, and GitHub, skilled in managing version
control and fostering team collaboration in software development.
Expertise in designing and implementing CI/CD pipelines using tools like GitHub Actions, Jenkins,
GitLab CI, and Azure DevOps, ensuring seamless integration, testing, and deployment processes
across multiple environments.
Proficient in leveraging GitHub Actions for seamless integration with GitHub repositories, enabling
workflow automation directly from source control.
Proficient in GitOps methodologies to achieve declarative infrastructure management and automated
deployments with tools like ArgoCD and Flux, ensuring consistency and auditability across
environments.
Managed pipelines efficiently across repositories and integrated them with third-party tools
like SonarQube for enhanced security and compliance.
Skilled in container orchestration and management using Kubernetes (K8s) and tools like Helm
and Kustomize to deploy and scale microservices efficiently. Managed workloads across EKS, AKS,
and GKE, optimizing cluster performance and cost.
Proficient in a wide range of Azure services including but not limited to Azure Virtual Machines,
Azure DevOps, Azure Active Directory, and Azure Resource Manager Templates, enabling the
design and implementation of scalable and secure cloud infrastructures.
Hands-on experience migrating existing on-prem infrastructure to AWS Cloud using core
services like VPC, Route 53, EC2, S3, IAM, ELB, and CF. Good understanding of AWS Cloud
architecture.
Managed large-scale Kubernetes cluster infrastructure on public and private clouds, including EKS and AKS.
Adept in implementing and managing Azure Kubernetes Service (AKS) for container orchestration, enhancing the
deployment and scalability of microservices.
Hands-on experience with CloudFormation and Terraform, well versed with concepts of Infrastructure as Code.
Demonstrates exceptional skill in using Terraform for infrastructure as code (IaC) to automate and manage
Azure cloud environments, ensuring efficient and error-free deployments.
Skilled in creating and maintaining scalable and secure infrastructure, utilizing both Terraform
and Pulumi to support multi-cloud strategies effectively.
Expertise in Docker for containerizing applications, building, and managing container images, as
well as orchestrating multi-container environments using Docker Compose and Docker Swarm.
Adept at employing a systematic approach to tackle complex system issues, reducing manual work
(toil) through automation, and improving system performance and uptime.
Proficient in designing self-healing Kubernetes clusters, implementing Horizontal Pod
Autoscalers (HPA), Vertical Pod Autoscalers (VPA), and Custom Resource Definitions
(CRDs) to ensure resilience and scalability.
Integrated database monitoring and observability into centralized logging solutions like the ELK
Stack and Splunk, providing comprehensive insights into query performance, slow logs, and
transaction bottlenecks.
Extensive experience managing relational databases (RDBMS) such as MySQL, PostgreSQL,
Microsoft SQL Server including tasks like schema design, query optimization, and performance
tuning to ensure high availability and efficient data access.
Passionate about leveraging Backstage, Azure services, and SRE principles to build robust,
automated, and developer-friendly environments that enhance productivity and system reliability.
Proficient in setting service level objectives (SLOs) and indicators (SLIs), ensuring that
reliability goals align with user expectations and business requirements.
Expertise in monitoring and analytics, leveraging advanced tools like ELK, Grafana,
Prometheus, New Relic and Datadog. Adept in integrating Datadog for its robust cloud-scale
monitoring capabilities, providing a unified view of infrastructure, applications, and services.
Adept at leveraging PagerDuty's incident response platform for real-time incident tracking,
escalations, and automated scheduling.
Led initiatives to reduce Mean Time to Detect (MTTD) and Mean Time to Recover (MTTR)
by automating monitoring, alerting, and remediation workflows using tools like Prometheus,
Grafana, Datadog, and PagerDuty.
Proficient in configuring and managing OpsGenie for advanced alerting and on-call management,
effectively routing critical alerts to the right team members and ensuring timely response.
Automated infrastructure provisioning and configuration using Terraform, CloudFormation, and
Ansible, integrated with CI/CD pipelines for end-to-end automation.
Strong advocate of DevSecOps, embedding security checks into CI/CD pipelines and enforcing
policies for container and Kubernetes security using tools like OPA/Gatekeeper and Kube-bench.
Passionate about leveraging AI and observability best practices to create resilient, scalable, and
efficient systems, combining cutting-edge technology with proven engineering principles to deliver
exceptional results.
Managed critical incidents and outages, leading cross-functional teams to resolve business-
impacting issues promptly while maintaining clear and effective communication with stakeholders,
fostering collaboration, and ensuring optimal resource utilization.
Facilitated hotwash sessions and blameless postmortems following critical incidents to analyze
root causes, document key learnings, and implement actionable improvements, enhancing system
reliability and reducing future downtime.
Utilized insights from blameless postmortems to refine Service Level Objectives (SLOs), improve
system reliability, and align operational priorities with business goals.
Built a reputation as a reliable and approachable team player, always ready to assist in
troubleshooting, share knowledge, and mentor team members to achieve personal and professional
growth.
Professional Experience:
Conservice – Parsippany, NJ June 2021 – Present
Site Reliability Engineer:
Analyzed high-level architectural diagrams of customers along with Enterprise-level Principal Architects to
recommend the best cost-optimized, performant solutions for desired use cases.
Proficient in managing version control with Git, and repository management using GitHub and GitLab, ensuring
robust code management and collaboration in a .NET environment.
Championed the adoption of GitOps principles, streamlining deployment workflows and infrastructure updates,
ensuring consistent and auditable changes across environments. Developed custom GitHub Actions workflows,
optimizing build and deployment cycles and reducing manual intervention by 50%.
Configured build pipelines and deployed SonarQube for code analysis. Experienced and proficient in
deploying and administering GitHub server instances.
Orchestrated CI/CD pipelines using Jenkins, GitHub Actions, and ArgoCD, enhancing the automation, efficiency
and reliability of software deployment processes.
As part of the Platform Engineering and SRE team, worked primarily on Kubernetes clusters deployed in Azure
and private on-prem cloud, handling 300+ clusters across all environments.
Administered Kubernetes clusters for scalable and resilient application deployment, ensuring high
availability and optimal resource utilization.
Implemented ArgoCD to manage Kubernetes deployments through a declarative GitOps approach,
ensuring high reliability and synchronization between desired application states and deployed environments.
Leveraged Docker for containerizing .NET applications, improving portability and consistency across
development, QA, Staging, and production environments.
Assisted in packaging applications in containers using Docker and preparing executable components.
Configured VMs via Terraform scripts and orchestrated them via Docker.
Worked on Azure Cloud infrastructure and its security, used Terraform to maintain the entire infrastructure,
and worked on aligning application insights and high-alert findings.
Proficient in managing and scaling applications on Microsoft Azure and on-prem cloud environments,
achieving an optimal balance between cloud and on-prem resources.
Implemented infrastructure as code (IaC) using Terraform and Pulumi, leading to a 50% reduction in
infrastructure provisioning and configuration time.
Contributed to the development and maintenance of a developer portal using Backstage, creating
a centralized hub for tooling, services, and documentation.
Implemented Apache Kafka and Zookeeper and configured Confluent components on
Ubuntu machines. Assisted in implementing Kafka security and the schema registry.
Evaluated and adopted new tools and technologies to improve the development workflow, in
collaboration with Operations. Managed infrastructure with configuration files, including databases
and firewall policies, using Terraform. Automated repetitive tasks and implemented Infrastructure as Code (IaC)
practices for infrastructure provisioning and configuration management.
Experience with cloud technologies such as Azure and other cloud-native applications, configuring and
managing cloud infrastructure, ensuring high availability and fault tolerance. Managed multiple Azure VMs and
deployed them on a component basis.
Executed functional test plans, validated test results, and prepared documentation and data for
analysis. Monitored system health and performance using Grafana, Prometheus, ELK Stack, and Datadog.
Optimizing system performance and resource utilization to deliver an exceptional user experience.
Ensuring compliance with industry security standards and implementing measures to enhance data
protection. Provide service-level communication between services via service mesh.
Establish SLAs, SLOs, and SLIs for service uptime, and build the necessary telemetry and alerting platform
to enforce them. Conducting post-mortems and driving continuous improvement initiatives.
Environment: Windows, Linux/Unix, ASP.Net, NextJS, Yaml, Git, GitHub, GitLab, SonarQube, Jenkins, Shell, Azure,
Docker, Kubernetes, Grafana, Prometheus, ELK, Datadog, Keycloak, Apache Kafka, GitHub Runners,
GitHub Actions, ArgoCD, Backstage, Vault, Terraform, Pulumi, Strimzi Kafka, Azure Service Bus.
Dept of California, Child Support Services (DCSS) – Rancho Cordova, CA July 2017 – Apr 2020
DevOps Engineer/Systems Software Specialist
As a part of Lift-and-Shift migration activity, worked on building & maintaining infrastructure for
multiple environments in AWS Cloud that include Non-Prod, Test, and Production.
Assisted in Configuration Management and Build support for more than 4 different applications, building and
deploying to production and to the SQT, SYT, SYA, and DEV environments.
Implemented Microsoft Azure and helped with the migration procedures from IBM Cal Cloud.
Defined and Implemented CM and Release Management Processes, Policies, and Procedures.
Administered Agile Central, including installation, configuration, and testing on both sandbox
and production servers.
Created Ops Hub Integration Manager (OIM) connections between the tracking tool and the Git version control
system, and utilized them for GitLab.
Built AWS Cloud infrastructure for a couple of mobile applications and made it ready for future use. Delivered
cloud solutions to improve cloud architecture for AWS public and private cloud.
Initialized zero downtime deployments and maintained CI tools like Build Forge and Jenkins.
Created the master and slave nodes and jobs for different environments, maintained the complete environment in
Jenkins, and provided full working documentation.
Wrote Ansible playbooks for one of the applications which made deployment easy for the EFS project.
Implemented the Maven scripts for the Kiosk and successfully implemented them on Tomcat servers.
Built the Maven scripts for adding the Deploy plugin and adding the Tomcat server.
Deployed the WAR file on the Tomcat Server for project AB-976.
Migrated all the projects from ClearCase to GitLab, created projects in GitLab, and made necessary changes to
the projects. Tested the Docker pipelines, monitoring, and processing work.
Acquired comprehensive requirements from Project Managers and Team Leads about the servers to be
migrated. Utilized ServiceNow for Data Centre service requests (Network requests for IPs, ILOs, DNS).
Contributed to the internal cloud set-up specific to Azure virtual machines and Azure Active Directory.
Integrated Azure Active Directory for user authentication/authorization for organization profiles.
Deployed Azure VMs using the Azure portal and the cross-platform CLI. Used Cloud Foundry for automated
deployment on Azure.
Expertise in upgrades, installs, configuration, and administration of security and monitoring tools on Linux.
Experience in Amazon Cloud (EC2, S3, Auto Scaling, IAM) hosting and AWS administration.
Good knowledge of managing and integrating code quality tools like SonarQube, including managing Sonar rules
and quality gates.
Monitored servers for CPU Utilization, Memory Utilization, and Disk Utilization for performance monitoring.
Environment: Windows, Linux/Unix, Ant, Maven, ClearCase, ClearQuest, GitLab, Jenkins, Shell, Cal-cloud, Agile Central,
AWS, Release Automation, CA ARD, Ansible, Azure, Docker, Kubernetes, BuildForge.
Delivery and application modernization blueprints and tools to help you develop more effectively on AWS
Worked on Amazon Web Services (EC2, ELB, VPC, S3, CloudFront, IAM, RDS, Route 53, CloudWatch,
SNS). Set up and managed Linux servers on Amazon (EC2, EBS, ELB, SSL, Security Groups, RDS, and IAM).
Set up and managed databases on Amazon RDS. Monitored servers through Amazon CloudWatch and SNS.
Created and managed DNS records on Amazon Route 53, and supported and managed the services within AWS
and global IT infrastructure.
Managed Nagios and CloudWatch to monitor critical system health, performance, security, and disk usage.
Set up continuous automated builds based on polling the Git source control system during the day and periodic
scheduled builds overnight to support development needs using Jenkins and Git.
Configured JIRA and the OAuth Java client and checked the required policies. Used MongoDB for queries
and regular-expression searches, and configured MongoDB load balancing and file storage.
Expertly implemented and configured Jenkins, achieving full Continuous Integration and Continuous
Deployment (CI/CD) to integration environments on commit, streamlining the development pipeline and
ensuring rapid, reliable deployments.
Maintained infrastructure automation: maintained cookbooks alongside the source code, applied them
to the Chef servers, and utilized testing tools such as Test Kitchen, ChefSpec, and Foodcritic.
Expertly troubleshot and optimized continuous integration and automated deployment processes using Jenkins,
Chef, Maven, Ant, and Docker, significantly improving build efficiency and deployment speed.
Created and maintained an extensive real-time weather website with elements of Geographic Information
Systems (GIS) design, using Microsoft Windows Azure Cloud Services and SQL Server Database (Azure).
Successfully implemented and deployed systems to Microsoft Azure, including converting existing
ASP.NET projects to Azure solutions, and managing SQL Server Databases (Azure), enhancing the scalability
and reliability of applications.
Led the migration of internal applications (Wiki, Nagios, Jira, etc.) to Docker containers and Kubernetes clusters
on Google Cloud, improving application portability, ease of upgrades, and repeatable installation processes.
Converted many of our internal applications, such as Wiki, Nagios, and Jira, to Docker containers to
allow portability, ease of upgrade, and repeatable installation processes.
Environment: AWS EC2, Jenkins, S3, RDS Instances, VPCs, ELBs, EBS volumes, F5 Load balancers, Chef, Boto3,
Elastic IPs, JIRA, Linux, GIT, MongoDB, Docker, OwnCloud, Glacier, Azure, VirtualBox, Cloud Watch, Cloud Front.
Education:
Master of Science in Computer & Systems Engineering - University of Houston Jan 2014 - Dec 2015