Job Description
Job Title: Senior DevOps & Site Reliability Engineer (SRE)
Experience: 5+ Years
Location: Remote
Job Summary
We are looking for an experienced and highly motivated Senior DevOps & Site Reliability Engineer (SRE) to design, automate, and optimize hybrid and multi-cloud infrastructure across AWS and Azure environments.
This role combines DevOps engineering best practices with SRE principles to ensure highly available, scalable, secure, and observable systems. The ideal candidate will have strong hands-on expertise in cloud infrastructure, CI/CD automation, Infrastructure as Code (IaC), and observability, while actively contributing to operational excellence and production reliability.
Key Responsibilities
Infrastructure & Automation
Design, deploy, and manage scalable, fault-tolerant infrastructure on AWS and Azure.
Implement and maintain Infrastructure as Code (IaC) using Terraform, AWS CloudFormation, or Azure ARM/Bicep.
Ensure infrastructure adheres to the best practices of the AWS Well-Architected Framework and the Azure Well-Architected Framework (Security, Reliability, Performance Efficiency, Cost Optimization, Operational Excellence).
Build, maintain, and optimize CI/CD pipelines using Jenkins, GitLab CI, AWS CodePipeline, and Azure DevOps.
Automate provisioning, configuration, and deployment to improve consistency and reduce manual effort.
Site Reliability Engineering & Observability
Implement and manage monitoring, logging, and alerting solutions using tools such as Prometheus, Grafana, Datadog, CloudWatch, Azure Monitor, and Splunk.
Participate in incident response, perform root cause analysis (RCA), and implement preventive measures to reduce MTTR.
Continuously monitor system and application performance, identify bottlenecks, and drive improvements.
Define, measure, and track SLIs, SLOs, and SLAs to meet reliability and availability targets.
Security & Collaboration
Integrate DevSecOps practices into CI/CD pipelines.
Manage access controls and security policies using AWS IAM, Azure AD (now Microsoft Entra ID), and RBAC.
Collaborate closely with development, QA, and product teams to support smooth releases and shared ownership of production systems.
Provide support for production issues and contribute to ongoing operational improvements.
Required Skills & Experience
5+ years of experience in DevOps, SRE, or Cloud Infrastructure Engineering roles.
Strong hands-on experience with both AWS and Azure cloud platforms.
Solid understanding of AWS services (EC2, S3, RDS, VPC, Lambda, etc.) and Azure services (VMs, Blob Storage, Azure SQL DB, VNet, App Services, AKS, etc.).
Hands-on expertise with Infrastructure as Code, preferably Terraform.
Strong scripting skills using Python, Bash, or Go.
Experience with CI/CD tools such as Jenkins, GitLab CI, Azure DevOps, and AWS CodePipeline.
Practical experience with observability stacks (monitoring, logging, alerting).
Working knowledge of the AWS and Azure Well-Architected Frameworks.
Preferred Qualifications
Cloud certifications such as AWS Certified DevOps Engineer – Professional or Microsoft Certified: Azure DevOps Engineer Expert.
Experience with containerization and orchestration tools like Docker, Kubernetes, EKS, and AKS.
Experience working in hybrid or multi-cloud environments.