DevOps Manager

📍 India

Technology Deutsche Telekom Digital Labs

Job Description

Role Overview

We are seeking a DevOps Engineering Manager to lead cloud and platform engineering for AI-first teams, operating at the intersection of large-scale containerized production systems and next-generation Agentic AI and LLM deployments. This role is responsible for building and operating highly reliable, secure, and scalable platforms that support mature microservices-based workloads while enabling rapid experimentation and production rollout of Agentic AI systems. You will work closely with AI/ML, platform, and product teams across India and Europe to operationalize AI solutions at scale.

Key Responsibilities

• Define and own the cloud and platform architecture for large-scale containerized microservices and Agentic AI / LLM workloads, ensuring scalability, reliability, and cost efficiency.
• Lead CI/CD platform engineering, enabling automated build, test, security scanning, and deployment for backend services, React-based web applications, and mobile app backends.
• Enable production-grade AI platforms, supporting agent frameworks, vector databases, prompt pipelines, and inference.
• Define Infrastructure as Code standards, cloud account structures, networking, and environment provisioning across AWS and secondary clouds.
• Implement and enforce SRE practices: define SLIs/SLOs, error budgets, and capacity and reliability targets, and lead incident response and post-incident reviews.
• Ensure end-to-end observability across services and AI workloads, including logs, metrics, traces, model performance, and cost visibility.
• Embed security, compliance, and governance by design, including IAM, secrets management, network security, vulnerability management, and AI-specific controls.
• Make informed build vs. buy decisions, evaluate emerging cloud and AI infrastructure technologies, and drive continuous platform modernization.

Must Have

• 10+ years of experience in DevOps / Cloud / Platform Engineering, including people management and technical leadership.
• Deep hands-on expertise with AWS, with working exposure to GCP and Azure in multi-cloud or hybrid environments.
• Proven experience operating large-scale, production-grade containerized workloads in global teams, with a strong understanding of high availability, fault tolerance, and capacity planning.
• Practical experience supporting AI/ML or LLM workloads in production environments.
• Strong expertise in Kubernetes and Docker, including cluster operations, workload isolation, ingress, service meshes, and deployment strategies.
• Advanced experience with Infrastructure as Code for cloud provisioning, networking, security controls, and environment standardization across multiple stages.
• Solid understanding of observability and reliability engineering, including metrics, logging, tracing, alerting, and defining SLIs/SLOs for distributed systems and AI services.
• Hands-on exposure to cloud security and compliance practices, including IAM design, secrets management, vulnerability scanning, and secure deployment patterns, especially for AI platforms.
• Knowledge of cloud cost optimization (FinOps), especially for AI workloads.
• Background in strong product-based organizations solving real customer-facing problems.

Leadership and Mindset

• Strong AI-first mindset, with the curiosity and adaptability to turn rapid AI innovation into stable production systems.
• Strategic thinker with hands-on technical depth.
• Excellent communication and collaboration skills in global, distributed teams.
• Ownership-driven leader who builds accountable teams and fosters a culture of reliability, automation, and continuous improvement.

Ready to Apply?

Don't miss this opportunity! Apply now and join our team.

Job Details

Posted Date: February 28, 2026
Job Type: Technology
Location: India
Company: Deutsche Telekom Digital Labs
