Job Description
About FoundingTeams.ai
FoundingTeams.ai is an AI pre-accelerator and talent platform. We help founders hire engineering, design, and marketing talent for their AI startups.
About the Role
As our Lead DevOps Engineer, you’ll design, implement, and maintain scalable, reliable, and secure infrastructure and CI/CD pipelines. You’ll mentor a team of DevOps engineers, partner closely with development teams, and drive best practices around automation, monitoring, and incident response.
Key Responsibilities
Infrastructure Architecture:
Lead design of cloud-native (AWS/Azure/GCP) microservices and data-platform architectures.
Define Infrastructure-as-Code (IaC) standards using tools like Terraform or CloudFormation.
CI/CD & Automation:
Build and optimize end-to-end pipelines (Jenkins, GitLab CI, GitHub Actions).
Automate testing, packaging, and deployments for containerized applications (Docker, Kubernetes).
Monitoring & Observability:
Implement logging, metrics, and tracing (Prometheus, Grafana, ELK/EFK stack, Datadog, New Relic).
Establish SLOs/SLIs, alerting, and runbooks for on-call rotations.
Security & Compliance:
Integrate security scanning (SAST/DAST) into pipelines (e.g., SonarQube, Snyk).
Enforce IAM best practices and secrets management (Vault, AWS KMS, Azure Key Vault).
Performance & Cost Optimization:
Right-size cloud resources, optimize autoscaling, and manage reserved instances.
Conduct regular architecture reviews to identify bottlenecks.
Leadership & Mentorship:
Coach and grow a team of DevOps/SRE engineers.
Drive cross-functional training on DevOps tooling and practices.
Incident Management:
Lead post-mortem analyses and continuous improvement of incident response processes.
Required Skills & Experience
5+ years in DevOps, SRE, or infrastructure engineering roles
Cloud Expertise:
Deep hands-on experience with at least one major cloud provider (AWS, Azure, or GCP)
IaC & Configuration Management:
Terraform, AWS CloudFormation, or Pulumi
Ansible, Chef, or Puppet
CI/CD:
Jenkins, GitLab CI/CD, GitHub Actions, or CircleCI
Containerization & Orchestration:
Docker, Kubernetes (EKS/GKE/AKS)
Monitoring & Logging:
Prometheus/Grafana, ELK/EFK, Datadog, New Relic, or equivalent
Scripting & Automation:
Python, Go, Bash, or similar
Security Tools:
Familiarity with Vault, AWS IAM, and security scanners (Snyk, Trivy)
Networking Fundamentals:
VPCs, load balancers, DNS, CDN, firewalls
Soft Skills:
Strong communication and collaboration across teams
Proven leadership: mentoring, code reviews, and guiding best practices
Nice-to-Haves
Database Operations:
Experience with managed databases (RDS, Cloud SQL, Cosmos DB) and NoSQL (DynamoDB, MongoDB)
Serverless & Edge:
Lambda/FaaS, CloudFront, Akamai, or Fastly
Observability Platforms:
OpenTelemetry, Jaeger, Splunk
Cost-Management:
AWS Cost Explorer, Azure Cost Management
Certifications:
AWS Certified DevOps Engineer, Certified Kubernetes Administrator (CKA), or equivalent