Job Description

About You

This is a high‑impact, high‑expectation senior DevOps/SRE role supporting a Private Equity Group (PEG) platform. The team is new, the infrastructure is new, and the customer is extremely selective and private.

You must operate with strong autonomy, influence engineering decisions, challenge assumptions, and serve as a thought partner rather than an order‑taker. You will work directly with VP‑level stakeholders in a technical environment where reliability, deployment safety, and clarity in communication are essential.

This position requires someone who can own DevOps architecture end‑to‑end, enable safe and frequent deployments, and establish operational excellence from day zero.

Note: This position is offered under a contractor model for a period of 6 months.

You Bring to Applaudo the Following Competencies

Proven ownership of production-grade CI/CD pipelines using GitHub Actions reusable workflows and GitOps automation with ArgoCD.

Expert-level Kubernetes and EKS operations, including node group management, Karpenter autoscaling, RBAC, PDBs, and topology constraints.

Production-scale Terraform expertise, including module design, S3 + DynamoDB remote state, and PR-driven workflows via Atlantis.

Strong reliability engineering experience, including SLO/SLI design, alerting strategies, dashboards, incident response, and post-incident reviews.

Hands-on experience operating HashiCorp Vault, including auth backends, PKI, dynamic secrets, and audit logging.

Experience implementing supply-chain security controls, including image scanning and signing, SBOM generation, and policy enforcement with OPA/Gatekeeper.

Strong experience with observability stacks, including Prometheus, Grafana, Loki, Tempo, and Alertmanager.

Experience with service mesh technologies such as Istio, including traffic management, mTLS, AuthorizationPolicies, and circuit breaking.

Scripting ability using Python and Bash for automation and operational tooling.

Active use of AI-assisted engineering tools such as Cursor, GitHub Copilot, or Cloud Code to accelerate IaC development, incident response, and runbook generation.

Strong communication skills, with the ability to communicate clearly and confidently with VP-level stakeholders during operational incidents.

Advanced English proficiency, as you will work directly with US-based clients.

You Will Be Accountable for the Following Responsibilities

Design and maintain GitHub Actions reusable workflows across a multi-repository ecosystem.

Own GitOps deployments through ArgoCD, including promotion workflows, sync policies, drift detection, and automated rollback strategies.

Implement deployment safety mechanisms such as environment protections, concurrency rules, and verification gates.

Operate and upgrade EKS clusters, including Karpenter provisioning, node groups, and critical cluster add-ons.

Maintain Terraform-driven infrastructure and enforce PR-driven workflows through Atlantis.

Define and maintain SLOs, SLIs, alerting rules, and monitoring dashboards across platform services.

Lead incident response, coordinate recovery efforts, and execute structured post-incident reviews.

Participate in an on-call rotation and contribute to improving operational processes.

Operate and maintain HashiCorp Vault, including policies, authentication backends, and secret engines.

Implement supply-chain security controls, including Trivy scanning, Cosign signing, SBOM generation, and OPA/Gatekeeper enforcement.

Partner with Security Engineering on network policies, egress controls, and compliance standards.

Automate repetitive tasks and maintain proactive runbooks to reduce operational risk.

Use AI tools to improve infrastructure automation, documentation, and deployment safety validation.

Collaborate with product teams to strengthen SLOs and deployment safety practices.

Challenge technical assumptions and advocate for scalable, secure DevOps architectures.

Qualifications

Proven end-to-end ownership of production-grade Kubernetes/EKS environments, including Karpenter and Atlantis-driven Terraform workflows.

Demonstrated expertise with ArgoCD GitOps patterns.

Hands-on experience with HashiCorp Vault, supply-chain security controls, and structured incident response, including on-call rotations and post-incident reviews.

Active use of AI-assisted tools such as Cursor, GitHub Copilot, or Cloud Code as part of daily engineering workflow.

Additional Information

About Us

We Are Engineered Different.

At Applaudo, talented people design, build, and scale meaningful, AI-powered solutions that create real business impact. As an AI-native organization, we collaborate across design, development, cloud, data, and artificial intelligence to turn ideas into scalable products that transform how companies operate, make decisions, and grow.

We are building a high-performance culture grounded in five values: Empowering Excellence, Collaborative Teamwork, Unsolicited Respect, Consistent Transparency, and Efficient Communication. These define how we work, how we support one another, and how we hold ourselves accountable.

Applaudo is a place for people who want to learn fast, take ownership, and work alongside strong teams they are proud to belong to. Joining us means being part of an organization that is evolving intentionally, investing in modern ways of working, and leading AI-native transformation at scale.

Ready to Apply?

Don't miss this opportunity! Apply now and join our team.

Apply Now

Job Details

Publication Date: March 14, 2026

Job Type: Construction

Location: Brazil

Company: Applaudo


Senior SRE Engineer (Temporary Contract)
