Job Description

Role: Principal Site Reliability Engineer

Sector: Online gaming

Job Type: Full-Time

Contract type: B2B contract

Location: 100% Remote

Salary range: €100.000 – €130.000

Bonus: 15–20% (paid quarterly)

Before you apply, please note:

- Contract type: B2B only
- On-call: 24/7 rotation, typically 1 week every 4 weeks

Our partner is a well-established online gaming technology company running a modern microservices-based SaaS platform. The system operates across multiple regions and handles very high transaction volumes every day, so reliability, speed, and stability are business-critical.

They’re growing their Platform function and hiring a Principal SRE to help keep production smooth, reduce risk, and continuously improve how the platform is monitored, deployed, and supported.

What you’ll be doing

- Own day-to-day production health: alerts, checks, incident handling, and escalation when needed
- Participate in a 24/7 on-call rotation for critical SaaS events
- Keep clear documentation: incident notes, fixes applied, and prevention steps
- Build and refine monitoring across AWS EKS / Kubernetes
- Deploy and operate workloads using Terraform and Helm + Flux (GitOps)
- Improve resilience with automated checks, scripts, and preventative fixes
- Maintain and enhance infrastructure/deployment code
- Evaluate and roll out new tooling to strengthen the cloud platform
- Work closely with engineering and product teams to unblock issues fast
- Plan releases and updates with a “customer impact first” mindset
- Lead RCA / postmortems and drive actions that stop repeat incidents
- Investigate alerts and make sure follow-ups land with the right owners
- Handle environment-specific operational requests when required

What we are looking for

- Strong hands-on experience with Kubernetes (deployments, scaling, troubleshooting)
- GitOps/configuration tooling experience: FluxCD or ArgoCD
- Solid incident ownership: RCA, postmortems, prevention actions
- Good knowledge of AWS, Terraform, Docker, CI/CD
- Strong observability experience
  - Monitoring: Datadog / Prometheus / Grafana
  - Logging: ELK or AWS CloudWatch
- Strong understanding of networking fundamentals and troubleshooting
- Proficiency in at least one scripting language (Python / Node.js / Go)
- Comfortable working with Git and modern collaboration workflows
- Familiarity with incident tools like PagerDuty / Opsgenie / VictorOps
- Mindset: ownership, proactiveness, persistence, and pride in running reliable systems

Perks & Benefits

- Competitive compensation + annual performance/salary reviews
- Quarterly bonus (15–20%) with a clear, realistic structure
- Flexible schedule (outcomes over hours)
- Fully remote setup
- Medical insurance for you + 1
- Support for life events + extended parental leave
- Paid learning, courses, and professional training

Job Details

Posted Date: February 28, 2026

Job Type: Construction

Location: Indonesia

Company: Concentric Recruitment
