Job Description
Role:
Principal Site Reliability Engineer
Sector:
Online gaming
Job Type:
Full-Time
Contract type:
B2B contract
Location:
100% Remote
Salary range:
€100.000 – €130.000
Bonus:
15–20% (paid quarterly)
Before you apply, please note
Contract type: B2B only
On-call: 24/7 rotation — typically 1 week every 4 weeks
Our partner is a well-established online gaming technology company running a modern microservices-based SaaS platform. The system operates across multiple regions and handles massive transaction volumes every day, so reliability, speed, and stability are business-critical.
They’re growing their Platform function and hiring a Principal SRE to help keep production smooth, reduce risk, and continuously improve how the platform is monitored, deployed, and supported.
What you’ll be doing
Own day-to-day production health: alerts, checks, incident handling, and escalation when needed
Participate in a 24/7 on-call rotation for critical SaaS events
Keep clear documentation: incident notes, fixes applied, and prevention steps
Build and refine monitoring across AWS EKS / Kubernetes
Deploy and operate workloads using Terraform and Helm + Flux (GitOps)
Improve resilience with automated checks, scripts, and preventative fixes
Maintain and enhance infrastructure/deployment code
Evaluate and roll out new tooling to strengthen the cloud platform
Work closely with engineering and product teams to unblock issues fast
Plan releases and updates with a “customer impact first” mindset
Lead RCA / postmortems and drive actions that stop repeat incidents
Investigate alerts and make sure follow-ups land with the right owners
Handle environment-specific operational requests when required
What we are looking for
Strong hands-on experience with Kubernetes (deployments, scaling, troubleshooting)
GitOps/configuration tooling experience: FluxCD or ArgoCD
Solid incident ownership: RCA, postmortems, prevention actions
Good knowledge of AWS, Terraform, Docker, CI/CD
Strong observability experience: monitoring with Datadog / Prometheus / Grafana, and logging with ELK or AWS CloudWatch
Strong understanding of networking fundamentals and troubleshooting
Proficiency in at least one scripting language (Python / Node.js / Go)
Comfortable working with Git and modern collaboration workflows
Familiarity with incident tools like PagerDuty / Opsgenie / VictorOps
Mindset: ownership, proactiveness, persistence, and pride in running reliable systems
Perks & Benefits
Competitive compensation + annual performance/salary reviews
Quarterly bonus (15–20%) with a clear, realistic structure
Flexible schedule (outcomes over hours)
Fully remote setup
Medical insurance for you + 1
Support for life events + extended parental leave
Paid learning, courses, and professional training
Ready to Apply?
Don't miss this opportunity! Apply now and join our team.
Job Details
Posted Date:
February 28, 2026
Job Type:
Construction
Location:
Indonesia
Company:
Concentric Recruitment