Job Description
Role: Senior Software Engineer, Platform Reliability Operations
Location: LATAM (Remote)
Type: Contract
Responsibilities:
Analyze and improve system design to reduce failure modes and promote self-healing systems
Establish and maintain robust systems that facilitate observability, encompassing logging, monitoring, distributed tracing, alerting, and offline test tools.
Work with development partners to shape the architecture, design, and implementations of new and existing systems to enhance their reliability, performance, efficiency, and scalability
Work both independently and as part of a geographically dispersed yet integrated team.
Collaborate with service engineers to establish Service Level Agreements (SLAs) and Service Level Objectives (SLOs) for backend services.
Identify the signals that indicate how well an application is performing, and know how to improve or repair its performance
Assess options and recommend solutions when information is limited or unclear. This position requires comfort and confidence in dealing with uncertain situations.
Work seamlessly within a team while also managing individual tasks
Respond to emerging incidents, solve critical issues, and follow through with a plan for resolution or future mitigation
Act as an SME on the Engineering Operations team, partnering with backend services teams and application teams to overcome challenges across all the platforms where we stream our service
Required Skills:
5+ years’ experience in software development
Degree in Computer Science or a related field, or equivalent work experience
Solid engineering and coding skills, data structure knowledge, and the ability to write high-performance, production-quality code.
Experience building service-oriented APIs and cloud services
Experience designing, implementing, and deploying microservices
Deep, hands-on technical experience with server software
Proficient in Golang and JavaScript, and quick to learn new languages.
Experience in the Linux environment and a good understanding of its fundamentals and internals: filesystems, modern memory management, threads and processes, the user/kernel-space divide, etc.
A good understanding of large-scale distributed systems in practice, including multi-tier architectures, application security, monitoring, and storage systems.
Working knowledge of the TCP/IP stack, internet routing, and load balancing.
Grit, drive, and a deep feeling of ownership.
Bonus Points for Experience with the following:
Golang
TypeScript
Kubernetes
Terraform
OpenTelemetry
eBPF
Datadog
Helm Charts
HLS video transcoding, distribution & playback
Experience designing, implementing, and running services in high-demand, high-traffic environments
Experience with high-availability services
Apply today or share profiles to
Ready to Apply?
Don't miss this opportunity! Apply now and join our team.
Job Details
Date Posted:
February 19, 2026
Job Category:
Technology
Location:
Brazil
Company:
GeorgiaTEK Systems Inc.