Job Description
About OffSec
Founded in 2006 by the creators of Kali Linux, OffSec (formerly known as Offensive Security) is the leading provider of continuous professional and workforce development, training, and education for cybersecurity practitioners. OffSec’s distinct pedagogy and practical, hands‑on learning help organizations fill the infosec talent gap by training their teams on today’s most critical skills.
Become a part of our global presence
With team members in over 40 countries, we believe in inspiring people of all backgrounds and communities. The OffSec team is composed of diverse, internationally published authors, conference speakers, and seasoned information technology professionals from both the private sector and governments worldwide.
Excited about our mission and what we do? Apply and join us!
About the Job
OffSec is seeking an experienced Senior SRE to join our team and lead the design and implementation of complex, scalable lab environments that power our industry‑leading cybersecurity training and certification programs. This senior‑level position will work closely with Security Researchers and Platform Architects to architect sophisticated labs and vulnerable machine environments across hybrid cloud and on‑premises infrastructure, enabling hands‑on learning experiences for cybersecurity professionals worldwide.
The ideal candidate will bring deep expertise in OpenStack and modern SRE practices, with proven experience in large‑scale infrastructure migrations and cost optimization. You’ll design resilient, scalable, and secure infrastructure that deploys lab environments supporting thousands of concurrent users while maintaining the realistic attack scenarios our students depend on.
Duties and Responsibilities
Design and architect complex global data centers for labs supporting vulnerable machines and realistic attack scenarios using OpenStack
Develop scalable infrastructure solutions across hybrid cloud and on‑premises environments
Design secure hosting networks and network topologies that support realistic offensive cyber activities
Establish infrastructure standards, patterns, and best practices for lab environment deployment
Create architectural solutions that reduce infrastructure costs while improving capabilities and performance
Lab Environment Specialization
Implement network isolation for thousands of concurrent user lab instances
Optimize lab deployment speed and resource utilization for peak performance
Create infrastructure supporting the deployment of concurrent vulnerable machine instances at scale
Design workspace‑based deployment models enabling team collaboration and private lab sessions
Strategic Technical Leadership & Collaboration
Partner closely with Lead Platform and Content Engineers to proactively identify and solve infrastructure requirements
Provide strategic technical guidance and mentorship to development and operations teams
Lead architectural reviews and challenge requirements to propose optimal technical solutions
Drive adoption of infrastructure‑as‑code and automated deployment practices
Identify process improvements and optimization opportunities before being asked
Develop infrastructure automation using known Infrastructure as Code frameworks
Create self‑service capabilities for Content Engineers to deploy and manage lab resources efficiently
Implement comprehensive monitoring, logging, and observability solutions for lab environments
Establish disaster recovery and business continuity procedures with minimal downtime requirements
Automate repetitive tasks to help reduce toil
Optimize application and infrastructure performance through automation and tuning
Write runbooks to automate repetitive tasks using Ansible and Terraform
Serve as a knowledge resource for the rest of the team on Ansible and Terraform
Evaluate new and emerging products and technologies, and make recommendations on whether to introduce them
Conduct ongoing research into relevant technology stacks and architectural patterns, assessing their potential impact and value for internal use
Assist in monitoring performance to address errors and bottlenecks
Respond to and resolve infrastructure incidents and outages
Participate in on‑call rotations to ensure service reliability
Design complex network architectures including VPNs, VLANs, and software‑defined networking
Implement network segmentation and security controls appropriate for vulnerable lab environments
Configure and manage load balancers, firewalls, and network security appliances
Design network monitoring and traffic analysis capabilities
Ensure proper isolation between student lab environments while maintaining performance
Qualifications
Technical Expertise
OpenStack: Production experience with OpenStack deployment, management, and optimization
Cloud Platforms: 5+ years hands‑on experience with AWS, Azure, and Google Cloud Platform
Virtualization: Expert‑level knowledge of OpenStack
Networking: Deep understanding of TCP/IP, routing protocols, VPNs, firewalls, and network security
Infrastructure as Code: Proficiency with a framework such as Terraform, CloudFormation, or ARM templates, as well as configuration management tools
Containerization: Experience with Docker, Kubernetes, or other container orchestration tools
Operating Systems: Advanced knowledge of Linux and Windows Server
Professional Experience
4+ years of experience in Site Reliability Engineering (SRE) or Infrastructure Architecture roles
2+ years in a senior or lead technical role with architectural responsibilities
Proven track record of designing and implementing large‑scale, distributed systems
Demonstrated experience with infrastructure cost optimization and migration projects
Experience with high‑availability and disaster recovery implementations
Background in cybersecurity, penetration testing, or vulnerability research environments (preferred but not a requirement)
Strategic & Analytical Skills
Proactive problem‑solving with ability to understand broader context and implications
Experience identifying and proposing solutions before being asked
Ability to challenge requirements and suggest alternative approaches
Track record of improving processes and identifying optimization opportunities
Experience making architectural decisions with incomplete information
Performance Expectations
Ability to work independently with minimal supervision while maintaining high quality standards
Track record of successful large‑scale infrastructure projects delivered on time
Experience mentoring team members and driving technical standards adoption
Leadership & Communication
Strong project management and technical leadership skills
Excellent communication abilities with both technical and non‑technical stakeholders
Experience mentoring junior engineers and driving technical standards
Ability to translate business requirements into cost‑effective technical solutions
Preferred Qualifications
Experience supporting systems through an SDLC (Dev, Staging, and Prod workflows)
Experience with cybersecurity tools and vulnerable application deployment
Experience with monitoring tools like Prometheus, Grafana, ELK stack
Familiarity with OffSec’s training platforms and methodologies
Open source contributions or technical writing experience
This role is a full‑time position. Work hours are flexible, and the work is performed from a home office.
This position has no direct reports.
OffSec provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation and training.