Job Description
Open Role || SRE with OpenShift || Hyderabad/Pune
Role: OpenShift & Site Reliability Engineering (SRE)
Job Description:
• Advise clients on setting up the OpenShift platform (on-premises, cloud, or PaaS) based on their requirements, including resource sizing.
• Configure OCP after it has been deployed by the client or the DevOps team.
• Ensure the OCP cluster is resilient against single points of failure (SPoF)
• Perform Day 2 configurations, such as installing the CSI Blob driver on ARO clusters to consume Azure Blob storage
• Deploy ODF and customize or create Storage Classes based on reclaim policy and volume binding mode requirements
• Integrate VMware with OCP so that the cluster can leverage the underlying hardware for cluster autoscaling and storage consumption
• Deploy Thanos to store cluster and workload metrics for longer durations, as ARO monitoring is limited in this respect
• Configure monitoring rules and Alertmanager to report cluster and application failures via email or ticketing tools
• Configure the OADP backup tool and test application backup/restore to meet RTO and RPO targets
• Ensure clusters comply with security standards and are free from vulnerabilities
• Perform cluster upgrades at regular intervals based on OCP end-of-life (EOL) dates and application compatibility
• Automate OCP Day 2 configurations using Ansible playbooks
Maximo Application:
• Support IBM MAS deployment, configuration, troubleshooting, and backup and restore validation
• Configure TLS certificates
• Amend network policies on the application namespace to allow communication with the Kubernetes API, so that app services can be stopped and started with oc commands via CronJob
• Troubleshoot issues during Tekton pipeline execution for IBM MAS Application Install/Upgrade activities.
Azure Platform:
• Azure storage management – create and manage Blob/File storage based on application and high-availability requirements (LRS or ZRS), enable cleanup policies to enforce data retention for logs/metrics on Blob storage, and enable Azure Backup for File storage.
• Integrate ARO with Azure Arc to leverage Azure Monitor and Log Analytics for alerting
• Rotate Azure service principal credentials for ARO clusters before they expire
Responsibilities:
• Deploy and manage OpenShift (RHOCP 4.x) environments from scratch on bare metal using Ansible scripts.
• Update inventories and deployment methods in Ansible playbooks to match the deployment environment.
• Build and manage Tanzu Kubernetes Grid on a bare-metal VMware platform.
• Multi-cluster administration – management cluster for FCAPS components and resource cluster for application components
• OpenShift Cluster backup/restore using Trilio
• Remediate OCP compliance findings reported by the Compliance Operator and Prisma security tools
• Prepare a mirror repository for OCP 4 deployment on restricted networks
• Onboard 4G and 5G CU applications using Helm charts on both RHOCP and VMware TKG.
• Cluster tuning for application onboarding, such as ODF/OCS, Performance Addon, Multus, SR-IOV, NMState, KubeVirt, Quay, and GitLab installation
• FCAPS installation for logging (Elasticsearch-Fluentd-Kibana), certificate management, and authentication (RH IdM)
• Work with AWS services: EC2, S3, Route 53, EBS, IAM, ELB, CloudWatch, Auto Scaling, VPC.
• Prepare high-level designs (HLD) for OCP/TKG single-node and HA setups.
Ready to Apply?
Don't miss this opportunity! Apply now and join our team.
Job Details
Posted Date:
March 18, 2026
Job Type:
Construction
Location:
Hyderabad, India
Company:
ITMC Systems, Inc