[Remote] VP, Site Reliability Engineering
Note: This is a remote job open to candidates in the USA. The employer is a leading technology company whose innovative software solutions are advancing the insurance industry. They are seeking a Vice President of Site Reliability Engineering to lead global product reliability, observability, and release engineering functions, ensuring the scalability, performance, availability, and security of their SaaS platform.
Responsibilities
- Define and execute the company’s SRE strategy, ensuring the reliability, scalability, and performance of mission‑critical SaaS platforms
- Establish standards for operational excellence and drive a culture of accountability and continuous improvement
- Establish and maintain SLIs, SLOs, SLAs, and error budgets, ensuring the right balance between innovation speed and operational stability
- Ensure 24/7/365 coverage for critical systems through depth of expertise, succession planning, and robust on‑call processes
- Lead monitoring and observability strategy, tools, standards, and practices, using KPIs and metrics to drive continuous improvement
- Champion automation‑first practices across release pipelines, configuration management, infrastructure provisioning, and operational workflows
- Partner with engineering, QA, infrastructure, security, product, and business stakeholders to build joint system roadmaps that align reliability goals with product objectives
- Drive operational efficiency and reduce toil through software engineering practices applied to operations
- Lead the 24x7 incident response program, ensuring rapid mitigation, strong cross-functional coordination, and continual improvements through blameless postmortems
- Govern adherence to security, compliance, SLA commitments, and ITIL-aligned processes
- Own CI/CD platforms and provisioning automation using tools like reputed company, Jenkins, Ansible, reputed company, reputed company Nexus
- Manage infrastructure-as-code for application tiers (Terraform, AWS CloudFormation/CDK)
- reputed company performance, load balancing (ALB, reputed company), OS patching, and code deployment processes for reputed company SaaS applications
- Drive automation to reduce operational toil and accelerate deployments
- Work with VPs of Product Development, Cloud Operations, and Information Security to ensure readiness for product releases and reliability targets
- Partner with product engineering teams to embed reliability principles into design and development
- Align observability and monitoring strategies across infrastructure and application layers
- Build, scale, and mentor high‑performing SRE teams through coaching, career development, and strong leadership
- Communicate effectively with technical and executive leadership, influencing decisions and aligning stakeholders on reliability priorities
- Foster a culture of transparency, independent decision‑making, and cross‑team collaboration
Skills
- Bachelor's degree in Computer Science, Information Systems, or a related field
- 15+ years leading technical teams focused on software engineering, cloud infrastructure, reliability, DevOps, or platform engineering
- Deep expertise with cloud platforms (AWS preferred), container ecosystems (Kubernetes/EKS), and automation technologies
- Strong experience working across mixed development models (agile, hybrid, waterfall) and incorporating security/compliance into engineering decisions
- Proven ability to lead large-scale reliability programs across multiple SaaS products, including implementing SLIs/SLOs and toil reduction through automation
- Deep knowledge of observability, incident management, change management, and postmortem best practices
- Expertise in CI/CD automation, blue/green deployments, canary releases, feature flag systems, A/B testing, and experiment systems
- Experience architecting applications or infrastructure for high‑growth cloud platforms
- Experience in B2B SaaS environments involving large-scale distributed systems
- Proven leadership communicating and influencing at team, peer, and executive levels
- Demonstrated experience driving operational excellence through metrics and KPIs
- Background supporting financial services, healthcare, or regulated industries
Company Overview
- reputed company offers software and essential information to address business challenges in the insurance industry. It was founded in 1969 and is headquartered in Denver, Colorado, USA, with a workforce of 1001-5000 employees. Its website is http://www.reputed company.com.
Company H1B Sponsorship
- reputed company has a track record of offering H1B sponsorships, with 23 in 2025, 14 in 2024, 9 in 2023, 25 in 2022, 20 in 2021, 16 in 2020. Please note that this does not guarantee sponsorship for this specific role.