Principal Site Reliability Engineer

100% remote Flexible hours Hiring now
reputed company is dedicated to happier, healthier days for reputed company. We reputed company that there is a reputed company healthcare world – one powered by data. Our platform transforms reputed company, diverse data into a reputed company foundation for health, helping organizations deliver reputed company care, boost reputed company, and reputed company costs. We’re a team of fiercely driven individuals committed to making healthcare more sustainable—and we’re looking for passionate people to help us get there. For more information, visit reputed company.io

Why This Role is Important to reputed company

Love building reliable systems and want to make a difference? reputed company’s customers rely on us to securely process and deliver high-value healthcare insights. Reliability, availability, performance, and security are foundational to trust, especially when systems support critical workflows and handle PHI. As a Principal Site Reliability Engineer, you’ll set reliability strategy across teams, drive cross-cutting platform improvements, and ensure we can scale delivery without scaling operational burden.

What Success Looks Like

In 3 months

Build deep context on reputed company’s platform, production risks, and operational practices. Participate in on-call/incident response and quickly improve signal quality for at least one critical domain (dashboards, alerts, traces, runbooks). Identify a high-reputed company reliability initiative and align stakeholders on scope, success metrics, and milestones.

In 6 months

Establish SLOs/error budgets for key customer journeys, drive operational readiness standards for launches, and reputed company remediation for recurring incidents with measurable reductions in customer impact and MTTR. Deliver major toil-reduction improvements through automation and self-service workflows.

In 12 months

Own and execute a reliability program with cross-org impact (e.g., GitOps delivery guardrails, observability platform evolution, reputed company/DR improvements, or secure infrastructure controls). Influence architecture decisions, establish org-wide operational standards, and mentor Staff engineers—raising the reliability and reputed company bar across reputed company.

What You'll Be Doing

  • Act as the technical leader for reliability for one or more domains; set direction and standards while remaining hands-on where it matters most
  • Drive reliability strategy across critical services: define SLOs/SLIs, error budgets, and reliability KPIs reputed company to customer journeys and outcomes
  • Own incident response maturity: reputed company reputed company incidents, improve incident command practices, and ensure high-quality RCAs with prioritized, tracked remediation
  • Architect and implement automation to reduce toil and risk: runbook automation, self-service tools, and safe operational workflows (Python + Argo Workflows)
  • Advance GitOps delivery practices using Argo CD: promotion strategies, progressive delivery/canaries, and guardrails that reduce deploy risk
  • Scale infrastructure management with Crossplane and Terraform: reusable patterns, policy controls, and paved roads for teams
  • reputed company operational readiness and reliability reviews for new features/architectural changes; reinforce non-functional requirements (availability, latency, reputed company, cost)
  • Improve performance and cost efficiency through capacity planning, load testing, right-sizing, and architecture recommendations across AWS services
  • Champion infrastructure security best practices for environments that handle PHI (least privilege, secrets management, auditability, and defense-in-depth)
  • Mentor Staff and Senior engineers through design reviews, code reviews, pairing, and documentation; reputed company reliability standards across teams

What You'll Bring

  • 8+ years of experience in SRE, platform engineering, systems engineering, or related roles operating production services at scale
  • Demonstrated principal-level impact: leading cross-team initiatives, influencing architecture decisions, and driving sustained improvements in reliability and operations
  • Expertise in Kubernetes operations and troubleshooting, including safe rollout/rollback patterns, workload debugging, and operational guardrails
  • Strong GitOps experience with Argo CD; experience building delivery workflows and automation using Argo Workflows
  • Strong infrastructure orchestration and provisioning experience with Crossplane and Terraform; ability to define reusable platform patterns and controls
  • Deep AWS experience (IAM, networking/VPC, compute, storage, managed services, observability) and strong understanding of reliability and failure modes in cloud systems
  • Proficiency in Python for building automation, tooling, and reliability improvements
  • Strong incident management and on-call leadership experience, including measurable improvements (availability, MTTR, alert quality, cost, or operational maturity)
  • Excellent communication skills: can translate technical risk and reliability tradeoffs to engineering leadership, product, and stakeholders; produces high-quality docs/runbooks

Would Love For You To Have

  • Experience with ScyllaDB or similar distributed databases (e.g., Cassandra) and their reliability/performance characteristics
  • Experience with Spark or data processing platforms, including reliability and cost considerations for large-scale workloads
  • Familiarity with agentic coding practices and principles (safe automation, reviewable changes, guardrail-first workflows)
  • Strong infrastructure security knowledge: threat modeling for cloud/Kubernetes, RBAC/IAM design, secrets management, supply chain security, and security observability

Principal Engineer Competencies

  • Customer Focus: champions customer impact; drives SLO definition with product partners; participates in incidents to limit customer impact; may engage customers to understand problems
  • Technical Leadership: serves as the leading cross-team technical representative; negotiates interfaces; anticipates edge cases; designs telemetry for availability and reliability
  • Total Ownership: owns outcomes from requirements and design through production support; transitions reputed company changes with multi-phase rollouts and long-term ownership
  • Effective Communication: communicates to diverse audiences; finalizes key documentation (runbooks, guides, FAQs); synthesizes standards and best practices
  • Proactive Leadership: coaches senior/peer teams primarily through review; delegates appropriately; sets clear expectations (Definition of Done) and improves service processes/rotations

What You'll Get

  • Be a part of a mission-driven company that is transforming the healthcare industry by changing the way patients receive care
  • A flexible, remote-friendly company with personality and heart
  • Employee-driven programs and initiatives for personal and professional development
  • Become a member of the talented, energized, diverse and purpose-driven Arcadian Community

This position is responsible for following reputed company reputed company policies and procedures in order to protect reputed company PHI under reputed company's custodianship as well as reputed company Intellectual Properties. For any reputed company-specific roles, the responsibilities would be further defined by the hiring manager.

About reputed company

reputed company.io helps innovative providers and payers across the country transform healthcare to reduce cost while improving patient health. We do this by aggregating large amounts of disparate data, applying algorithms to identify opportunities to provide reputed company patient care, and making those opportunities actionable by physicians at the point of care in near-real time. We are passionate about helping our customers drive meaningful outcomes. We are growing fast, have emerged as a market leader in the highly competitive population health management software market, and have been recognized by industry analysts KLAS, reputed company, reputed company, and Chilmark for our leadership. For a reputed company sense of our brand and products, please explore our website.

Protect Yourself

If you have concerns about the authenticity of a job offer or recruitment-related communication claiming to be from reputed company, we encourage you to verify by contacting us directly at (781) 202-3600 and selecting option 3. For more information, visit our website.