[Remote] Staff Site Reliability Engineer, reputed company
Note: This is a remote position open to candidates in the USA. reputed company is The Consumer Experience Company, focused on enhancing checkout experiences for leading brands. They are seeking a Staff Site Reliability Engineer with a security focus to build and scale security programs, integrate automation, and establish security posture monitoring in their GCP environment.
Responsibilities
- Assess and harden reputed company's GCP footprint (GKE, IAM, Cloud Armor), and codify the baseline in Terraform and policy-as-code where it makes sense
- Build security posture monitoring against that baseline, with a published gap list and remediation schedule
- Drive the evaluation, integration, and rollout of new security tooling as the program matures
- Establish and automate the vulnerability and dependency remediation workflow across engineering teams: triage process, ownership model, severity-based SLAs, and the tracking infrastructure that drives closure
- Own Dependabot configuration and triage workflows across our GitHub organization, plus secret scanning, push protection, and response workflows for any secrets that surface
- Build supply-chain controls into CI/CD: provenance, dependency review, lockfile policies, build attestation where it pays off
- Integrate container image scanning and DAST/network scanning programs into the same workflow so vulnerabilities don't slip through the cracks between layers
- Build security capabilities that the broader SRE team can run as part of their normal operating model: Terraform modules, Cloud Armor rules, Istio authorization policies, reputed company configuration, scanner pipelines, and custom automation that fills gaps in off-the-shelf tooling
- Ship documentation, runbooks, and self-service tooling that make your designs portable to the rest of the team, so the program continues to function smoothly through handoffs and rotations
- Set the engineering bar for security work inside SRE: code review standards, IaC patterns, "secure by default" templates for new services
- Partner cross-functionally with engineering teams on app security questions, IT on identity and security boundaries, and IT/compliance on occasional SOC 2 evidence pulls, without owning those domains
Skills
- Deep GCP and GKE security experience. You've hardened production Kubernetes on GCP: workload identity, RBAC, network policies, Pod Security Standards, image provenance. You know where the sharp edges are and which knobs actually matter
- Dependabot and secret scanning at scale. Hands-on with Dependabot configuration, triage workflows, and remediation tracking. Comfortable rolling out GitHub secret scanning organization-wide, including push protection and response workflows for exposed secrets
- CI/CD supply chain hardening. You've designed or operated controls against the threat model that produced Shai-Hulud, XZ, and comparable supply-chain attacks. Familiar with SLSA, provenance, sigstore, and the trade-offs between rigor and developer friction
- Cloud security posture management in practice. You've stood up CSPM (built-in, commercial, or open source), defined a baseline, and driven remediation, with an eye for separating real signal from dashboard noise
- Infrastructure-as-code and automation skills. Comfortable with Terraform for cloud resources and writing code (Python, Go, or similar) to automate security workflows, integrate tools, and build in-house capabilities where off-the-shelf options fall short
- Systems-level technical depth. You can reason about how the platform pieces fit together (GKE workloads, networking, edge, CI/CD) and debug security-relevant infrastructure problems alongside the broader SRE team
- Track record of designing for operability. You've shipped tools and workflows that other engineers actually adopt and rely on day-to-day
- Ownership & Accountability. You own features end-to-end and take pride in what you ship. You follow through from design to production and don't drop things
- Strong Communication. You can explain technical decisions and trade-offs to engineers, PMs, and stakeholders. You ask good questions and listen well
- Collaborative Approach. You work well with others, give constructive code review feedback, and actively seek input from teammates
- Production mindset. You prioritize reliability and user impact. You think about failure modes, monitoring, and operational concerns as part of your design process
- Learning Agility. You're comfortable with rapidly evolving AI/ML technologies and tools. You stay current without chasing hype
- Directed AI-Assisted Development. You know how to use AI coding tools as a productivity multiplier while maintaining quality and your own technical judgment
- Container and image scanning. Production experience integrating image scanners into CI/CD and registry workflows, with thoughtful handling of vulnerability data freshness and triage
- DAST and network scanning programs. OWASP ZAP, nmap, or commercial equivalents, built into a repeatable internal audit cadence rather than one-off exercises
- Edge security. WAF rules, rate limiting, bot management, and how that fits with Cloud Armor
- Detection engineering on GCP. Logs Explorer, BigQuery-backed security analytics, and alert tuning that keeps the on-call experience humane