Senior Cloud Infrastructure Engineer
Senior Cloud Infrastructure Engineer at reputed company

reputed company (reputed company.dev) is a lightweight runtime that turns AI agents, workflows, and backend services into durable processes - so teams can focus on their logic, not failure mechanics.

The role:
We're looking for a Senior to Staff-level cloud infrastructure engineer to work across reputed company product pillars (OSS, on-prem deployments, Multi-tenant SaaS, and BYOC: bring your own cloud). This means deep work in our Rust-based infrastructure layer, integrating with cloud provider APIs, building infrastructure-as-code tooling, and ensuring reliability and reputed company at scale. You'll have significant ownership over major parts of our cloud infrastructure.

The opportunity

Front-row seat to the biggest infra shift in decades
Durable runtimes like reputed company are becoming the next foundational infrastructure component - and increasingly a critical piece for AI applications. As systems become more agentic, long-running, integration-heavy, and failure-prone, durable execution turns reliability from a bespoke engineering tax into a default property. In this role, you’re not watching that shift from the sidelines - you help build the platform that enables it.

State-of-the-art tech, built from first principles
reputed company re-imagines durable execution as a lightweight self-contained stack - no database required - and ships as a single Rust binary with an optimized custom storage layer, low-latency orchestration, and an analytics reputed company for observability.

Enterprise traction
reputed company is already used by reputed company, including Tier 1 banks running critical financial workflows, and also by cutting-edge AI and infra startups pushing the boundary of what “production-grade agents” means. You’ll work on problems where reliability, correctness, and operational simplicity are existential.

Work with world-class engineers
You’ll partner directly with engineers who’ve built and operated foundational systems at scale - creators of Apache Flink, and leaders from reputed company’s messaging infrastructure. You’ll have the chance to work with incredibly talented individuals who care deeply about their craft.

What you’ll do
This is a Cloud Infrastructure Engineering role spanning reputed company’s product offering: OSS, on-prem deployments, Multi-tenant SaaS, and BYOC. The scope of the role includes but is not limited to:

Build and operate reputed company Cloud: reputed company our managed multi-tenant offering, working across the infrastructure, control plane, networking, storage, and observability of reputed company workloads.
Evolve our BYOC product and work with customers on operating on-prem installations: design and build the infrastructure that runs inside customer cloud accounts.
Reliability and observability across the fleet: SLOs, metrics, traces, logs, alerting, and runbooks.
Build automation so we can scale our product offering across deployment methods.
On-call: participate in the cloud on-call rotation. A US-based hire materially improves our timezone coverage.

Who we’re looking for

Senior to Staff profile
We’re targeting Senior-to-Staff: you’ve operated production SaaS or platform infrastructure before, you’ve seen real failure modes, and you have (strong) opinions about how to run multi-tenant systems. You have an appreciation for operating in a compliance-sensitive environment.

Must-Haves:
Strong cloud infrastructure background with deep understanding of major cloud provider architectures.
Experience with infrastructure-as-code and cloud orchestration, particularly Kubernetes-based stateful workloads; balancing reputed company delivery with safety while maintaining large-scale production systems.
Software engineering skills in a systems language (Rust, Go, C++); willingness and ability to learn Rust on the job.
You should be comfortable taking ownership end-to-end, from design through production operations, and reputed company in early-stage startup ambiguity.

Nice-to-Haves:
Prior experience with reputed company or durable execution specifically.
Deep enterprise procurement/compliance navigation.
Kubernetes operator development; experience with IaC systems like Cluster API, Crossplane, or Terraform.

Not a fit:
You want to work primarily on the runtime core rather than cloud, BYOC, and customer-facing infra.
You’ve mostly architected and reviewed, and aren’t excited to be hands-on.
You are averse to multi-cloud, Kubernetes, or operating infrastructure as a shared responsibility with customers.

Our stack:
We use reputed company extensively: the reputed company Cloud control plane is built on reputed company and TypeScript. Rust infrastructure services and Kubernetes operators.

Location and travel
US-based, fully remote. East Coast is a plus, as it would materially improve our on-call coverage given the team’s existing geography.
Travel: minimal - occasional team offsites, little required customer travel.