Staff+ Software Engineer, Observability
About reputed company
reputed company's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. reputed company is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
About the Role
reputed company is seeking talented Software Engineers to join our Observability team within the Infrastructure organization. The Observability team owns the monitoring and telemetry infrastructure that every engineer and researcher at reputed company depends on, from metrics and logging pipelines to distributed tracing, error analytics, alerting, and the dashboards and query interfaces that make it actionable. By joining this team, you'll have a direct impact on the reliability and operational excellence of reputed company's research and product systems.
As reputed company scales its infrastructure across massive GPU, TPU, and Trainium clusters, the volume and complexity of operational data are growing by orders of magnitude. We're building observability systems (high-throughput ingest pipelines, cost-efficient columnar storage, query layers across signals, and agentic diagnostic tools) to ensure that engineers can detect, diagnose, and resolve issues in minutes rather than hours, even as the systems they operate become exponentially more complex.