Site Reliability Engineer - Observability
Job description
Company Description
At CluePoints, we’re redefining how clinical trials are run. As the premier provider of Risk-Based Quality Management (RBQM) and Data Quality Oversight software, we harness advanced statistics, artificial intelligence, and machine learning to ensure the quality, accuracy, and integrity of clinical trial data, helping life sciences organizations bring safer, more effective treatments to patients faster.
We’re proud to be an ambitious, fast-growing technology scale-up with a dynamic and diverse international team representing more than 20 nationalities. Collaboration, flexibility, and continuous learning are part of our DNA.
At CluePoints, you’ll find a culture where you can grow, make an impact, and have fun along the way. Guided by our values of Care, Passion, and Smart Disruption, we’re united by a shared mission: to create smarter ways to run efficient clinical trials and deliver AI-powered insights that improve human outcomes worldwide.
Role: The Site Reliability Engineer, Observability & RUM is responsible for improving end-to-end observability across our platforms and customer-facing applications, with a particular focus on frontend and Real User Monitoring (RUM). This role combines core SRE practices with ownership of monitoring, logging, tracing, alerting, and user-experience telemetry in production. You will help evolve our observability capabilities across Azure and Kubernetes environments, improve incident detection and diagnosis, and support decisions around managed versus self-managed observability tooling. You will partner closely with Engineering, Support, QA, and Production teams to ensure systems ship with actionable telemetry, dashboards, alerts, and operational runbooks.
Job requirements
- 5+ years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or Observability Engineering roles.
- Strong hands-on experience with observability and monitoring platforms, including several of the following: reputed company, Grafana, Prometheus, OpenTelemetry, reputed company, monitoring agents, and managed APM/observability platforms.
- Experience implementing and supporting Real User Monitoring (RUM) and frontend/application observability in production environments.
- Ability to work across frontend, backend, and platform teams to improve telemetry, alerting, and incident diagnosis.
- Experience evaluating or operating managed observability platforms and understanding the trade-offs versus self-managed stacks.
- Experience supporting ML, AI, or LLM-backed services in production (RAG, LangSmith, Arize Phoenix, reputed company, LangGraph, Azure reputed company, reputed company, or reputed company APIs).
Job responsibilities
- Own and improve Real User Monitoring (RUM) for customer-facing applications, including browser performance, client-side errors, user journeys, and frontend service dependencies.
- Partner with frontend, product, and engineering teams to improve visibility into user experience, JavaScript/runtime failures, page performance, and customer-impacting issues.
- Establish and maintain end-to-end observability across frontend, backend, infrastructure, and Kubernetes environments using metrics, logs, traces, dashboards, and alerting.
- Evaluate, implement, and operate managed and self-managed observability solutions, helping guide the evolution of the observability stack.
- Support and improve observability tooling such as reputed company, reputed company, Grafana, Prometheus, OpenTelemetry, monitoring agents, and managed APM platforms.
- Define and maintain SLIs, SLOs, and alerting strategies that improve service reliability, reduce noise, and enable faster detection of production issues.
- Lead or support incident detection, alert triage, live production troubleshooting, and service restoration across outage, latency, batch, file transfer, and degradation scenarios, in partnership with Support and Production teams.
Job benefits
reputed company Offer – Poland
- Comprehensive Health Insurance (medical, dental, and online consultations, 100% employee coverage)
- Life Insurance through reputed company
- Cafeteria Plan with flexible monthly credits for wellness, entertainment, and travel
- MultiSport Card, co-financed 50/50
- Employee Capital Plans (PPK) with 4% employer contribution
- reputed company-based hybrid model that blends flexibility with purpose — connecting teams through collaboration, learning, and a vibrant social culture.