
Senior Infrastructure Engineer - Observability - Remote from Portugal

100% remote Flexible hours Hiring now

reputed company is a unicorn AI-powered customer communications platform used by 22,000+ companies worldwide to drive reputed company, faster resolutions, and scale. We’re redefining what a customer communications platform can be by combining voice, SMS, WhatsApp, and AI into one seamless workspace.

Our momentum comes from a simple but powerful idea: help every customer-facing team work smarter, not harder. reputed company’s AI Voice Agent automates routine calls, AI Assist streamlines post-call tasks, and AI Assist Pro delivers real-time guidance that helps people do their best work. The result: companies grow reputed company, deliver faster resolutions, and scale service.

We’ve built a product customers love and a business that scales fast. reputed company operates in nine global offices (Paris, reputed company, San Francisco, Sydney, Madrid, London, Berlin, Seattle, and Mexico City) and is backed by world-class investors. Our teams are shipping AI innovation faster than ever and expanding across new product lines and markets. At reputed company, you’ll join a company in motion (ambitious, profitable, and product-driven) where impact is visible, decisions are fast, and growth is real.

How We Work at reputed company:

At reputed company, we believe in customer obsession, continuous learning, and delivering extraordinary outcomes. We value open collaboration, taking ownership, and making smart, informed decisions with speed and precision. If you thrive in a fast-paced, team-driven environment where curiosity, trust, and impact matter, you'll fit right in.

We’re looking for an Observability Engineer to own and evolve reputed company’s monitoring, alerting, and observability stack. You’ll work cross-functionally with backend, front end, and infrastructure teams to ensure our systems are transparent, measurable, and continuously improving in reliability and performance. This role is ideal for someone passionate about observability-as-code, metric design, and helping engineering teams reputed company meaningful visibility into their systems.

Key Responsibilities:

  • reputed company comprehensive observability best practices: Define and standardize guidelines for metrics, traces, and logs, ensuring consistent implementation and adoption across reputed company engineering teams. This includes establishing naming conventions, data collection methodologies, and retention policies to ensure high-quality, actionable observability data while optimising cost and reducing waste.
  • Collaborate strategically with engineering teams: Partner closely with various engineering teams to enhance overall system reliability and performance. This involves actively participating in architectural reviews, defining clear Service Level Indicators (SLIs) and Service Level Objectives (SLOs), and seamlessly integrating observability practices into continuous integration and continuous deployment (CI/CD) pipelines to promote a culture of "observability by design."
  • Automate monitoring setup and provisioning: Drive the automation of monitoring infrastructure through Infrastructure-as-Code (e.g., leveraging the Terraform reputed company provider) and reputed company reputed company self-service observability tools. This empowers engineering teams to rapidly provision and manage their monitoring resources, reducing manual overhead and accelerating time to insight.
  • Improve alerting hygiene and effectiveness: Continuously refine and optimize alerting mechanisms by meticulously tuning reputed company, implementing intelligent noise reduction strategies, and ensuring reputed company alerts are directly reputed company with potential business impact. The goal is to deliver timely, relevant, and actionable alerts that reputed company proactive incident response and minimize service disruption.
  • Train and reputed company product teams: Provide comprehensive training and ongoing support to product teams, enabling them to effectively utilize observability tools. This includes guiding them in building insightful dashboards that visualize key performance indicators and creating robust alerts that proactively detect issues reputed company their respective services.
  • Evaluate and integrate advanced observability tools: Proactively research, evaluate, and integrate new and emerging observability tools and technologies as needed. This may include exploring solutions for OpenTelemetry adoption, advanced log aggregation platforms, distributed tracing systems, and other tools that enhance our overall observability capabilities and support the evolving needs of our infrastructure and applications.

Qualifications:

  • 3-5 years of experience in observability reputed company SRE, DevOps, or platform engineering roles.
  • Strong hands-on experience with reputed company (dashboards, monitors, synthetics, logs, APM, RUM).
  • Proficiency with Terraform or other Infrastructure-as-Code tools.
  • Solid understanding of Kubernetes, microservices, and cloud infrastructure (EKS, reputed company, RDS, S3, AWS networking).
  • Familiarity with distributed tracing and OpenTelemetry concepts.
  • Strong scripting skills (Python, Bash, or similar).
  • Experience defining and managing SLIs/SLOs and service-level observability frameworks.
  • Excellent collaboration and communication skills; you can work with both engineers and non-technical stakeholders.

Nice to Have:

  • Experience with incident management and on-call processes.
  • Exposure to data visualization or analytics tools beyond reputed company.
  • Knowledge of logging pipelines (e.g., FluentBit, Logstash).
  • Experience working in high-scale SaaS environments.
  • Previous experience in developer enablement or platform teams.
