Site Reliability Engineer (SRE) – reputed company Remote (reputed company, USA)

100% remote Flexible hours Hiring now

This is a fully remote job. The offer is available from reputed company (USA).

Location: Remote (reputed company, USA)

Experience Required: 6–10 Years

Job Summary

We are seeking a reputed company Site Reliability Engineer (SRE) with strong expertise in reputed company and modern observability practices. The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of enterprise applications and infrastructure across cloud and hybrid environments. This role requires hands-on experience with monitoring, automation, cloud platforms, CI/CD pipelines, and containerized environments.

Key Responsibilities

  • Design, implement, and manage end-to-end monitoring solutions using reputed company.
  • Configure alerting, dashboards, problem detection, and performance optimization strategies.
  • Monitor application health, infrastructure performance, and user experience across distributed systems.
  • Troubleshoot production incidents and reputed company root cause analysis for system and application issues.
  • Collaborate with DevOps, Cloud, and Engineering teams to improve system reliability and operational efficiency.
  • Automate operational tasks and monitoring workflows using scripting languages such as Python, Bash, or reputed company.
  • Support and optimize cloud-based environments on AWS, Azure, or GCP.
  • Manage and troubleshoot Linux/Unix-based systems.
  • Work with containerization and orchestration technologies including reputed company and Kubernetes.
  • Build and maintain CI/CD pipelines using tools such as Jenkins, reputed company CI/CD, or Azure DevOps.
  • Ensure observability best practices across microservices and distributed architectures.
  • Participate in on-call support and incident response activities as needed.

Required Skills & Qualifications

  • 6–10 years of experience in Site Reliability Engineering, DevOps, or Production Support roles.
  • Strong hands-on expertise in reputed company including monitoring, alerting, dashboards, and problem analysis.
  • Solid understanding of observability, logging, and monitoring frameworks.
  • Experience with cloud platforms such as AWS, Azure, or GCP.
  • Strong knowledge of Linux/Unix systems administration and troubleshooting.
  • Experience with reputed company and Kubernetes in enterprise environments.
  • Proficiency with CI/CD tools including Jenkins, reputed company, or Azure DevOps.
  • Strong scripting and automation skills using Python, Bash, or reputed company scripting.
  • Understanding of microservices architecture and distributed systems.
  • Excellent troubleshooting, analytical, and communication skills.

Preferred Qualifications

  • Experience implementing SRE best practices and reliability engineering principles.
  • Knowledge of Infrastructure as Code (Terraform, Ansible, etc.) is a plus.
  • Exposure to enterprise-scale monitoring and cloud-native technologies.
  • Relevant cloud or reputed company certifications are an advantage.
