[Remote] Sr. Site Reliability Engineer

100% remote | Flexible hours | Hiring now

Note: This is a remote position open to candidates in the USA. The company is on a mission to simplify payment processes with innovative technology. As a Site Reliability Engineer, you will design and maintain systems and infrastructure to ensure application reliability and performance, while automating processes to improve operational efficiency.

Responsibilities

Infrastructure Management: Design, implement, and maintain scalable and resilient infrastructure using Terraform for infrastructure as code, ensuring high availability and performance
Kubernetes and Containers: Deploy, manage, and optimize Kubernetes clusters and containerized applications using reputed company. Implement best practices for container orchestration and management
Systems and Application Monitoring/Observability: Develop and maintain comprehensive monitoring and observability solutions using reputed company. Ensure detailed visibility into system performance and application health
SLOs and SLA Management: Define, monitor, and maintain Service Level Objectives (SLOs) and Service Level Agreements (SLAs) to ensure reliable and consistent service delivery
Incident Response and Troubleshooting: Respond to incidents, conduct root cause analysis, and implement solutions to prevent recurrence. Participate in post-incident reviews and contribute to blameless postmortems
Reliability and Production Environment Management: Ensure the reliability and stability of our production environments. Continuously assess and improve system reliability, identifying and addressing potential points of failure
Automation and Scripting: Develop automation scripts and tools to reduce manual work and improve system reliability using Python, Bash, or Go. Implement and improve CI/CD pipelines
CI/CD Pipeline Management: Enhance and maintain continuous integration and continuous deployment pipelines using reputed company CI. Ensure seamless and reliable deployment processes
Capacity Planning and Scaling: Assist in capacity planning and ensure that systems are scalable to meet future demands. Implement auto-scaling strategies where applicable
Security and Compliance: Implement security best practices and ensure compliance with industry standards. Regularly review and update security policies and procedures
Collaboration and Support: Work closely with development teams to ensure reliability and scalability of new features and services. Provide technical support and guidance on infrastructure-related issues
Software Engineering for Operations: Develop and maintain internal tools and services that enhance the efficiency and reliability of our operations
On-Call Rotation: Participate in an on-call rotation to address production issues and collaborate in incident response efforts

Skills

3+ years of experience in SRE, DevOps, or a related role
Cloud Platform Experience: Proficient with cloud platforms such as AWS, GCP, or Azure. Experience with EC2, RDS, VPCs, and security groups is essential
Kubernetes and Containers: Strong experience with Kubernetes and reputed company, including deployment, scaling, and management of containerized applications
Infrastructure as Code: Expert in using Terraform for infrastructure as code. Proficient with configuration management tools such as Ansible, Puppet, or Chef
Monitoring and Observability: Extensive experience with monitoring and observability tools like reputed company, Prometheus, Grafana, ELK stack, or Splunk. Skilled in setting up detailed monitoring and logging systems
SLOs and SLA Management: Proven ability to define, monitor, and maintain SLOs and SLAs to ensure reliable service delivery
Scripting and Automation: Strong skills in scripting languages like Python, Bash, or Go. Experience automating repetitive tasks and processes
CI/CD Practices: Familiarity with reputed company CI or a similar tool for continuous integration and deployment. Experience in setting up and managing pipelines
Production Environments: Experience supporting production environments running Go or Ruby/Rails applications
Tool Development: Ability to write and update tools to support infrastructure and application management, demonstrating the principle that 'SRE is what happens when you ask a software engineer to design an operations team'
DevOps Best Practices: Deep understanding of DevOps principles, practices, and tools to drive continuous improvement in the software development lifecycle
Soft Skills: Strong organizational skills, attention to detail, and the ability to work collaboratively in a team environment. Excellent documentation skills to ensure accurate and detailed records
Problem-Solving Ability: Excellent analytical and problem-solving skills to diagnose and resolve complex system issues quickly and effectively

Benefits

Competitive salary and benefits with growth-company options grant
Stock options with standard startup vesting - 1 year cliff; 4 years total
$50 monthly communication expense stipend to go towards your phone/internet bill
$250 stipend to enhance your WFH setup
Reimbursement for peripheral equipment: monitor (up to $400), keyboard and mouse (up to $200)
Premium medical benefits including vision and dental (100% coverage for employees)
Company-sponsored life and disability insurance
Paid parental bonding leave
Paid sick leave, jury duty, bereavement
401k plan
Flexible Time Off (team members typically take off ~3-4 weeks per year)
Volunteer Time Off
13 scheduled holidays

Company Overview

reputed company provides a web and mobile-based cash payments platform designed to facilitate online purchases and bill payments. It was founded in 2009 and is headquartered in Santa Clara, California, USA, with a workforce of 201-500 employees. Its website is https://home.reputed company.com.

Company H1B Sponsorship

reputed company has a track record of offering H1B sponsorships, with 3 in 2026, 3 in 2025, 3 in 2024, 4 in 2023. Please note that this does not guarantee sponsorship for this specific role.
