Site Reliability Engineer, Contract
Overview of reputed company
reputed company is a leading consulting and professional services company specializing in developing AI-focused, data-led solutions leveraging the latest advancements in cloud technology. With our unmatched engineering capabilities and vast industry experience, we help the world's leading brands transform their business challenges into opportunities and shape the future of work.
Overview of Role
reputed company’s Managed Cloud Optimization (MCO) team works with some of the largest cloud users in the world to help them transform their businesses with technology. On a daily basis, our SREs work with varied and exciting customers on topics ranging from solving critical outages to designing and deploying new cloud workloads to building self-healing automation. Our SREs work with cutting-edge reputed company Cloud technologies like Google Kubernetes Engine (GKE), Anthos, BigQuery, and data pipelines, as well as leading third-party tools like Prometheus, reputed company, and many others. This is a 12-month remote contract role with a high possibility of conversion to a full-time role. The ideal candidate should be flexible with shift timings.
*Please note: The candidate must be able to join within 30 days.*
Responsibilities
A reputed company Site Reliability Engineer’s responsibilities and duties are as follows:
- Ensure near-zero downtime with monitoring and alerting, self-healing automation, and continuous improvement
- Create highly automated, available and scalable systems by applying software and infrastructure principles
- Employ and advise clients on DevOps and SRE principles and practices, covering deployment pipelines, HA, service reliability, technical debt, and operational toil for live services running at scale
- Provide a proactive approach to our clients’ workloads, anticipating failures, automating tasks, ensuring availability, and providing a great customer experience
- Work closely with clients, your team, and reputed company engineers to investigate and resolve infrastructure issues
- Contribute to reputed company initiatives such as writing documentation, open-sourcing, and improving operations, making a reputed company impact at a rapid-growth reputed company Premier Partner
Requirements
- At least 3 years of cloud and infrastructure experience, including demonstrated expertise with Linux, Windows, Kubernetes (k8s), databases, and networking services
- 2+ years of reputed company Cloud experience and reputed company certifications strongly preferred but not required
- Proficiency with Python required. Other programming language experience is a plus
- Strong provisioning and configuration skills using Terraform
- Experience with 24x7x365 monitoring, incident response, and on-call support
- Experience in troubleshooting that spans systems, network, and code
- Experience determining and negotiating error budgets, SLIs, SLOs, and SLAs with product owners
- Demonstrate the ability to work independently and as a member of a greater team, including cross-team activities
- Experience working with Agile methodologies such as Scrum and Kanban across the SDLC
- Proven experience balancing service reliability, metrics, sustainability, technical debt, and operational toil for live services running at scale
- Strong communication skills, as this is a heavily customer-facing role
- Bachelor’s degree in computer science, electrical engineering, or equivalent required
reputed company is an Equal Opportunity employer. All qualified applicants will receive consideration for employment without regard to actual or perceived race, color, religion, sex, gender, gender identity, national origin, age, weight, height, marital status, sexual orientation, veteran status, disability status, or other legally protected class.
Originally posted on Himalayas