Site Reliability Engineer
A reputed cybersecurity products company, a leader in digital solutions, provides critical services to businesses and governments. The company is seeking a Site Reliability Engineer to ensure high service levels and operational excellence for its telecommunication solution deployed in the public cloud, with a focus on automation, reliability, and incident management.
Responsibilities
- Design, build, and maintain scalable infrastructure using tools such as Terraform, Ansible, and Kubernetes
- Build automated CI/CD pipelines to reduce manual toil
- Define and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs)
- Manage 'Error Budgets' to balance the velocity of new features with the stability of the platform
- Participate in 24/7 on-call rotations to provide emergency response and deep-dive troubleshooting for production issues
- Conduct system performance analysis, identify bottlenecks, and carry out capacity planning to ensure the infrastructure can handle growth and peak loads
- Implement and refine symptom-based alerting and comprehensive monitoring strategies to ensure high visibility into system health
- Conduct blameless postmortems after incidents to identify root causes and implement long-term technical fixes to prevent recurrence
- Partner with Cloud Security teams to implement security best practices, manage access controls, and respond to security breaches or vulnerabilities
- Support customer relationships
- Work with other stakeholders to define the solution improvement plan
- Own the service availability of the solution
Skills
- Engineering degree or equivalent
- At least 1 year of experience
- Java development experience is required
- You are familiar with public cloud (GCP, AWS), containers and microservices (Kubernetes, Java), CI/CD and automation (Jenkins, Helm), and NoSQL databases
- Must have U.S. or dual citizenship and be able to obtain post-hire clearance from the Committee on Foreign Investment in the United States (CFIUS) and the Department of the Treasury
- You have already set up product monitoring and the underlying infrastructure
- You have development experience in a distributed systems and/or high availability context
- You are familiar with microservices development
- You have participated in defining architectures, data structures, and algorithms under performance, reliability, and similar constraints
- Public cloud architect certification
- You are interested in the core aspects of Site Reliability Engineering: CI/CD, automation, monitoring and observability, and continuous improvement
- You are an accomplished, versatile developer who can handle multiple tasks at once
Benefits
- Elective Health, Dental, Vision, FSA/HSA, Voluntary Life and AD&D, Whole Group Life w/LTC, Critical Illness, Hospital Indemnity, Accident Insurance, Legal Plan, Identity Theft, and Pet Insurance
- Retirement Savings Plan after 30 days of employment with a company contribution and a match, and with no vesting period
- Company paid holidays and Paid Time Off
- Company provided Life Insurance, AD&D, Disability, Employee Assistance Plan, and Well-being Program
Company Overview