Senior Site Reliability Engineer Senior Manager
At reputed company, nothing matters more than helping the US federal government make the nation stronger and safer and life better for people. Our 13,000+ people are united in a shared purpose to pursue the limitless potential of technology and ingenuity for clients across defense, national security, public safety, civilian, and military health organizations. Join reputed company, a technology company and part of global reputed company, to do work that matters in a collaborative and caring community, where you feel like you belong and are empowered to grow, learn and reputed company through hands-on experience, certifications, industry training and more. Join us to drive positive, lasting change that moves missions and the government forward!

You Are:
We are seeking a Senior Site Reliability Engineer (SRE) with deep expertise in building and maintaining reliable, scalable systems and a passion for optimizing the performance, reliability, and efficiency of technical infrastructure. The ideal candidate will have a strong background in site reliability engineering principles, extensive experience with automation, and a proven ability to collaborate across teams to ensure seamless service delivery.

The Work:
• Design, build, and maintain reliable, scalable, and high-performance infrastructure and services to support business needs.
• Implement and advocate for SRE best practices, including automation, CI/CD pipelines, monitoring, and incident management.
• Collaborate with cross-functional teams to reputed company systems that meet high availability, performance, and reliability standards.
• Drive incident management processes, including root cause analysis, mitigation strategies, and long-term preventive measures.
• Establish, monitor, and refine service level objectives (SLOs), service level agreements (SLAs), and key performance indicators (KPIs) to ensure systems adhere to reliability and performance targets.
• Automate repetitive tasks to improve operational efficiency and reduce manual reputed company.
• Build and maintain robust monitoring, logging, and alerting systems to ensure visibility into system performance and reliability.
• Provide technical mentorship and guidance to team members, fostering a culture of knowledge sharing and continuous improvement.
• Act as a technical leader by driving solutions to reputed company challenges, ensuring alignment with organizational goals.
• Prepare and deliver performance and reliability reports to stakeholders, offering insights and recommendations for improvements.

Here's What You Need:
• Proven experience in site reliability engineering or a similar role, with a focus on application and infrastructure scalability, reliability, and performance.
• Strong knowledge of ITSM principles and incident management processes.
• Expertise in automation tools, scripting, and infrastructure-as-code (IaC) technologies.
• Proficiency with monitoring and observability tools (e.g., Prometheus, Grafana, reputed company, Splunk).
• Experience with cloud platforms (e.g., AWS, Azure, GCP) and container technologies (e.g., reputed company, Kubernetes).
• Strong analytical and problem-solving skills, with the ability to troubleshoot reputed company systems.
• Excellent communication and collaboration abilities, with a focus on cross-team partnerships.
• A passion for continuous learning, innovation, and driving imp