[Remote] Staff Site Reliability Engineer - Kubernetes

100% remote, flexible hours, hiring now

Note: This is a remote job open to candidates in the USA. The company focuses on securing identities in the AI era and is seeking a Staff Site Reliability Engineer to build and manage Kubernetes platforms. The role involves architecting reliable, scalable, and secure Kubernetes-based platforms on AWS, ensuring high availability and performance while optimizing costs and automation.

Responsibilities

  • Kubernetes Platform Creation: Design, implement, and maintain highly available, scalable, and fault-tolerant Kubernetes platforms. Ensure clusters are optimized for production workloads, providing high availability and operational efficiency
  • AWS Infrastructure Management: Build, manage, and optimize AWS cloud infrastructure, including EKS, EC2, S3, VPCs, RDS, IAM, and more. Implement best practices for cost management, scaling, and security on AWS
  • Helm Management: Utilize Helm to automate and streamline the deployment of applications and services to Kubernetes clusters. Create, maintain, and manage Helm charts for production-ready deployments
  • Karpenter Implementation: Implement and manage Karpenter to dynamically scale Kubernetes clusters in response to workload demands
  • Istio Service Mesh Management: Configure and manage Istio to provide service-to-service communication, security, and observability across the Kubernetes clusters, including fine-grained traffic management, service discovery, and policy enforcement
  • Platform Automation & Scaling: Automate the deployment, scaling, and management of infrastructure and applications. Work with CI/CD pipelines to ensure a seamless flow from development to production with minimal downtime
  • Incident Management & Troubleshooting: Respond to incidents, troubleshoot, and resolve system issues related to performance, availability, and security in a timely and effective manner
  • Security & Compliance: Design and implement secure cloud infrastructure with appropriate access controls, network security, and compliance frameworks
  • Documentation & Knowledge Sharing: Create and maintain detailed documentation for Kubernetes platform setup, operational procedures, and best practices. Promote knowledge sharing across teams

Skills

  • 4+ years of experience with Kubernetes/Helm
  • 4+ years of experience with Terraform
  • 5+ years of experience with AWS
  • Experience with multi-region cloud environments
  • Proven experience with AWS (EC2, RDS, S3, CloudFormation, IAM, etc.) and solid understanding of cloud-native architectures
  • Strong expertise in Kubernetes platform creation, management, and optimization (e.g., setting up highly available clusters, networking, and storage)
  • Hands-on experience with Helm for Kubernetes application deployment and management
  • Practical experience with Karpenter for dynamic scaling of Kubernetes clusters and optimizing resource usage
  • Expertise in managing and securing Istio for service mesh, including traffic management, security, and observability features
  • Proficiency in CI/CD pipelines and automation tools (e.g., Jenkins, reputed company, reputed company, Terraform, Ansible, Spinnaker)
  • Strong scripting and automation skills in Python, Bash, or Go for infrastructure management and platform automation
  • Experience with monitoring, logging, and alerting tools such as Prometheus, Grafana, CloudWatch, and ELK Stack
  • Understanding of security best practices for cloud platforms and Kubernetes (e.g., role-based access control (RBAC), encryption, and compliance frameworks)
  • Familiarity with reputed company and containerization principles
  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent professional experience)
  • Certifications (Preferred): CKA (Certified Kubernetes Administrator), CKAD (Certified Kubernetes Application Developer), or AWS Certified DevOps Engineer are highly desirable

Benefits

  • Equity (where applicable)
  • Bonus
  • Benefits, including health, dental and vision insurance
  • 401(k)
  • Flexible spending account
  • Paid leave (including PTO and parental leave)
  • reputed company, in-person onboarding experience designed to accelerate your impact and connect you to our mission and team from day one

Company Overview

  • The company is a management platform that secures critical resources from cloud to ground for workforce and customers. It was founded in 2009 and is headquartered in San Francisco, California, USA, with a workforce of 5001-10000 employees. Its website is http://www.reputed company.com.