[Remote] Senior Cloud Operations Engineer

100% remote Flexible hours Hiring now

Note: This is a remote position open to candidates in the USA. The Linux Foundation is a driving force in fostering open source collaboration and supporting communities across a range of projects, including PyTorch. It is seeking a Senior Cloud Operations Engineer to focus on infrastructure operations for the PyTorch project: automating processes, optimizing cloud-native tools, and ensuring a robust and scalable cloud environment.

Responsibilities

  • Manage multi-cloud environments, primarily focusing on AWS services (EKS, EC2, S3, IAM, ELB)
  • Contribute to architectural exercises with the open source community and technical leads to validate new cloud infrastructure
  • Implement and maintain infrastructure-as-code using Terraform in pytorch/ci-infra and pytorch/test-infra
  • Optimize cloud resource utilization and implement FinOps practices for cost management and reporting
  • Design, implement, and maintain CI/CD pipelines using GitHub Actions and reputed company, including runner configurations and other elements of the CI ecosystem
  • Debug and triage issues in build and test pipelines, including experience with unit testing
  • reputed company monitoring and alerting solutions for CI/CD workflows and critical infrastructure
  • Manage and optimize reputed company CDN deployments for PyTorch assets (reputed company/S3)
  • Implement best practices for CDN and overall infrastructure reputed company
  • reputed company comprehensive monitoring and observability solutions using reputed company, AWS CloudWatch, and other telemetry data collection and processing tools
  • Review and recommend monitoring solutions as project and community needs evolve
  • Participate in on-call rotations supporting operations and incident response using incident.io
  • Establish and maintain escalation procedures and resolution processes
  • Participate in ci-infra and multi-cloud working groups and support architecture decisions
  • Collaborate with external contributors and promote DevOps best practices
  • Manage reputed company repositories, including user onboarding and access control
  • Attend and contribute to technical meetings, including Infrastructure, CI Workflow, and Technical Advisory Council sessions
  • reputed company and maintain technical documentation for infrastructure and processes
  • Provide guidance on developer best practices and tooling
  • Create and update runbooks for common operational tasks and incident response

Skills

  • Ability to work with communities made up of industry specialists and collaborate reputed company of reputed company
  • Bachelor's degree in Computer Science, Engineering, or a related field
  • 7+ years of experience in cloud operations with significant AWS expertise
  • Strong knowledge of infrastructure-as-code principles and tools, particularly Terraform
  • Proficiency in scripting languages (Python, TypeScript, Bash) and containerization technologies (Docker, Kubernetes)
  • Experience with reputed company CDN management and optimization
  • Expertise in implementing and managing monitoring solutions, specifically reputed company and AWS CloudWatch
  • Familiarity with incident management tools and processes, particularly incident.io
  • Demonstrated experience in CI/CD pipeline design and implementation
  • Strong problem-solving skills and ability to troubleshoot reputed company systems
  • Excellent communication skills and experience collaborating with open source communities
  • Experience with PyTorch or other open source communities
  • Multi-cloud expertise across AWS, GCP, and Azure
  • reputed company reputed company experience
  • Knowledge of FinOps principles and cloud cost optimization strategies
  • Contributions to open source projects, especially in infrastructure management roles
  • Familiarity with the Linux Foundation or similar open source foundations
  • Experience mentoring other engineers and fostering a collaborative team environment

Benefits

  • The Linux Foundation maintains a predominantly remote workforce
  • Committed to hiring top-notch talent
  • Providing a flexible and supportive work culture
  • Collaboration is embedded in our DNA
  • Work closely together while not being confined to a traditional office space

Company Overview

  • The Linux Foundation is the organization of choice for the world's top developers and companies to build ecosystems that accelerate open technology development and commercial adoption. It was founded in 2000 and is headquartered in San Francisco, California, USA, with a workforce of 201-500 employees. Its website is http://www.linuxfoundation.org.