[Remote] DevOps Engineer
Note: This is a remote position open to candidates in the USA. A reputed company is seeking a DevOps Engineer to own and improve their cloud infrastructure, security, observability, and operational reliability. The role involves managing Google Cloud Platform environments, enhancing deployment processes, and ensuring platform security and performance as the company grows.
Responsibilities
- Manage and maintain our Google Cloud Platform (GCP) environment
- Design, implement, and improve infrastructure for scalability, reliability, and cost efficiency
- Manage networking, compute resources, databases, storage, and cloud services
- Monitor system health and proactively address performance bottlenecks
- Build and maintain centralized logging and monitoring solutions
- Create dashboards and alerts for system health, application performance, and business-critical workflows
- Establish operational metrics and usage tracking across the platform
- Lead incident response and root cause analysis efforts
- Monitor and manage cloud spend
- Implement and maintain security best practices across infrastructure and applications
- Manage identity and access controls, secrets management, and environment security
- Conduct security reviews and vulnerability remediation
- Assist with compliance initiatives and audit readiness
- Improve deployment pipelines and release processes
- Automate infrastructure provisioning and operational workflows
- Enhance development environments and deployment reliability
- Reduce manual operational tasks through automation
- Improve uptime, resiliency, backup strategies, and disaster recovery processes
- Establish service-level objectives and operational standards
- Drive improvements in platform stability and performance
- Partner with engineering, product, and leadership teams to support company initiatives
- Provide technical guidance on infrastructure and operational considerations
- Participate in an on-call and operational support rotation
- Troubleshoot and fix application-level issues as needed
- Contribute code improvements and bug fixes across the platform
- Assist with performance optimization and debugging efforts
Skills
- 5+ years of DevOps, Site Reliability Engineering, Cloud Engineering, or related experience
- Strong hands-on experience with Google Cloud Platform (GCP)
- Experience building and maintaining CI/CD pipelines
- Strong understanding of infrastructure monitoring, logging, and alerting systems
- Experience with cloud security best practices
- Experience managing production environments and incident response
- Strong Linux administration skills
- Experience with Infrastructure as Code tools (Terraform preferred)
- Experience with containerization technologies such as Docker and Kubernetes
- Strong troubleshooting and problem-solving abilities
- Excellent written and verbal communication skills
- Ability to work independently in a fully remote environment
- Experience working in startup or high-growth environments
- Experience with healthcare technology or regulated environments
- Ability to read and contribute to application code
- Experience with Python, TypeScript, Node.js, or similar technologies
- Experience building internal tooling and automation
- Experience with data pipelines and analytics infrastructure
Company Overview