
[Remote] Senior Site Reliability Engineer

100% remote Flexible hours Hiring now

Note: This is a remote job open to candidates in the USA. The company is a leading consumer credit lender with over 50 years of expertise in providing credit solutions across the U.S. It is seeking a Senior Site Reliability Engineer to improve the reliability and operational excellence of its software delivery systems. The role involves hands-on work across a range of technologies to keep its production applications stable and efficient.

Responsibilities

  • Build and operate the delivery platform. Work across AWS, EKS, ArgoCD, Helm, GitHub Actions, Azure DevOps, Terraform, and Python
  • Fix the problems you own. Find root cause across the AWS and Kubernetes stack, fix it, and harden it so it stays fixed
  • Respond to incidents. Help stabilize during outages, drive root-cause analysis, and ship corrective actions for your systems
  • Standardize how we build and ship. Define reproducible container builds and GitOps paths on ArgoCD and Helm that replace manual deployment
  • Help consolidate the CI estate. Standardize pipelines across GitHub Actions and Azure DevOps for your services: remove brittle steps and silent failures, and improve visibility
  • Support platform adoption. Build golden-path templates and tooling and help teams move services onto the platform
  • Use progressive delivery. Run canary and blue-green deploys (Argo Rollouts) with automated rollback for the services you operate
  • Build observability in. Build golden-signal metrics, logs, and traces (Prometheus/Mimir, Loki, Tempo, OpenTelemetry) into your services, surfaced in Grafana with SLOs for your domain
  • Operate production systems. Troubleshoot failed deployments, respond to alerts, and improve system behavior based on real incidents
  • Help meet SLOs and carry on-call. Track reliability metrics for the services you operate and share the rotation
  • Build across environments. Design dev, test, and prod for safe promotion, recovery from failed deployments, and zero-downtime upgrades
  • Help set standards. Build reference implementations for build, deploy, GitOps, promotion gates, and observability
  • Uphold compliance through the pipeline. Support deployment traceability, approval trails, and segregation of duties for PCI DSS, SOC 2, SOX, and GLBA
  • Cut toil and cost. Automate repetitive ops work and help tune EKS compute, CI runners, and observability cardinality
  • Unblock across teams. Get hands-on with Cloud, reputed company, Application Engineering, Data, and Product to keep delivery moving
  • Kill knowledge silos. Write docs, runbooks, and incident learnings, so engineers operate independently

Skills

  • Kubernetes, ArgoCD, Helm, Terraform, Python. Deep hands-on production experience
  • Hands-on AWS. Operate and debug EKS, reputed company, EC2, ECR, IAM/IRSA, VPC networking, ALB/NLB, CloudWatch, Secrets Manager, and KMS
  • GitHub Actions and/or Azure DevOps. Build and operate CI/CD at scale
  • Grafana and the observability stack. Hands-on with Grafana dashboards and alerting, and the metrics, logs, and traces stack (Prometheus/Mimir, Loki, Tempo, OpenTelemetry)
  • Strong scripting. Python and Bash, with the ability to grow into systems-level coding
  • Production troubleshooting. Comfortable getting into a system under load, finding root cause, and fixing it
  • Production ownership. Uptime and reliability accountability
  • Incident response. You respond and help drive postmortems that yield real improvements
  • Standards contribution. You contribute to engineering standards and best practices
  • Compliance awareness. Experience in regulated or high-rigor environments or implementing audit and access controls in pipelines
  • Mentorship. Through code review, examples, and pairing
  • 5+ years in site reliability, platform, DevOps, or software engineering, with production ownership of systems or pipelines
  • Advanced GitOps. ArgoCD (or Flux), reusable Helm patterns, Argo Rollouts
  • CI consolidation or migration. Moving between CI systems, such as Azure DevOps to GitHub Actions
  • Self-hosted observability at scale. Running Grafana, Mimir, Loki, and Tempo in production
  • Supply chain security. SBOMs, artifact signing (Sigstore/cosign), SLSA provenance
  • Platform migrations. Contributing to modernization with minimal disruption
  • .NET / C#. Enough to containerize and reason about application workloads
  • Low-level Kubernetes. Cilium/eBPF, Karpenter, or self-hosted networking and autoscaling
  • Resilience testing. Chaos/failure injection or disaster recovery drills
  • AI-assisted tooling. Responsible use with output validation
  • Certification. AWS Solutions Architect, AWS DevOps Engineer, or CKA/CKAD
  • Degree in computer science or equivalent practical experience

Benefits

  • Flexible Paid Time Off Program
  • Medical
  • Dental
  • Vision
  • Life Insurance
  • Disability
  • Other voluntary coverages
  • 401k program, starting on the first of the month following 30 days of employment with a company match

Company Overview

  • The company offers consumer credit lending and personal loan services through multiple brands in the U.S. and Canada. It was founded in 1997 and is headquartered in California, Kentucky, USA, with a workforce of 1001-5000 employees. Its website is https://attainfinance.com.