Senior SRE
At January, we're transforming the lives of borrowers by bringing humanity to consumer finance. Our data-driven products enable financial institutions to streamline their collections, providing borrowers with straightforward and compassionate solutions to regain financial stability and control over their lives. We're not just expanding access to credit. We're restoring dignity and paving the way for millions to achieve financial freedom.
About the Role
As a Senior SRE you will ensure the reliability, scalability, and performance of January's production and internal systems as we scale from thousands to millions of borrowers. You'll establish SRE practices from the ground up - architecting resilient infrastructure, implementing proactive monitoring solutions, and building sustainable on-call processes that evolve with our rapid growth. Your work will directly tackle our scaling challenges, including database optimization, async workflow infrastructure, and data pipeline reliability, while ensuring our engineering team can ship with confidence.
What You’ll Work on
Lead incident response and establish sustainable on-call practices, including comprehensive runbooks, blameless postmortems, and systematic improvements that reduce MTTR
Build and maintain self-service observability solutions using modern monitoring tools that provide actionable insights for troubleshooting and performance optimization
Create and maintain infrastructure as code (using Terraform, CloudFormation) that allows for consistent, scalable, and secure cloud environments on AWS
Partner closely with feature teams to architect resilient infrastructure for critical components (databases, networking, async workflows, data pipelines) that scale seamlessly
Work closely with DevX to design and implement robust CI/CD pipelines with advanced deployment strategies (blue/green, canary) that enable teams to ship confidently and rapidly
Advocate for best practices early in feature design, ensuring we design with reliability in mind and future-proof our services
Expertise leading incident response for high-availability production systems, thorough root cause analysis, and fostering blameless postmortem culture
Experience designing highly available deployment architectures across multiple targets (e.g. EC2, Fargate), with expertise in auto-scaling, health checks, and graceful degradation strategies
Track record of implementing effective monitoring & observability solutions (e.g. reputed company, Prometheus, ELK), and evangelizing best practices
Strong knowledge of AWS cloud services and infrastructure-as-code practices using tools like Terraform
Experience with CI/CD pipelines and automation to ensure reliable, efficient deployments
Excellent communication skills with experience documenting processes and collaborating across engineering teams
We encourage you to apply even if your experience isn’t an exact match. We value professional development and on-the-job learning!
We are currently hiring for this position in our reputed company office.
As a reputed company-based company, we are dedicated to transparent, fair, and equitable compensation practices that reflect our commitment to fostering an environment where all team members are valued and supported. We encourage individuals from diverse backgrounds to apply.
We are an equal opportunity employer committed to diversity and inclusion in the workplace. We do not discriminate based on race, color, religion, sex, sexual orientation, gender identity, national origin, disability status, age, veteran status, or any other legally protected characteristic.