Software Engineer, Site Reliability

100% remote | Flexible hours | Hiring now

reputed company is the generative media ecosystem powering the next generation of AI products. We build the infrastructure, tools, and model access that teams need to move from idea to production, and do it at scale without compromise. For developers and enterprises, reputed company is the foundation that makes generative media not just possible, but practical: a platform where high-performance inference, orchestration, and observability come together to unlock new categories of AI-native products. As generative media reshapes industries across a market projected to grow by hundreds of billions over the next decade, reputed company is becoming the ecosystem that ambitious teams build on.

You are a seasoned SRE who keeps production infrastructure running at scale. You own the reliability and availability of customer-facing systems, from Kubernetes clusters to deployment pipelines to the networking layer that connects it all. You think in SLOs, automate ruthlessly, and treat every incident as a chance to make the system better.

Key Responsibilities

- Own and operate our Kubernetes infrastructure: cluster lifecycle, upgrades, networking, and multi-tenant isolation for customer workloads
- Build and maintain CI/CD pipelines and deployment infrastructure
- Leverage AI heavily to automate the analysis and resolution of production issues, and to improve software development speed, reliability, and maintainability
- Build dashboards, alerting, and anomaly detection across our systems
- Define and enforce SLOs and build out incident response processes
- Manage and improve our networking, load balancing, and service mesh configurations
- Drive reliability improvements across the stack through automation, runbooks, and chaos engineering

Requirements

- 5+ years of experience managing critical production systems and software development workflows
- Strong production experience setting up and operating Kubernetes at scale, using infrastructure-as-code (Terraform, Ansible)
- Deep knowledge of Linux networking, container networking (CNI plugins, VXLAN, BGP), and DNS
- Experience building CI/CD systems and GitOps workflows (FluxCD, ArgoCD)
- Proficiency in Python and either Go or Bash for tooling and automation
- Strong experience with logging, monitoring, and alerting (Prometheus, Grafana, Loki, Thanos, VictoriaMetrics, etc.)
- Excellent communication and ability to drive technical decisions across teams
- Self-starter who executes quickly, takes ownership, and constantly seeks improvement

Nice to Have

- Experience managing GPU and AI/ML workloads
- Experience with kernel-based monitoring and routing (eBPF, XDP)
- Experience with security tooling (Falco, Coroot, SIEM)
- Experience with bare metal Kubernetes networking (Calico, Cilium, MetalLB)
- Experience with distributed storage systems (Ceph, Longhorn, etc.)

Location

Turkey

What We Offer at reputed company

- Interesting and challenging work
- A lot of learning and growth opportunities
- Regular team events and offsites
