Senior Director of Platform Engineering

100% remote | Flexible hours | Hiring now

DeepSee delivers an open and flexible agentic platform to accelerate AI adoption for financial services in front, middle, and back-office operations. Our cloud-based platform integrates with existing bank architectures, whether clients are just starting their AI transformation journey or looking to enhance existing in-house capabilities with agentic AI solutions. With DeepSee's pre-trained and pre-configured agents, banking and capital markets firms can automate and orchestrate manual, repetitive tasks, freeing domain experts for strategic work, reducing risk, and streamlining operations.

We are looking for a Senior Director of Platform Engineering to lead our backend, frontend, infrastructure, and MLOps/DevOps/CI/CD teams. You'll scale our Kubernetes platform across AKS, EKS, and on-prem, ensure high availability and performance, and evolve our agentic AI and MCP-based integrations for bank-grade reliability. You'll partner closely with the Chief Architect and our Product team to deliver a secure, observable, auditable platform for regulated clients.

Job Responsibilities:

  • Own and drive the platform roadmap and strategy for multi-cloud/on-prem Kubernetes (AKS, EKS, vanilla K8s), compute, data, networking, ML serving, and high availability/performance.
  • Lead, build, and grow multiple teams (Backend, Frontend, Infrastructure, MLOps/DevOps), including leadership, career development, and operational rhythms.
  • Scale Kubernetes reliably: capacity planning, autoscaling (HPA/VPA/Cluster Autoscaler/KEDA), and cost controls for mixed CPU/GPU workloads.
  • Advance and mature GitOps, IaC, and observability practices (Argo CD, Terraform, Helm, OpenTelemetry, Grafana, Prometheus), including rollout strategies, standardization, monitoring, incident response, and post-mortems.
  • Advance MLOps for LLMs/SLMs/ML/DL (KServe, MLflow pipelines, model governance, inference patterns, GPU scheduling, canary rollouts).
  • Evolve and operate eventing and stateful architecture at scale (Kafka/ZooKeeper/KRaft, S3/Blob, protobuf, schema evolution/versioning, resilient data planes).
  • Contribute directly through coding, code reviews, and debugging distributed systems.
  • Partner closely with Chief Architect, Principal AI, Product, and other leads to deliver secure, observable, auditable, regulated banking solutions, supporting agentic AI and workflow automation.

Must Haves:

  • Significant leadership experience: 10 years on distributed platforms and 5 years leading multi-disciplinary platform teams.
  • Deep, hands-on Kubernetes expertise (networking, security, tenancy, upgrades; AKS/EKS operations).
  • Proven hands-on expertise with GitOps, IaC, change management, rollout safety, and production observability (Argo CD, Terraform, Helm, OpenTelemetry, Grafana/Prometheus, SLOs/on-call).
  • Advanced MLOps experience (KServe, MLflow, model registry/governance, GPU scheduling, cost tuning, canary rollouts, safe rollouts).
  • Experience designing and operating event streaming, stateful data, and resilient architecture at scale (Kafka/ZooKeeper/KRaft, S3/Blob, protobuf, schema/versioning).
  • Deep proficiency in core languages (Java, Python, Go) and cloud SDKs, plus the ability to communicate architecture clearly to executives and clients.
  • Regulated FinServ experience (SOC 2/ISO 27001, SR 11-7, SEC/FINRA, model governance, OpenTelemetry, trace-driven perf, KServe ModelMesh or similar tools).

Nice to Haves:

  • Hands-on skills with most listed technologies: Kubernetes (vanilla, AKS, EKS), Argo CD, Helm, Terraform, Kafka/ZooKeeper/KRaft, KServe, MLflow, OpenTelemetry, Grafana, Prometheus, protobuf, HPA, VPA, Karpenter or Cluster Autoscaler, LightRAG, Milvus, S3/Blob, Airflow/dbt, Java, Python, Go.
  • Experience working alongside a variety of engineering leaders and principal engineers (Chief Architect, CISO, Principal Knowledge Graph Engineer, Principal BE, Principal FE, Product).
  • Platform-as-a-product advocacy and developer experience focus, CNCF platform engineering guidance.

Finally, it is important that you align with our Stuff That Matters:

  • Knowledge Over Noise: We prioritize actionable insights.
  • One Team, One Dream: We collaborate seamlessly across functions.
  • Be a Seeker: We constantly pursue innovation and learning.
  • Stay Human: We keep our solutions people-centric.
  • Act Boldly: We take calculated risks.
  • We're passionate about our mission.
  • Own It: We take responsibility for our work and its impact.

Why DeepSee.ai?

  • Competitive compensation package including equity, with remote work options
  • 100% company-paid premiums on health, dental, and vision insurance
  • Opportunity to work on cutting-edge AI technology with real impact
  • Collaborative and innovative work environment

Join us in shaping the future of AI-powered automation and making a significant impact in a rapidly growing startup. If you're a hands-on problem solver who thrives in fast-paced environments and is excited about leveraging AI to solve problems, we want to hear from you!
