AI Platform Engineer
As a fully remote, fast-growing startup, we move quickly, embrace innovation, and follow an Agile/Scrum methodology to deliver impactful solutions. We're looking for someone who thrives in a collaborative environment, enjoys solving technical challenges, and wants to help build and scale the future of AI with us.

The Role:
As our AI Platform Engineer, you will ensure the reliable, secure, and scalable deployment of AI systems across our platforms. You’ll focus on building and managing production infrastructure, developing robust APIs and microservices, and optimizing for cost-effective operations. This role requires hands-on expertise in AI/ML platform engineering, production deployment of AI systems, infrastructure as code, and cloud-native engineering to deliver high-performance AI solutions. You will play a pivotal role in implementing best practices for deploying, operating, and scaling AI/ML systems in production, defining infrastructure as code for AWS, and driving observability and optimization across our AI production environments.

Key Responsibilities:
API & Microservices Development: Build and manage production-grade APIs and microservices to support scalable AI deployments.
AI Agent Development: Build, tune, and deploy AI agents using Amazon Bedrock AgentCore Runtime and the Strands Agent SDK. Design system prompts, implement tool routing, and optimize for latency and accuracy.
Infrastructure as Code: Define and manage Terraform modules for AWS services (S3, RDS, OpenSearch, etc.).
Observability & Monitoring: Monitor system performance and implement observability practices using tools such as CloudWatch and OpenTelemetry to ensure reliability, proactive alerting, and rapid issue resolution.
CI/CD Pipelines: Build and maintain CodePipeline/CodeBuild pipelines for automated testing, builds, and deployments.
Optimization: Optimize infrastructure and operations for cost, latency, and reliability, ensuring efficient use of resources.
Compliance & Security: Ensure systems meet compliance standards and maintain robust security controls.

Required Qualifications:
Experience: At least 2 years of hands-on AI engineering experience, including building and deploying scalable infrastructure for AI/ML systems.
Technical Expertise: Expert-level Python and AWS (API Gateway, CloudWatch).
AI/ML Experience: Hands-on experience with AI/ML systems, including model integration, inference workflows, or lightweight model development, along with Amazon Bedrock (including AgentCore Runtime), LLM prompt engineering, or agent frameworks. Must be comfortable building and iterating on AI/ML-driven application logic and agent workflows in production.
Infrastructure as Code: Strong experience with Terraform and/or CloudFormation for AWS resource management.
Production Experience: Proven ability to operate large-scale, multi-tenant SaaS architectures in production.
Monitoring & Optimization: Deep understanding of system monitoring, cost optimization, and compliance in cloud environments.
Education/Certifications: Bachelor’s degree in Computer Science, Data Engineering, or a related field. Relevant certifications in AWS or DevOps are a plus.
Collaboration: Proven ability to work effectively in cross-functional teams and communicate technical concepts to non-technical stakeholders.

You're Not a Fit If...
You haven’t previously built and managed production-grade APIs or infrastructure for AI systems.
You are not an expert in Python and/or AWS.
You prefer a slow-moving environment and are not comfortable with the pace of a startup.
You prefer to react to problems rather than plan strategically.
You do not want to be hands-on with the technical details of the role.

Our Core Principles:
We're a team that believes in big ideas and fast execution, adopting a "crawl, walk, run" approach to innovation. We're looking for someone who embodies these principles:
Customer Centricity: We always put our customers first.
Growth Mindset: We are constantly learning and evolving.
Accountability & Integrity: We take ownership and act with honesty.
Collaboration & Humility: We work together and value every voice.
Open Communication: We believe in transparency and direct feedback.

If you are a passionate AI platform engineer with a drive to build and optimize advanced AI infrastructure, we want to hear from you. This is an opportunity to make a real impact at a dynamic and growing company.