[Remote] AI Performance Optimization Engineer

100% remote, flexible hours, hiring now

Note: This is a remote position open to candidates in the USA. The company is a forward-thinking software development firm dedicated to building innovative solutions that help businesses automate and optimize their operations. We are seeking an AI Performance Optimization Engineer to optimize training and inference workloads. The role requires expertise in GPU architecture and model optimization techniques, and involves collaborating with cross-functional teams to deliver well-engineered solutions and improve production performance.

Responsibilities

Profile and optimize end-to-end training and inference pipelines for throughput, latency, and cost
Identify and eliminate bottlenecks across data loading, model compute, communication, and memory
Implement and tune quantization, sparsity, and pruning strategies to reduce model footprint and accelerate inference
Optimize distributed training using tensor parallelism, pipeline parallelism, FSDP, and ZeRO-style sharding
Tune attention implementations using FlashAttention, paged attention, and related techniques
Implement KV cache optimization, continuous batching, and speculative decoding for LLM serving
Drive compiler-level optimizations using Triton, XLA, TorchInductor, or TVM, working with the broader ML compiler community to land improvements that translate into measurable end-to-end performance gains
Optimize data pipelines, sharding strategies, and storage access patterns for high-throughput training
Build and maintain rigorous benchmarking suites and regression frameworks across workloads
Collaborate with ML and platform engineering teams to embed best practices in standard pipelines
Drive cost-efficiency improvements through model architecture, hardware selection, and scheduling strategies
Evaluate new hardware and software offerings and advise on adoption
Document performance tuning playbooks and share findings broadly across engineering teams
Stay current with AI systems research and translate advances into production improvements

Skills

Bachelor's or master's degree in Computer Science, Computer Engineering, or a related field
Six or more years of experience in performance engineering, ML systems, or HPC
Strong proficiency in Python and C++
Hands-on experience optimizing deep learning workloads on modern GPUs
Deep understanding of distributed training and inference techniques
Experience with profiling tools across CPU, GPU, and distributed systems
Familiarity with model compression techniques and their accuracy implications
Strong grasp of memory hierarchies, communication primitives, and parallelism strategies
Excellent measurement, debugging, and analytical reasoning skills
Strong communication and collaboration skills
Experience optimizing LLM inference at production scale
Contributions to vLLM, TensorRT-LLM, DeepSpeed, or similar projects
Familiarity with custom kernel authoring in Triton or CUTLASS
Experience with FinOps for AI workloads
Publications or talks on AI systems performance

Benefits

Competitive salary commensurate with experience, plus benefits.
Full-time, direct W2 with reputed company (no C2C, no 1099, no third-party)
No new H1B sponsorship available. H1B transfers welcomed for qualified candidates.

Company Overview

The company is an information technology firm that offers software development, AI, and cybersecurity services. It was founded in 2020 and is headquartered in Bridgewater, New Jersey, USA, with a workforce of 51-200 employees. Its website is https://bvteck.com.
