Machine Learning Engineer (Training Optimization)

100% remote Flexible hours Hiring now
Company Description:

About the Group/Team

We're the CORE team within the Generative AI supergroup. Our mission is to invent foundational technologies that will power the future of AI-assisted design. From large-scale models to groundbreaking research, we build the technical core of the company's creative intelligence. We collaborate globally to ship research that makes a real impact at massive scale, from smart editing to AI video tools.

Job Description:

About the Role/Specialty

As a Machine Learning Engineer, you'll scale and optimize the training system for our large-scale multimodal and foundation models. You'll design distributed training systems using Megatron-LM, NVIDIA NeMo, FSDP, and Triton, pushing the limits of performance across compute, memory, and communication layers. You'll sit at the intersection of systems and AI research, directly shaping how we train the models that will power the company's products.

What you’ll do (responsibilities)

  • You’ll design, implement, and optimize large-scale machine learning systems for training.
  • You’ll improve training performance, including GPU utilization, communication overhead, and memory efficiency.
  • You’ll partner with research and modeling teams to align systems with algorithmic needs.
  • You’ll evaluate and apply best practices for distributed training using industry-leading frameworks.
  • You’ll dive deep into low-level optimization, including custom CUDA or Triton kernels.
  • You’ll debug, profile, and fine-tune training workflows to unlock new levels of scalability.
Qualifications:

What we’re looking for

We’re looking for a systems-first engineer who thrives in fast-paced, high-impact environments. You’re deeply familiar with distributed model training at scale and understand the nuances of optimizing compute at every level of the stack. You’re excited by challenges that push boundaries, and you’re a strong collaborator who communicates clearly across domains.

  • Strong background in LLMs, multimodal AI, or diffusion models.
  • Proficiency in Python. Familiarity with a system programming language (e.g. C++ or Rust) is a plus.
  • Deep knowledge of PyTorch or JAX as well as libraries such as Megatron-LM, NeMo, or DeepSpeed.
  • Familiarity with common optimization techniques such as FSDP, gradient checkpointing, or low-precision data types.
  • Hands-on experience writing custom GPU kernels in CUDA or Triton.
  • Excellent communication and problem-solving skills, including full proficiency in English.
Additional Information:

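Large Model Training Optimization Engineer (multimodal/image generation). Technical requirements: operator (kernel) optimization, distributed training, GPU clusters, and training frameworks. This position is open to candidates at all experience levels, including experienced hires and 2026 and 2027 graduates. Internship positions are also available.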
