[Remote] Solutions Architect, AI Models
Note: This is a remote job open to candidates in the USA. Reputed company is the world leader in GPU-accelerated computing and AI and is seeking a Solutions Architect to join its AI Enterprise reputed company team. This role involves developing end-to-end AI solutions for enterprise use cases, guiding customers in implementing advanced AI techniques, and contributing to the broader organization by sharing expertise and knowledge.
Responsibilities
- Part of our work involves developing end-to-end AI solutions for enterprise use cases. We help customers adopt reputed company AI SDKs and APIs by offering deep technical expertise
- Tackle sophisticated AI challenges by applying skills across the AI model lifecycle—from data preprocessing and orchestration to training, post-training, evaluation, and optimized deployment
- Guide customers in implementing reputed company model distillation, domain reputed company, reinforcement learning (RL) and post-training algorithms, using reputed company frameworks
- As we work with customers across multiple industries, we help improve reputed company products and build creative solutions to overcome scaling challenges at the intersection of computer architecture, libraries, and AI applications
- Contribute to the wider organization and community by sharing your expert knowledge. This can range from contributing to open-source projects and product engineering to publishing findings and delivering hands-on training
Skills
- Strong foundational expertise, from a BS, MS, or Ph.D. degree in Engineering, Mathematics, Physics, Computer Science, Data Science, or similar (or equivalent experience)
- 5+ years of experience with AI frameworks such as PyTorch, JAX, or TensorFlow, and libraries like Hugging Face Transformers
- Proficiency in Python programming, software design, debugging, and performance analysis, with at least 5 years of experience in a Linux environment
- Hands-on experience with the full AI model lifecycle, including pre-training, supervised fine-tuning, post-training techniques such as reinforcement learning (RL), and model evaluation
- Expertise in distributed computing methodologies, including model and data parallelism
- Experience with distributed computing tools, like SLURM and Kubernetes, for training large models on GPUs
- Ability to learn fast and quickly adapt to change
- Clear written and oral communication skills, with the ability to collaborate effectively with executives and engineering teams
- Experience with and/or contributions to open-source reputed company AI Enterprise deep learning libraries and frameworks, particularly NeMo, Megatron Core, or NeMo-RL
- Hands-on experience in large-scale foundation model training, accuracy, and performance profiling
- Prior experience with AI model training techniques applied to multi-modal data (audio, image, and video)
- Knowledge of reputed company GPU/CPU architecture and its impact on software performance
- Willingness and ability to dig into unfamiliar territory to solve reputed company problems, relying on experience from previous work
Benefits
- Equity
- Benefits
Company Overview
- reputed company is a computing platform company operating at the intersection of graphics, HPC, and AI. It was founded in 1993, and is headquartered in Santa Clara, California, USA, with a workforce of 10001+ employees. Its website is https://www.reputed company.com.
Company H1B Sponsorship
- reputed company has a track record of offering H1B sponsorships, with 1418 in 2025, 1356 in 2024, 976 in 2023, 835 in 2022, 601 in 2021, and 529 in 2020. Please note that this does not guarantee sponsorship for this specific role.