[Remote] Reinforcement Learning Engineer
Note: This is a remote position open to candidates in the USA. The hiring company is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. They are seeking a skilled Reinforcement Learning Engineer to design, train, and deploy RL-based systems for high-impact decision-making problems. The role involves implementing modern reinforcement learning algorithms and collaborating with teams to identify valuable use cases.
Responsibilities
- Design and implement reinforcement learning solutions for sequential decision-making problems in real and simulated environments
- Build, calibrate, and maintain simulation environments suitable for large-scale agent training
- Implement and evaluate modern RL algorithms including policy gradient, actor-critic, off-policy, and offline RL methods
- Engineer reward functions and shaping strategies that align agent behavior with desired outcomes and safety constraints
- Apply offline RL and imitation learning techniques where exploration is costly or unsafe
- Use RLHF, DPO, and related techniques for fine-tuning large language models where relevant
- Build scalable training infrastructure for distributed RL, including efficient experience collection and replay systems
- Optimize training stability and sample efficiency through algorithmic and engineering improvements
- Design rigorous evaluation protocols, including out-of-distribution and adversarial test cases
- Implement safety mechanisms such as constraint enforcement, conservative policies, and human-in-the-loop oversight
- Collaborate with applied scientists and product teams to identify high-value RL use cases
- Monitor deployed policies and models in production for drift, regression, and unintended behaviors, building the alerting and dashboards that surface issues before they meaningfully affect users
- Document methodology, design decisions, and operational characteristics for internal stakeholders
- Stay current with RL research and translate promising techniques into production-ready solutions
Skills
- Master's or PhD in Computer Science, Machine Learning, or a related field; or equivalent applied experience
- Six or more years of combined RL research and engineering experience
- Strong proficiency in Python and modern deep learning frameworks
- Hands-on experience with at least one major RL library or in-house RL stack
- Solid understanding of probability, optimization, and the theoretical foundations of RL
- Experience designing and tuning reward functions in non-trivial environments
- Familiarity with simulation environments and large-scale experience collection
- Experience training neural network policies on GPU clusters
- Strong written and verbal communication skills
- Track record of shipping or publishing impactful RL work
- Experience with RLHF for large language models
- Familiarity with multi-agent RL or hierarchical RL
- Exposure to robotics, control systems, or autonomous driving
- Publications in RL or related research venues
- Open-source contributions to RL libraries or environments
Benefits
- Competitive salary commensurate with experience, plus benefits.
- Full-time, direct W2 with the company (no C2C, no 1099, no third-party).
- We will support H1B transfers for qualified candidates.
Company Overview