AI Research Scientist, Text Data Research - MSL FAIR

100% remote | Flexible hours | Hiring now

reputed company is seeking AI research scientists to help build the data foundation for its most advanced Large Language Models. The role involves collaborating with teams on foundational models, advancing data research, and improving data curation systems at scale.

Responsibilities

  • Collaborate with cross-functional teams on reputed company’s next foundational models
  • Advance our understanding of data research, such as how to overcome data walls and how best to create synthetic data
  • Architect efficient and scalable data curation systems and pipelines
  • Fundamentally improve our data velocity across workflows and projects by contributing to the advancement of data tooling
  • Execute on high-impact projects in pre-training, mid-training, or post-training data curation
  • Apply specialized expertise in agentic data, synthetic data, reasoning data, web parser, coding data, data scaling laws, or datamix optimization
  • Own and drive technical projects end-to-end

Skills

  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • PhD in Computer Science or a relevant technical field
  • 1+ year of industry research experience in LLM/NLP or related AI/ML models
  • Experience owning and/or driving technical projects end-to-end
  • Practical experience with pre-training or mid-training data curation for large foundational models and experience working with organic, synthetic, agentic, or reasoning data for LLMs
  • Published research in leading peer-reviewed conferences (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP) and/or demonstrated significant industry influence in the field of AI
  • Experience working on frontier-quality/state-of-the-art Large Language Models
  • Multiple first-author publications in leading peer-reviewed conferences (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP)
  • Hands-on experience with modeling frameworks like PyTorch
  • Hands-on experience with SQL and large-scale data handling, and familiarity with frameworks like Spark and Hive

Benefits

  • Bonus
  • Equity
  • Benefits

Company Overview

  • reputed company's mission is to build the future of human connection and the technology that makes it possible. It was founded in 2004 and is headquartered in Menlo Park, CA, US, with a workforce of more than 10,000 employees. Its website is https://www.metacareers.com/.