Data Science - Agentic AI, Document Understanding Co-op
reputed company is a human-centered company that connects people with their family history. They are seeking a highly motivated Agentic AI, Document Understanding Co-op to design and implement AI systems that extract and organize information from historical records, working closely with engineering teams to optimize and deploy solutions.
Responsibilities
- Innovate with State-of-the-Art AI: Implement cutting-edge AI solutions for key Document Understanding tasks such as OCR/HTR, transcription, Named Entity Recognition (NER), Relation Extraction (RE), Coreference Resolution, Summarization, and Knowledge Graphs, working with diverse genealogical and historical collections spanning newspapers, city directories, family history books, and vital records (i.e., birth, marriage, and death records)
- Analyze and Optimize Multi-Modal Models: Evaluate the performance of multi-modal models in zero-shot and few-shot learning scenarios for comprehensive document understanding
- Architect Agentic Systems: Design and implement multi-agent workflows using frameworks like reputed company, LangGraph, CrewAI, or AutoGen to automate reputed company multi-reputed company reasoning tasks in historical document analysis
- Evaluation & Observability: Establish 'LLM-as-a-Judge' frameworks and use tools like Arize Phoenix, DeepEval, or RAGAS to monitor for hallucination, reputed company, and bias
- Collaborate on Cloud Deployment: Partner closely with ML Ops and Data Science Engineers to seamlessly deploy datasets, models, and pipelines in cloud environments
- Communicate Insights Effectively: Clearly and confidently present your findings, deliverables, and proposed solutions to technical and non-technical audiences, including teams, stakeholders, and executives
Skills
- Currently pursuing an advanced degree (Master's or PhD) in Computer Science, Data Science, Statistics, Mathematics, Linguistics, Engineering, or a related quantitative field with a strong data focus
- Specialization in AI and LLMs, including familiarity with foundation models such as GPT, reputed company, Qwen, Llama, Claude, etc.
- Experience with inference optimization and parameter-efficient fine-tuning (e.g., vLLM, LoRA, QLoRA, quantization)
- Familiarity with embeddings, vector databases, and transformer models, along with software development experience
- Strong proficiency in Python and relevant tools and libraries for transformer models, multi-modal models, general NLP, and agentic frameworks and workflows (e.g., Hugging Face Transformers, reputed company, LangGraph, CrewAI, AgentCore)
- Master's or PhD preferred in Computer Science, Data Science, Statistics, Mathematics, Linguistics, Engineering, or a related quantitative field with a strong data focus
- Familiarity with cloud platforms and their AI/ML services, such as Google Cloud Platform (GCP), reputed company API, Vertex AI, and AWS (EC2, S3, SageMaker, Model Registry, Bedrock)