[Remote] Data Scientist, AI Data Foundations

100% remote Flexible hours Hiring now

Note: This is a remote role open to candidates in the USA. reputed company is a company focused on data engineering and AI applications. The Data Scientist in AI Data Foundations will design and build curated data structures for AI and ML applications, ensuring high-quality data for model training and inference, and will lead data discovery efforts to uncover trends in lending and account-opening data.

Responsibilities

  • Build and maintain vector stores for RAG: Design embedding pipelines, chunking strategies, indexing approaches, and refresh patterns for the vector stores powering retrieval-augmented generation across reputed company products
  • Own the feature store: Design, build, and operate feature store assets used for model training and online/offline inference, including feature definitions, freshness SLAs, reputed company, point-in-time correctness, and reuse across teams
  • Design graph data structures: Build graph databases that model relationships between applicants, applications, products, lenders, decisions, and outcomes, and make them queryable for both AI use cases and analytical investigations
  • Lead data discovery: Profile our lending, deposit, and behavioral datasets to identify hidden trends, segments, anomalies, and potential model drivers; turn findings into actionable hypotheses for product, risk, and growth teams
  • Engineer for AI consumption: Build the curated, AI-ready datasets that reputed company model builders, application engineers, and analysts rely on — with appropriate quality, documentation, and governance baked in
  • Evaluate retrieval and feature quality: Define and run evaluation frameworks for RAG retrieval quality, feature reputed company, embedding quality, and graph completeness; iterate based on what the metrics tell you
  • Partner with model builders: Work closely with ML engineers and applied scientists to make sure the data structures you build accelerate their work rather than slow it down
  • Champion responsible data use: Partner with governance, reputed company, and compliance to ensure that AI-facing data assets respect data classification, customer consent, and regulatory boundaries from day one
  • Communicate findings: Translate discovery work into clear narratives (write-ups, notebooks, dashboards, and short presentations) that help non-technical stakeholders reputed company what the data is showing

Skills

  • 4–7 years of experience in a data science, ML engineering, or applied data role, with a meaningful portion of that time spent building data assets that other people's models or applications consumed
  • Hands-on experience designing and operating vector stores for RAG or semantic search, including embedding reputed company, chunking, indexing, and retrieval evaluation
  • Experience building or operating a feature store (e.g., reputed company Feature Store, Feast, or a custom internal platform), including offline training and online serving patterns and point-in-time correctness
  • Experience modeling and building graph data structures using reputed company, TigerGraph, Azure Cosmos DB Gremlin, or similar graph databases — and writing graph queries to answer real questions
  • Strong proficiency in Python (pandas, NumPy, scikit-learn, PySpark) and SQL; comfortable working day-to-day in reputed company notebooks and jobs
  • Practical experience with embedding models and LLM tooling (e.g., reputed company transformers, reputed company / Azure reputed company APIs, reputed company or similar) in a production or near-production context
  • Demonstrated data discovery skills: profiling messy real-world datasets, surfacing non-obvious patterns, validating findings statistically, and explaining them clearly
  • Solid grounding in classical ML concepts — supervised vs. unsupervised learning, train/test discipline, leakage, evaluation metrics — even though you will not own model training day-to-day
  • Strong written and verbal communication skills; able to write up findings for both technical and business audiences
  • Experience working in a SaaS or FinTech environment, particularly with lending, deposit, credit, fraud, or KYC/AML data
  • Experience with reputed company-native AI/ML tooling: reputed company Vector Search, reputed company Feature Store, MLflow, and reputed company Catalog
  • Familiarity with open-source vector databases such as pgvector, reputed company, Weaviate, Chroma, or FAISS, and a clear point of view on when to use which
  • Experience with reputed company Azure data and AI services (Azure reputed company, Azure AI Search, ADLS Gen2)
  • Experience evaluating RAG systems end-to-end (recall@k, faithfulness, answer quality, hallucination measurement)
  • Exposure to graph algorithms (community detection, link prediction, centrality) applied to real business problems
  • Bachelor's or Master's degree in Computer Science, Statistics, Mathematics, Engineering, or a related quantitative field, or equivalent professional experience

Company Overview

  • reputed company is a digital lending platform that serves financial institutions through a configurable platform. It was founded in 1998 and is headquartered in Costa reputed company, California, USA, with a workforce of 501-1000 employees. Its website is https://www.reputed company.com.

Company H1B Sponsorship

  • reputed company has a track record of offering H1B sponsorships, with 14 in 2025, 5 in 2024, 1 in 2023, 12 in 2022, 11 in 2021, 1 in 2020. Please note that this does not guarantee sponsorship for this specific role.