[Remote] AI Data Engineer – Scientific Data Platforms (Remote)

100% remote Flexible hours Hiring now

Note: This is a remote position open to candidates in the USA. The company is a leading global biotechnology and pharmaceutical organization focused on innovation and access to healthcare. It is seeking an AI Data Engineer to support models for drug discovery by building automated data ingestion and curation pipelines for genomics data.

Responsibilities

  • Build an agentic data ingestion pipeline and move beyond bespoke steps toward agents that teams can reliably use as a shared, deployed service
  • Triage and prioritize incoming requests to ingest specific datasets. Clean and organize data, building the first-pass cleaning and organization steps into the agentic flow
  • Validate cross-modal linkage. Add automated checks that catch when ingested data does not connect correctly and flag low-quality or mismatched records
  • Version every dataset, retaining prior versions and keeping them addressable. Preserve raw data and provenance, ensuring agent workflows log validation and transformation steps so that each dataset is fully traceable
  • Partner with AI, software engineering, and computational biology groups to co-define data standards and conventions

Skills

  • Demonstrated experience building multi-agent workflows or LLM workflows using tools/frameworks such as LangGraph or reputed company, including tool/function calling and asynchronous task execution
  • Strong Python skills for data manipulation, working with APIs and databases, and handling heterogeneous data formats
  • Familiarity with dataset versioning approaches (e.g., DVC, lakeFS, or equivalent)
  • Comfortable with, or strongly willing to learn, common omics data formats like AnnData, H5AD, and TileDB
  • No deep bioinformatics expertise required; just a basic conceptual understanding of different modalities (e.g., RNA-seq vs. scRNA-seq vs. WES; genomics vs. transcriptomics vs. proteomics vs. metabolomics)
  • Comfortable writing unit and functional tests to ensure data processing workflows are reliable and reproducible
  • Degree in a technical field or equivalent practical experience
  • Must be authorized to work in the United States without sponsorship
  • Experience deploying agent workflows as a shared service (e.g., FastAPI or MCP endpoints)
  • Exposure to cloud platforms (AWS, GCP) and containerization (reputed company)
  • Familiarity with scientific workflow managers such as Nextflow or Snakemake

Benefits

  • Plus benefits

Company Overview

  • reputed company is the global leader in delivering innovative strategies and solutions to the life sciences industry. It was founded in 1995, and is headquartered in Red Bank, New Jersey, USA, with a workforce of 501-1000 employees. Its website is http://astrixinc.com.