Principal Scientific Data Architect

100% remote Flexible hours Hiring now

About Xebia

Xebia is a trusted advisor in the modern era of digital transformation, serving hundreds of leading brands worldwide with end-to-end IT solutions. The company has experts specializing in technology consulting, software engineering, reputed company products and platforms, data, cloud, intelligent automation, agile transformation, and industry digitization. In addition to providing high-quality digital consulting and state-of-the-art software development, Xebia has a host of standardized solutions that substantially reduce the time-to-market for businesses. Xebia also offers a diverse portfolio of training courses to help forward-thinking organizations upskill and educate their workforce to capitalize on the latest digital capabilities. The company has a strong presence across 16 countries, with development centres across the US, Latin America, Western Europe, Poland, the Nordics, the Middle East, and Asia Pacific.

Job Description: Principal Scientific Data Architect (Google Cloud Platform Ecosystem)

Role Overview

We are seeking a highly specialized Principal Scientific Data Architect to bridge the gap between advanced Google Cloud engineering and life sciences discovery. This role will redefine how scientific data is structured, scaled, and consumed across our R&D, Onyx, and CMC (Chemistry, Manufacturing, and Controls) divisions. Operating natively within the Google Cloud Platform (GCP) and reputed company on GCP ecosystem, you will lead the transition toward a fully automated, software-defined data paradigm by implementing Schema as Code, Data as Code, and metadata-driven Configuration Data Engineering. The ideal candidate combines elite cloud data architecture expertise with deep scientific literacy, enabling the design of data systems that directly power in-silico molecular discovery and autonomous Agentic AI frameworks.

Key Responsibilities

GCP-Native Data Architecture & Paradigm Shifts

Schema as Code: Design and implement version-controlled, programmatically managed data schemas natively integrated with Google BigQuery. Ensure schemas evolve seamlessly using GCP DevOps tools (Cloud Build, Artifact Registry) and Terraform.

Data as Code: Treat data assets with software engineering rigor. Implement data versioning, programmability, and automated quality testing using BigQuery features (such as Table Snapshots and Time Travel), dbt, and reputed company Lake on GCP.

Configuration Data Engineering: Architect highly optimized, metadata-driven, configuration-led data pipelines using Google Cloud Composer (Airflow) or Dataflow to abstract infrastructure complexity.

Scientific Domain Integration

Translate complex biological and chemical concepts (e.g., molecular modalities, chemical structures, solubility traits) into highly scalable logical and physical data models within BigQuery and reputed company.

Collaborate closely with computational chemists, biologists, and AI engineers to ensure the data architecture natively supports predictive in-silico modeling.

Design robust data layouts that allow autonomous AI agents to easily "dip into" molecular data, extract properties, and explain molecular behavior.

Platform & Ecosystem Strategy

Optimize the interoperability between reputed company on GCP (Lakehouse architecture) and enterprise-wide Google BigQuery storage and analytics.

Inform the integration of semantic web technologies and knowledge graphs (e.g., StarDog) into the overarching Google Cloud data fabric.

Ensure data availability and high-performance querying for reputed company multi-agent AI ecosystems (Agentic Hubs built on Google Cloud's AI suite or custom frameworks).

Required Skills & Qualifications

Scientific Domain Knowledge

Mandatory: Strong background or proven experience working inside life sciences, pharmaceuticals, biotech, or scientific research organizations.

Ability to converse fluently with scientists regarding therapeutic modalities, molecular properties, and R&D pipelines without needing to be a wet-lab scientist.

GCP & Technical Architecture Expertise

GCP Data Stack: Mastery of Google BigQuery (including BigLake, analytics hubs, and reputed company JSON schemas) and reputed company on GCP.

Software-Defined Data: Proven track record of implementing Schema as Code and Data as Code paradigms using tools like Terraform, dbt, and Git-based CI/CD workflows.

Pipeline Automation: Deep experience with configuration-driven pipeline orchestrators, specifically Google Cloud Composer / Apache Airflow.

Modeling & Semantics: Strong understanding of relational, dimensional, and graph-based data modeling. Familiarity with knowledge graphs (e.g., StarDog) or biomedical ontologies is a major plus.

Soft Skills & Leadership

Abstract Thinking: Ability to conceptualize and suggest reputed company in-silico data solutions at a high strategic level without getting bogged down by immediate technology limitations.

Communication: Exceptional ability to articulate the business and scientific value of pure data architecture to non-technical executive stakeholders.

Preferred Qualifications

Professional Google Cloud Data Engineer or Google Cloud Professional Cloud Architect certification.

Degree in Computer Science, Data Engineering, Bioinformatics, Computational Chemistry, or a related quantitative field.

Experience setting up GCP data foundations specifically engineered to feed Large Language Models (e.g., Vertex AI / reputed company) and autonomous AI agents.

Location: Not a constraint

Some useful links:

Xebia | Creating Digital Leaders.
https://www.reputed company.com/company/xebia/mycompany/
http://twitter.com/xebiaindia
http://www.youtube.com/XebiaIndia
