Forward-Deployed Cheminformatician
About Apheris

At Apheris, we are building the future of how AI is applied in pharmaceutical R&D. We help leading pharmaceutical teams discover and develop drugs faster. We host the industry’s largest federated data networks for drug discovery AI, spanning co-folding, ADMET, and antibody developability. Across these networks, models are trained on proprietary industry datasets to achieve higher performance and broader applicability while keeping data control and IP protected. We deliver these superior models through drug discovery applications that enable teams to run them at scale, further customize them, and integrate them into existing R&D workflows.

AI Structural Biology (AISB) Network: Pharmaceutical companies collaborate in the field of co-folding, structure-based binding affinity predictions, and antibody design.
ADMET Network: Pharmaceutical and biotech companies collaborate to improve small-molecule property prediction and expand into further drug modalities.
Antibody Developability Network: Pharma partners collaborate to federate historical and purpose-built antibody developability datasets for secure ML training, without data leaving each partner’s environment.

About the role

We are looking for a Forward-Deployed Cheminformatician to own how binding data is prepared across our co-folding focused networks and initiatives. Binding data is the input that decides whether our co-folding and binding-affinity models perform in real drug programs. It arrives from pharma partners in heterogeneous shapes: different assay registries, different metadata, different chemical-representation standards, and different choices on qualifiers, replicates, and censoring. We need someone who turns this into a repeatable, well-documented preparation pipeline that pharma representatives can run alongside us, and that scales to the public-data corpus we build for our own model training.
This is half engineering, half forward-deployed work. You will define the protocol, harden it with validators and scripts, integrate it into the Apheris products, run it with each new partner, and own the equivalent pipeline for the public binding-data corpus.

About you

What you will do

Define and own the binding-data preparation protocol: data schema, small-molecule standardization, assay metadata model, value handling (KD, Ki, IC50, pIC50), qualifier and censored-value handling, and duplicate and replicate aggregation.
Build the tooling that runs it: scripts, validators with actionable errors, and reusable pipelines that survive different pharma upstream systems (Dotmatics, Spotfire, in-house registries).
Work forward-deployed with pharma. Sit with their biologists and medicinal chemists, walk them through the protocol, sense-check what an assay column actually measures, and unblock retrieval.
Maintain the small-molecule representation pipeline: RDKit standardization, tautomer and ionization handling, stereochemistry preservation, and PAINS / frequent-hitter filtering.
Curate the public binding-data foundation (ChEMBL, BindingDB, PubChem BioAssay), prepared to the same standard, so our models train on the strongest public baseline anyone can assemble.
Hand the productized pipeline cleanly to engineering for scaling, and partner with ML to keep the data contract valid as models and networks evolve.

What we expect from you

You should apply if:

You have a BSc, MSc, PhD or equivalent in cheminformatics, computational chemistry, or a related field, plus 3+ years preparing biological assay data in a discovery setting.
You are fluent in Python and RDKit. SMILES normalization, tautomer / ionization / stereochemistry handling, and scaffold extraction are second nature, and you understand why each matters for activity cliffs and model training.
You have hands-on experience curating quantitative binding assay data (KD, Ki, IC50, pIC50) and HTS data: censored values, qualifiers, duplicates, replicate aggregation, and assay metadata interpretation.
You write good engineering code: version control, tested scripts, and validators that return useful errors.
You are comfortable working forward-deployed with pharma medicinal chemists and biologists. You can sit in a sense-check meeting, pull out what is actually meant by a column label, and encode that back into the protocol.
You enjoy turning a messy cleaning job into a repeatable protocol others can run.

Bonus points if:

You have practical familiarity with public binding-data sources (ChEMBL, BindingDB, PubChem BioAssay) and the gotchas in each.
You have applied LLM tooling (for example, Claude) to accelerate data cleaning or metadata harmonization.
You have worked across institutional data boundaries (federated, multi-party, or otherwise) where the data-preparation contract has to hold under partial visibility.
You have a publication record or open-source contributions in cheminformatics or quantitative pharmacology.

What we offer you

Industry-competitive compensation, including early-stage virtual share options
Remote-first work: work where you work best
Wellbeing budget, mental health support, work-from-home budget, co-working stipend, and learning budget
Generous holiday allowance
Office Days at our Berlin HQ or a different European location (3x per year)
A high-calibre, execution-focused team with experience from leading organizations