Principal Data Scientist - R&D DSDH - Therapeutics Discovery (TD)
Responsibilities:
- Develop ML/AI models for discovery workflows (target prioritization, multi-omics integration, mechanistic inference).
- Apply modern ML (deep learning, graph learning, foundation/generative models) to chemical, biological, imaging, and assay data.
- Build/optimize scalable, interpretable, scientifically rigorous models for R&D use cases.
- Design, build, and maintain data pipelines to curate/standardize/integrate diverse R&D datasets (chemical, biological, multi-omics, imaging, biophysical, automation logs).
- Partner on MLOps/DevOps and deploy ML models into production R&D environments.
- Create tooling for dataset prep, feature engineering, and model lifecycle management.
- Collaborate with TD scientists to translate heterogeneous experimental data into insights for hit discovery, mechanism studies, perturbation experiments, and compound optimization.
- Support experiment design/interpretation and cross-functional AI/ML adoption; uphold data quality, documentation, governance, and reproducibility.
Qualifications:
Required:
- Master's or Ph.D. in Computational Biology/Bioinformatics/Data Science/Chemistry/Chemical Biology/Biomedical Engineering/Computer Science or related.
- ML/AI experience in scientific domains (drug discovery, biology, chemistry, systems biology, imaging, etc.).
- Strong Python skills and experience with PyTorch, TensorFlow, scikit-learn, RDKit (or similar).
- Data engineering experience (data modeling, orchestration, ETL/ELT, AWS/GCP/Azure).
- Ability to work directly with experimental scientists.
Preferred:
- Pharma/biotech discovery experience (target assessment, phenotypic screening, medicinal chemistry, lab automation).
- Omics, high-content imaging, chemical structure, or assay data familiarity.
- Knowledge of FAIR/ontologies/controlled vocabularies and regulated/quality-governed environments.
- Strong communication in a matrixed, multidisciplinary environment.
Required/Preferred Skills (from posting): Advanced analytics; data analysis; data quality/governance; data visualization; digital fluency; workflow/process improvement; strategic thinking; technical credibility; data reporting; data privacy standards.
Benefits (time off): Vacation (120 hrs/yr); Sick time (40 hrs/yr; CO 48; WA 56); Holiday pay incl. floating holidays (13 days/yr); Work/personal/family time (up to 40 hrs/yr); Parental leave (480 hrs/yr); Bereavement (240 hrs immediate family; extended 40 hrs/yr); Caregiver leave (80 hrs/52-week rolling); Volunteer leave (32 hrs/yr); Military spouse time-off (80 hrs/yr).
Application instruction:
- For EMEA locations, apply to R-069202.