Machine Learning Scientist/Sr Scientist, Federated Benchmarking & Validation Engineering

Eli Lilly and Company

Full-time

Remote friendly (San Francisco, CA)

United States

$151,500 - $244,200 USD yearly

Operations

Role Summary

The Machine Learning Scientist/Sr Scientist, Federated Benchmarking & Validation Engineering plays an essential role within the TuneLab platform and is responsible for identifying, assessing, and implementing algorithmic solutions that leverage diverse datasets while ensuring data privacy and security for biotech partners. This role requires knowledge of small molecule drug development, ADME/Tox, antibody engineering, and/or genetic medicine, combined with data science and statistical analysis skills, to develop models using federated learning. The role focuses on designing algorithms and workflows that accelerate drug discovery and deliver reproducible, generalizable models across deployment scenarios.

Responsibilities

Federated Test Set Design: Architect and implement privacy-preserving protocols for constructing representative test sets across distributed partner datasets, ensuring statistical validity while maintaining data isolation.
Benchmark Suite Development: Create comprehensive benchmark suites covering small molecules (ADMET, solubility, permeability), antibodies (affinity, stability, immunogenicity), and RNA therapeutics (stability, delivery, off-target effects).
Cross-Domain Validation: Develop validation strategies that assess model generalization across different experimental protocols, cell lines, species, and therapeutic indications while respecting partner data boundaries.
Public Dataset Integration: Systematically benchmark federated models against public datasets (ChEMBL, PubChem, PDB, Therapeutic Antibody Database) to establish performance baselines and identify gaps.
Validation Frameworks: Implement time-split or scaffold-split validation protocols that assess model performance on prospective data, simulating real-world deployment scenarios and detecting concept drift.
Reproducibility Infrastructure: Build robust MLOps pipelines ensuring complete reproducibility of federated experiments, including versioning of data snapshots, model checkpoints, and hyperparameter configurations.
Statistical Rigor: Design statistically powered validation studies accounting for multiple testing, hierarchical data structures, and non-independent observations common in drug discovery datasets.
Performance Profiling: Develop comprehensive performance profiling across diverse molecular scaffolds, target classes, and property ranges, identifying systematic biases and failure modes.
Platform Integration: Collaborate with engineering teams to integrate validation frameworks with the TuneLab federated learning platform built on NVIDIA FLARE, ensuring scalable and automated testing across partner networks.

Qualifications

PhD in Computational Biology, Bioinformatics, Cheminformatics, Computer Science, Statistics, or related field from an accredited college or university
Minimum of 2 years of experience in the biopharmaceutical industry or related fields, with demonstrated expertise in drug discovery and early development
Strong foundation in experimental design, statistical validation, and hypothesis testing
Experience with ML model validation, cross-validation strategies, and performance metrics
Proficiency in data engineering, pipeline development, and automation

Additional Preferences

Experience with federated learning platforms and distributed computing
Knowledge of regulatory requirements for AI/ML in pharmaceutical development
Expertise in ADMET assay development and validation
Understanding of antibody engineering and characterization methods
Familiarity with RNA therapeutic design and delivery systems
Experience with clinical biomarker validation and translational research
Proficiency in workflow orchestration tools (Airflow, Kubeflow, Prefect)
Strong knowledge of containerization and cloud computing (Docker, Kubernetes)
Publications on model validation, benchmarking, or reproducibility
Experience with GxP compliance and quality management systems
Exceptional attention to detail and commitment to scientific rigor
Strong technical writing skills for regulatory documentation
Portfolio mindset balancing rigorous validation with rapid deployment for partner value

Education

PhD in a relevant field (as listed in Qualifications)
