Machine Learning Scientist/Sr Scientist, Federated Benchmarking & Validation Engineering

Eli Lilly and Company
Full-time
Remote friendly (Indianapolis, IN)
United States
$151,500 - $244,200 USD yearly
Operations

Role Summary

The Machine Learning Scientist/Sr Scientist, Federated Benchmarking & Validation Engineering role at Lilly TuneLab focuses on designing and implementing validation frameworks for federated models, privacy-preserving test sets, and benchmark suites to accelerate drug discovery while ensuring data privacy and reproducibility.

Responsibilities

  • Federated Test Set Design: Architect and implement privacy-preserving protocols for constructing representative test sets across distributed partner datasets, ensuring statistical validity while maintaining data isolation.
  • Benchmark Suite Development: Create comprehensive benchmark suites covering small molecules (ADMET, solubility, permeability), antibodies (affinity, stability, immunogenicity), and RNA therapeutics (stability, delivery, off-target effects).
  • Cross-Domain Validation: Develop validation strategies that assess model generalization across different experimental protocols, cell lines, species, and therapeutic indications while respecting partner data boundaries.
  • Public Dataset Integration: Systematically benchmark federated models against public datasets (ChEMBL, PubChem, PDB, Therapeutic Antibody Database) to establish performance baselines and identify gaps.
  • Validation Frameworks: Implement time-split or scaffold-split validation protocols that assess model performance on prospective data, simulating real-world deployment scenarios and detecting concept drift.
  • Reproducibility Infrastructure: Build robust MLOps pipelines ensuring complete reproducibility of federated experiments, including versioning of data snapshots, model checkpoints, and hyperparameter configurations.
  • Statistical Rigor: Design statistically powered validation studies accounting for multiple testing, hierarchical data structures, and non-independent observations common in drug discovery datasets.
  • Performance Profiling: Develop comprehensive performance profiling across diverse molecular scaffolds, target classes, and property ranges, identifying systematic biases and failure modes.
  • Platform Integration: Collaborate with engineering teams to integrate validation frameworks with the TuneLab federated learning platform built on NVIDIA FLARE, ensuring scalable and automated testing across partner networks.

Qualifications

  • PhD in Computational Biology, Bioinformatics, Cheminformatics, Computer Science, Statistics, or related field from an accredited college or university
  • Minimum of 2 years of experience in the biopharmaceutical industry or related fields, with demonstrated expertise in drug discovery and early development
  • Strong foundation in experimental design, statistical validation, and hypothesis testing
  • Experience with ML model validation, cross-validation strategies, and performance metrics
  • Proficiency in data engineering, pipeline development, and automation

Additional Preferences

  • Experience with federated learning platforms and distributed computing
  • Knowledge of regulatory requirements for AI/ML in pharmaceutical development
  • Expertise in ADMET assay development and validation
  • Understanding of antibody engineering and characterization methods
  • Familiarity with RNA therapeutic design and delivery systems
  • Experience with clinical biomarker validation and translational research
  • Proficiency in workflow orchestration tools (Airflow, Kubeflow, Prefect)
  • Strong knowledge of containerization and cloud computing (Docker, Kubernetes)
  • Publications on model validation, benchmarking, or reproducibility
  • Experience with GxP compliance and quality management systems
  • Exceptional attention to detail and commitment to scientific rigor
  • Strong technical writing skills for regulatory documentation
  • Portfolio mindset balancing rigorous validation with rapid deployment for partner value

Education

  • PhD in a relevant field (as listed under Qualifications)