Primary Responsibilities
- Active Learning & Multi-Objective Optimization: Design and build Active Learning pipelines for multi-objective optimization that balance affinity, specificity, stability, immunogenicity, and manufacturability, incorporating multi-property guidance, Pareto-optimal search strategies, and uncertainty quantification.
- Reward & Surrogate Modeling: Design and train reward models and discriminative classifiers (e.g., affinity ranking, stability prediction, developability scoring) as objective functions for optimization loops.
- Reinforcement Learning for Generative Model Alignment: Develop and implement RL strategies (PPO, DPO, reward-weighted approaches) to fine-tune generative models (autoregressive transformers, diffusion models) toward biologic sequences with desired therapeutic properties; assess when RL is warranted versus Bayesian Optimization or Active Learning.
- Agentic DMTA Pipelines: Build AI-orchestrated, semi-autonomous pipelines connecting generative design, property prediction, experiment selection, and result interpretation with human oversight; optimize for scientific rigor.
- Cross-Functional Leadership: Lead joint data reviews with dry and wet-lab scientists; collaborate with protein engineers, structural biologists, and automation scientists to encode domain knowledge as reward signals, action constraints, and optimization boundaries.
- Scientific Communication: Publish in top-tier venues and present at conferences.
Basic Requirements
- Ph.D. in Machine Learning, Computer Science, Computational Biology, Physics, Applied Mathematics, or closely related quantitative field.
- 1–3 years post-Ph.D. industry R&D experience or relevant postdoctoral appointment.
Preferred Qualifications
- Expertise in Bayesian Optimization, Active Learning, and sequential decision-making under uncertainty.
- RL fluency (PPO, DPO, RLHF-style alignment, reward shaping), including the judgment to design and evaluate RL against simpler methods.
- Deep learning with transformers, diffusion, and flow-based models.
- Software development in Python and PyTorch; distributed training, GPU workflows, production-quality code.
- Familiarity with ML for protein science and biologics (protein representations and language models such as the ESM family and AbLang); ML for antibody, nanobody, or peptide design.
- Experience with generative biologics (structure-conditioned generation, inverse folding, de novo antibody design) using tools such as Boltz, Chai, RFDiffusion, or AF-Multimer.
- RL for molecular property optimization and/or drug discovery optimization.
- Data/compute scaling laws for language models.
- Multi-modal models jointly modeling sequence, structure, and functional annotations.
- Strong publication/presentation record.
- Active open-source contributions.
Benefits/Compensation
- Anticipated wage: $166,500–$244,200.
- Full-time employees may be eligible for a company bonus.
- Comprehensive benefits including 401(k), pension, vacation, medical/dental/vision/prescription, flexible benefits, life insurance, time off/leave, and well-being benefits.