Responsibilities:
- Design, implement, and evaluate generative and predictive deep learning architectures, including transformers, diffusion models, flow-matching models, and graph neural networks.
- Develop multi-modal embeddings that unify protein sequence, structure, and molecular fingerprints (including novel tokenization and fusion approaches) to improve generation quality and property prediction.
- Research approaches for jointly modeling proteins and small molecules for multi-component biotherapeutic formats (e.g., antibody–drug conjugates (ADCs), antibody–peptide conjugates, T-cell engagers).
- Work with internal molecular dynamics (MD) scientists to integrate physics-based priors, molecular dynamics, and energy-aware learning objectives into model training, grounding outputs in physical reality and improving developability.
- Scout high-impact AI/ML and computational biology research directions and transfer knowledge to other domain experts.
Basic Qualifications:
- Ph.D. in Computer Science, Artificial Intelligence, Theoretical Computer Science, Applied Mathematics, Computational Biology, Physics, or a related field.
- Strong expertise in modern deep learning architectures (transformers, diffusion models, flow-matching networks, variational autoencoders, graph neural networks).
- Proficiency in Python and modern AI/ML frameworks (PyTorch or TensorFlow); familiarity with Git, code review, testing, and documentation.
Preferred Qualifications:
- 1–3 years of industry experience developing and deploying novel deep learning architectures.
- Familiarity with protein engineering and representation, including protein language models (e.g., ESM, AbLang) and generative protein models (e.g., RFdiffusion, Boltz, Chai); ML experience in antibody, nanobody, or peptide design.
- Experience with multi-modal architectures that fuse protein–peptide, protein–ligand, or protein–small-molecule representations.
- Experience integrating molecular dynamics/force-field representations or physics-based priors into ML for molecular design or optimization.
- Experience with distributed training, GPU-accelerated workflows, and large-scale model training/inference.
- Exposure to experimental biologics workflows (e.g., phage display, yeast display, directed evolution).
- History of high-impact publications and strong oral and written communication skills.
Location:
- Lilly Biotechnology Center (San Diego) or HQ (Indianapolis, IN); San Diego preferred.