Position Overview:
- Develop and deploy foundational AI models that transform drug discovery across Takeda, including LLMs, diffusion models, and multimodal architectures integrating omics, biomedical imaging, protein 3D structures, and molecular representations.
- Train models from scratch, fine-tune pre-trained models for Takeda-specific applications, and deploy foundation model capabilities to accelerate discovery across therapeutic platforms.
Accountabilities:
- Develop and train foundational AI models (LLMs, diffusion models, flow-matching architectures) for drug discovery, including pre-training on large-scale scientific corpora and molecular datasets.
- Fine-tune and adapt pre-trained foundation models (protein language models, chemical LLMs, vision transformers) for target identification, disease modeling, and molecular design/discovery.
- Build multimodal foundation models integrating omics, biomedical imaging, protein 3D structures, and molecular representations.
- Apply and extend state-of-the-art approaches including graph neural networks, transformer-based protein language models, and multimodal learning frameworks.
- Use domain expertise in biology/chemistry/disease biology to guide architecture decisions, training data curation, and evaluation.
- Implement generative architectures (diffusion, score-based models, autoregressive transformers) for molecular generation, protein design, and multi-objective optimization.
- Collaborate to deploy foundation models across small molecules, biologics, and emerging modalities.
- Stay current with advances in the field and contribute to internal knowledge sharing and external publications.
Education & Requirements (Required):
- PhD (CS/ML/Computational Biology/Bioinformatics or a related field), or MS with 6+ years or BS with 8+ years of relevant experience.
- Deep expertise in modern deep learning (transformers, diffusion, and/or generative models).
- Strong experience training large-scale models; proficiency in PyTorch and distributed training.
- Foundational knowledge of biology, chemistry, or disease biology.
- Experience with at least one of the following: protein language models (ESM, ProtTrans), molecular generative models, or biomedical vision models.
- Experience with cloud computing (AWS/GCP) and GPU cluster training at scale.
Preferred:
- Experience building/fine-tuning foundation models in pharmaceutical/life sciences.
- Expertise in multimodal learning (text, images, structured molecular data).
- Omics data analysis experience (genomics, transcriptomics, proteomics) and knowledge graphs.
- Familiarity with protein structure prediction and 3D molecular representations.
- Publications in NeurIPS/ICML/ICLR or computational biology journals.
- Experience with model compression/efficient inference/production deployment.
- Strong large-scale data integration and multimodal modeling background.
- Proficiency in Python and ML libraries (PyTorch, TensorFlow, scikit-learn); familiarity with Unix.
- Excellent collaboration and communication skills.
Benefits:
- U.S.-based employees may be eligible for medical, dental, and vision insurance, a 401(k) plan with company match, disability coverage, basic life insurance, tuition reimbursement, paid volunteer time off, company holidays, well-being benefits, up to 80 hours of sick time per year, and up to 120 hours of paid vacation for new hires.
- U.S.-based employees may be eligible for short-term and/or long-term incentives.
Compensation:
- U.S. Base Salary Range: $111,800.00 - $175,670.00
Application Instructions:
- Apply via the "Apply" button (employment application processing begins upon clicking; application information is processed per Takeda's Privacy Notice and Terms of Use).