Scientific Computing & HPC Platform Engineering
- Lead architecture, build-out, and optimization of on-premise GPU clusters, hybrid cloud HPC environments, and supporting storage/network infrastructure
- Partner with research scientists to profile workloads, size infrastructure, and iteratively improve job performance and self-service
- Design, operate, and support compute environments for scientific computing workloads (e.g., molecular dynamics, CryoEM, genomics, structural biology, AI model training)
- Implement scheduler configurations (Slurm/LSF), parallel file systems, and interconnect optimization to maximize throughput/utilization
- Architect cloud-burst strategies for elastic scaling of peak HPC/ML training demand
Applied AI Engineering & Generative AI Solutions
- Design/engineer production AI/ML and generative AI systems (cloud infrastructure, data pipelines, vector databases, RAG, LLM application layers)
- Build/deploy agentic AI workflows to automate/augment scientific processes
- Develop AI evaluation frameworks, prompt engineering standards, and MLOps for reliable, auditable outputs in a GxP-adjacent environment
- Prototype/pilot AI capabilities (AI agents, digital twins, foundation model fine-tuning) and transition to production
- Scope AI use cases, define success criteria, and demonstrate value via proofs of concept and deployments
- Implement Infrastructure as Code and CI/CD with integrated security/compliance controls
Technical Leadership & Architecture Guidance
- Set engineering standards, reference architectures, and technology guardrails; mentor engineers
- Translate technical concepts for non-technical stakeholders
- Evaluate technologies; lead proof-of-concept assessments and build-vs-buy decisions
- Participate in architecture reviews for platform coherence
Requirements
- Bachelor's/Master's in CS, Engineering, or related field
- Typically 10–12+ years of experience; 3+ years at senior/principal IC level
- Deep expertise in 2+ areas: HPC (Slurm/LSF, parallel file systems, GPU/CUDA), cloud production platform engineering, AI/ML platform engineering (model deployment, MLOps, LLM apps), generative/agentic systems (LangChain, LangGraph, AutoGen, foundation models)
- Infrastructure as Code and CI/CD proficiency; strong Python/scripting
- Experience in regulated biotech/pharma/life sciences with GxP or 21 CFR Part 11/data integrity
- Hands-on experience with vector databases, knowledge graphs, and RAG architectures
Compensation & Benefits
- Salary range: $159,000–$207,000; 401k, healthcare coverage, ESPP, and other benefits
- Learn more: https://www.denalitherapeutics.com/careers