Role Summary
Engineer - HPC Platform. Be a catalyst for innovation in high performance computing at Lilly: lead the design, build, and operation of scalable HPC platforms, and enable analytical, statistical, and scientific computing on Lilly’s HPC infrastructure.
Responsibilities
- Lead the design, build, and maintenance of scalable HPC platforms, covering both engineering and operations.
- Make Lilly’s HPC infrastructure accessible and productive for researchers and scientists.
- Collaborate with researchers to optimize performance and streamline workflows.
- Leverage tooling and automation for orchestration, resource scheduling, data access, and reproducibility.
- Evolve and operate public cloud and on-premises environments with a focus on availability and performance for HPC workloads.
- Define and monitor infrastructure metrics and resource utilization.
Qualifications
- Hands-on experience with HPC platforms, including accelerators (GPUs), HPC schedulers (e.g., Altair Grid Engine, Slurm), Kubernetes platforms, and containers (Docker, Apptainer).
- Demonstrated experience with HPC workloads, infrastructure, and cluster architectures.
- Expertise with Linux command line, Linux troubleshooting, and HPC administration.
- Experience with DevOps tools such as GitHub, Chef, Ansible, and Terraform.
- Experience with automating infrastructure and applications.
- Strong programming and scripting skills in Python or Bash.
- Bachelor’s degree in Computer Science, Information Technology, or related technical field.
- 2+ years’ experience as an HPC Platform Engineer.
Skills
- Platform engineering
- HPC infrastructure and cluster management
- Automation and orchestration
- Cloud and on-premises HPC integration
- Performance optimization
Education
- Bachelor’s degree in Computer Science, Information Technology, or related technical field
Additional Requirements
- Demonstrated experience leading a large-scale, global infrastructure project.
- Hybrid role based in Indianapolis, IN (relocation required); less than 5% travel.