Sr Manager, Cloud Infrastructure Engineer, Scientific Computing and HPC

Pfizer
Full-time
Remote friendly (Tampa, FL)
United States
$108,700 - $201,400 USD yearly
IT

Role Summary

Pfizer is committed to applying computational science to drug discovery and development. As part of this mission, we have recently embarked on a large-scale migration of our computational infrastructure to the cloud.

This role leverages extensive experience in cloud engineering and DevOps and requires a hands-on approach to designing and delivering robust High Performance Computing (HPC) solutions supporting computational workloads across the organization.

We are seeking an experienced individual to drive architecture, infrastructure automation, migration, and operational excellence. You will collaborate with HPC engineers and scientific computing specialists to develop scalable, cloud-native infrastructure that underpins the modernization of the scientific computing platform.

Responsibilities

  • Design, implement, operate, and own robust and dependable infrastructure for HPC and ML/AI workloads in a cloud environment (AWS/GCP).
  • Lead containerization, deployment, and operation of user- and admin-facing HPC platforms (Slurm, Open OnDemand, Prometheus/Grafana, batch and distributed computing platforms) across cloud environments.
  • Translate stakeholder input into robust, high-performance, scalable, cost-effective computing platforms.
  • Partner with HPC specialists (engineers, administrators, and users) to capture institutional knowledge and manual processes in IaC workflows, transforming ad-hoc deployment practices into reproducible, version-controlled, automated procedures.

Qualifications

  • Required: B.S. in computer science, life science, data science, or a similar field.
  • Required: 6+ years of experience in cloud infrastructure engineering with a proven track record of developing and supporting robust IaC deployments.
  • Required: Experience managing scientific computing workloads in an enterprise environment.
  • Required: Advanced experience with at least one of AWS and GCP, including knowledge of core compute and storage services relevant to HPC.
  • Required: Solid understanding of cloud networking, identity, and security controls.
  • Preferred: Prior experience with HPC deployment utilities including AWS ParallelCluster, AWS Parallel Computing Service, and Google Cloud Cluster Toolkit.
  • Preferred: Proficiency with distributed computing environments, especially EKS/GKE/Kubernetes.
  • Preferred: Familiarity with HPC environments, job schedulers (Slurm), HPC application containers (Docker, Singularity, Apptainer), and NVIDIA GPU computing.

Additional Requirements

  • Occasional international travel for team meetings and conferences.