Role Summary
Pfizer’s mission to deliver breakthroughs that change patients’ lives is rooted in our commitment to science and innovation. Within Discovery, Preclinical, and Translational Solutions (DP&TS), we accelerate the journey from target identification to clinical translation by leveraging advanced digital technologies, AI, and data-driven insights.
We’re building a forward-thinking platform engineering team dedicated to delivering secure, scalable, and resilient infrastructure. As a Site Reliability/Operations Engineer, you’ll play a pivotal role in ensuring the reliability, performance, and operational excellence of our cloud-native platforms.
This role is perfect for a high-caliber, well-rounded generalist who thrives in dynamic environments, takes initiative, and enjoys solving complex problems across infrastructure, automation, and observability. You’ll be joining a team that values curiosity, collaboration, and continuous learning. While we expect you to take ownership and solve meaningful problems, you’ll be supported by a friendly, inclusive environment with clear goals, strong mentorship, and a culture of shared success. We believe in setting our team up to thrive—not just deliver.
Responsibilities
- Monitor cloud infrastructure and respond to alerts under the guidance of senior engineers
- Deploy and maintain automation scripts and tools (e.g., Terraform, Ansible) and proactively look for improvements to tooling
- Maintain and update observability systems (e.g., Prometheus, Grafana) based on feedback from engineers
- Participate actively in incident response and root cause analysis, resolving issues and implementing improvements with support from the team
- Collaborate with team members to implement changes and improvements to CI/CD pipelines
- Contribute to documentation and process improvements
- Proactively seek opportunities for skill development and apply security and compliance best practices in daily tasks
Qualifications
- Education: Bachelor’s degree in a relevant field (e.g., Computer Science, Data Science, Bioinformatics, Engineering, or related discipline)
- 4+ years of experience in site reliability, operations, or infrastructure engineering
- Familiarity and experience with AWS or Azure
- Familiarity and experience with Terraform, Ansible, and GitHub
- Understanding of Kubernetes, Docker, and container orchestration
- Good scripting skills (e.g., Bash, Python, TypeScript)
- Familiarity and experience with Linux/Unix system administration
- Familiarity with networking, security, and database administration
- Strong problem-solving skills and eagerness to learn in a collaborative environment
- Fluent in English; capable of clear technical communication across scientific and engineering disciplines
Preferred Qualifications
- Experience with observability and logging tools (e.g., OpenTelemetry, Prometheus, Grafana, ELK)
- Knowledge of secrets management (e.g., HashiCorp Vault, AWS Secrets Manager)
- Experience working in regulated environments or with compliance frameworks (e.g., GxP, SOC 2, HIPAA)
- Experience working in team-based environments, either professionally or academically
Additional Requirements
- Travel up to 10% may be required for business activities.
- Work Location Assignment: Hybrid