Role Summary
We’re building a forward-thinking platform engineering team dedicated to delivering secure, scalable, and resilient infrastructure. As a Site Reliability/Operations Engineer, you’ll play a pivotal role in ensuring the reliability, performance, and operational excellence of our cloud-native platforms.
Responsibilities
- Monitor cloud infrastructure and respond to alerts under the guidance of senior engineers
- Deploy and maintain automation scripts and tools (e.g., Terraform, Ansible) and proactively look for improvements to tooling
- Maintain and update observability systems (e.g., Prometheus, Grafana) based on feedback from engineers
- Participate actively in incident response and root cause analysis, resolving issues and implementing improvements with support from the team
- Collaborate with team members to implement changes and improvements to CI/CD pipelines
- Contribute to documentation and process improvements
- Proactively seek opportunities for skill development and apply security and compliance best practices in daily tasks
Qualifications
- Education: Bachelor’s degree in a relevant field (e.g., Computer Science, Data Science, Bioinformatics, Engineering, or related discipline)
- 4+ years of experience in site reliability, operations, or infrastructure engineering
- Familiarity and experience with AWS or Azure
- Familiarity and experience with Terraform, Ansible, and GitHub
- Understanding of Kubernetes, Docker, and container orchestration
- Good scripting skills (e.g., Bash, Python, TypeScript)
- Familiarity and experience with Linux/Unix system administration
- Familiarity with networking, security, and database administration
- Strong problem-solving skills and eagerness to learn in a collaborative environment
- Fluent in English; capable of clear technical communication across scientific and engineering disciplines
Preferred Qualifications
- Experience with observability and logging tools (e.g., OpenTelemetry, Prometheus, Grafana, ELK)
- Knowledge of secrets management (e.g., HashiCorp Vault, AWS Secrets Manager)
- Experience working in regulated environments or with compliance frameworks (e.g., GxP, SOC 2, HIPAA)
- Experience working in team-based environments, either professionally or academically
Additional Requirements
- Travel up to 10% may be required for business activities.
- Work Location Assignment: Hybrid