Senior AIML Optimization Engineer

GSK

Remote friendly (Cambridge, MA)

United States

$136,950 - $228,250 USD yearly

Role Summary

Senior AIML Optimization Engineer based in Cambridge, MA (with potential coordination across sites). The role sits within the Onyx Research Data Tech organization, focusing on optimizing first-in-class Compute and AIML platforms to accelerate application development, scale computational experiments, and integrate computation with metadata, logs, and performance tracking across Cloud and High-Performance Computing environments. You will contribute to the design and delivery of scalable, performance-driven optimization software for the Compute and AIML Platforms, enabling end-to-end lifecycle support from interactive exploration to production deployment.

Responsibilities

Serve as a key engineer for the optimization team and contribute technical expertise to teams in closely aligned technical areas such as DevOps, Cloud and Infrastructure
Lead design of major optimization software components of the Compute and AIML Platforms, contribute to development of production code and participate in both design reviews and PR reviews
Accountable for delivery of scalable solutions to the Compute and AIML Platforms that support the entire application lifecycle (interactive development and exploration/analysis, scalable batch processing, application deployment) with particular focus on performance at scale
Partner with the AIML and Compute platform teams, as well as scientific users, to help optimize and scale scientific workflows by applying a deep understanding of both the software and the underlying infrastructure (networking, storage, GPU architectures, …)
Participate in or lead a scrum team and contribute technical expertise to teams in closely aligned technical areas
Design innovative strategies and ways of working that create a better environment for end users, and construct a coordinated, stepwise plan to bring others along the change curve
Act as a standard bearer for proper ways of working and engineering discipline, including CI/CD best practices, and proactively spearhead improvements within your engineering area

Qualifications

Required: Bachelor’s, Master’s or PhD degree in Computer Science, Software Engineering, or related discipline
Required: 6+ years of experience with Bachelor's, 4+ years with Master’s, or 2+ years with PhD in cloud computing, scalable parallel computing paradigms, software engineering, and CI/CD
Required: 2+ years of experience in AIML engineering, including large-scale model training and production deployment
Preferred: Deep experience with at least one interpreted and one compiled language (e.g., Python, C/C++, Scala, Java) and toolchains for documentation, testing, and operations/observability
Preferred: Deep experience with application performance tuning and optimization in parallel and distributed computing (including MPI, OpenMP, and Gloo), and an understanding of the underlying systems (hardware, networks, storage)
Preferred: Expertise in modern software development tools and practices (Git/GitHub, DevOps tools, metrics/monitoring)
Preferred: Cloud expertise (AWS, Google Cloud, Azure) with infrastructure-as-code tools (Terraform, Ansible, Packer) and scalable cloud compute technologies (Google Batch, Vertex AI)
Preferred: Expertise in AIML training optimization, including distributed multi-node training and acceleration of training jobs
Preferred: Understanding of ML model deployment strategies, including agent systems and scalable LLM inference in multi-GPU, multi-node environments
Preferred: Experience with CI/CD implementations using Git and a CI/CD stack (e.g., Azure DevOps, CloudBuild, Jenkins, CircleCI, GitLab)
Preferred: Experience with Docker, Kubernetes, CNCF ecosystem, and deployment tools such as Helm
Preferred: Experience with low-level build tools (make, CMake) and optimization at build/compile level
Preferred: Proficiency with agile software development environments using Jira and Confluence

Skills

AI/ML optimization and scalable model training concepts
Cloud and HPC platforms, infrastructure and DevOps coordination
Parallel and distributed computing paradigms, performance tuning, and profiling
Containerization and orchestration (Docker, Kubernetes) and associated tooling
CI/CD workflow design and implementation
Programming languages: Python, C/C++, Java, Scala or similar
Experience with monitoring, observability, and software development tooling

Education

Bachelor’s, Master’s or PhD in Computer Science, Software Engineering, or a related discipline
