Engineer - MLOps & Scientific Platforms - Data Foundry

Eli Lilly and Company
8 days ago
Remote friendly (San Diego, CA)
United States
IT
Responsibilities:
- Build and maintain end-to-end ML deployment pipelines (experiment tracking, model versioning with MLflow/W&B, containerized serving, automated retraining triggers).
- Develop model registry and feature engineering pipelines for computational scientists.
- Implement monitoring/alerting for data pipelines, APIs, ML models, and agentic systems (LLMOps).
- Create dashboards/metrics for pipeline execution, API latency, token usage, prediction quality, and system health.
- Establish structured logging and tracing.
- Deploy Methods4Insight capabilities with versioning, structured error handling, and response-time guarantees; productionize in partnership with Tech@Lilly.
- Build serving infrastructure for synchronous and asynchronous workloads; define API contracts, documentation, and testing frameworks.
- Operate cloud-native model serving (AWS/Azure/GCP) using containers, Kubernetes, and IaC.
- Build CI/CD for ML models (validation, A/B tests, canary, rollback) and integrate with Data Foundry pipelines.
- Expose tools via REST APIs / MCP-compatible endpoints for autonomous agents; define latency/throughput targets and graceful degradation.
- Ensure model outputs include uncertainty quantification and confidence metrics.

Basic requirements:
- B.S./M.S. in CS, Data Science, ML, Bioinformatics, Computational Biology, or a related field.
- 3+ years in MLOps/ML engineering/scientific platform development.
- US work authorization required; Lilly does not sponsor visas.

Preferred qualifications (examples):
- Python; ML frameworks (PyTorch/TensorFlow/scikit-learn) and lifecycle tools (MLflow/W&B/Kubeflow).
- Production model serving; REST/gRPC; operational monitoring.
- AWS/Azure/GCP, Kubernetes, CI/CD; collaboration.
- Experience with scientific model operationalization; drift/retraining monitoring.
- API gateway/event-driven/service mesh; feature stores/DVC/large-scale tracking; AI agent frameworks (MCP/LangChain).
- C/C++/CUDA/GPU and HPC containerization (Singularity/Apptainer).

Compensation/benefits:
- Anticipated wage: $66,000–$165,000; bonus eligibility; comprehensive benefits including 401(k), health, life, time off, and well-being benefits.