Senior Data Engineer, Translational Data Products

Bristol Myers Squibb
Full-time
Remote friendly (Lawrence, NJ)
United States
IT


Role Summary

As part of the Translational Data Products team, you will directly support translational medicine leaders in their mission to discover biomarkers that guide patient selection and treatment response for BMS assets. Your work will enable exploratory data analysis that drives crucial biomarker decisions at the heart of translational research.

Responsibilities

  • Enable biomarker discovery: Deliver data pipelines and mappings that help translational leaders identify biomarkers (molecular, digital, imaging) for patient stratification and treatment response.
  • Innovate with AI/LLMs: Explore and apply cutting-edge approaches (MCP servers, prompt orchestration, auto-schema mapping, LLM-based ETL generation) to accelerate and improve data workflows.
  • Data orchestration: Oversee ingestion from diverse sources (vendor feeds, raw instrument output, CSV, PDF, etc.), ensuring that automated ETL and source-to-target mapping and transformation (STTM) outputs meet stakeholder needs.
  • Quality and profiling: Assess and validate source data, document any cleaning, normalization, or semantic mapping needed for QC, and identify where improvements are required versus merely convenient.
  • Hands-on implementation: Build or adapt tools and scripts (Python, SQL, AWS Glue, Databricks, etc.) when automation falls short.
  • Agile team contribution: Participate actively in standups, design sessions, sprint demos, and innovation discussions.

Qualifications

  • Bachelor's or Master's degree in Computer Science, Data Engineering, Bioinformatics, or related field.
  • 5+ years of experience in data engineering, ideally with exposure to life sciences or healthcare.
  • Strong experience with data integration from heterogeneous sources (structured, semi-structured, unstructured).
  • Proficiency in AWS, Python and SQL, with ability to prototype and automate workflows.
  • Hands-on expertise with ETL frameworks (AWS Glue, Databricks, Airflow).
  • Familiarity with modern AI/LLM approaches for data transformation and semantic mapping is highly desirable.
  • Excellent communication skills to engage both technical and scientific stakeholders.
  • Comfortable in agile, exploratory, scientific environments.