Staff Engineer, Data Science (PMPD)

Regeneron
Remote friendly (Tarrytown, NY)
United States
IT

Role Summary

The Staff Engineer, Data Science sits on the Data Enablement and Analytics (DEA) team within PMPD (Preclinical Manufacturing and Process Development). The role pairs bioprocess engineering expertise with AI/ML capabilities to accelerate biologics development and manufacturing: designing, implementing, and operationalizing models for upstream and/or downstream processes, and turning data into prescriptive guidance and production-grade solutions.

Responsibilities

  • Develop, validate, and maintain mechanistic, hybrid, and data-driven models for cell culture and/or purification processes.
  • Translate complex bioprocess questions into quantitative modeling strategies that inform scale-up, tech transfer, and continuous improvement.
  • Advance PMPD’s broader data-science and digital-maturity initiatives.
  • Collaborate with process engineers, citizen data scientists, IT, and manufacturing colleagues to coordinate modeling efforts enterprise-wide.
  • Build and deploy AI/ML-powered digital solutions on cloud-based analytics platforms.
  • Mentor citizen data scientists and champion best practices in model development, method selection, and code quality.
  • Explore and prototype GenAI approaches (e.g., Retrieval-Augmented Generation) to enhance knowledge management and decision support.

Qualifications

  • Required: Ph.D. in Chemical/Biochemical Engineering, Biotechnology, Applied Mathematics, or related field with 4+ years of industrial experience OR Master’s with 7+ years. Deep mechanistic understanding of upstream and/or downstream bioprocess unit operations, scale-up/down principles, and critical quality attributes. Demonstrated success modeling bioprocesses via first-principles, hybrid, or data-driven (ML) methods. Strong foundation in AI/ML algorithms (regression, classification, Bayesian methods, deep learning, time-series, probabilistic modeling) and multivariate statistics for process modeling, real-time monitoring, and control. Proficiency in Python and SQL; experience with tools such as JMP, SIMCA, MATLAB is helpful. Proven ability to communicate technical concepts to multidisciplinary stakeholders.
  • Preferred: Hands-on experience with cloud analytics platforms (e.g., Dataiku, Databricks). Strong working knowledge of Quality-by-Design (QbD) principles and statistically rigorous Design-of-Experiments (DoE) for defining design space, optimizing critical process parameters, and informing robust control strategies. Familiarity with PAT and chemometric modeling (e.g., Raman spectroscopy) for bioprocess monitoring and control. Understanding of operations research techniques (e.g., combinatorial optimization, linear programming, mixed integer programming). Exposure to GenAI stacks (LLMs, vector databases, RAG pipelines) and multimodal techniques. Strong publication record in bioprocess modeling or AI for biomanufacturing.

Education

  • Ph.D. in Chemical/Biochemical Engineering, Biotechnology, Applied Mathematics, or related field with 4+ years of industrial experience OR Master’s with 7+ years.