Staff Data Architect - AI

Regeneron
On-site
Troy, NY
IT

Role Summary

The Staff Data Architect - AI supports the design and implementation of modern data architecture with a focus on enabling AI and machine learning capabilities across our bio-manufacturing operations.

Responsibilities

  • Develop solutions by studying data needs, analyzing user requirements, and following the Regeneron software development lifecycle.
  • Design and document data architecture standards, with a focus on building infrastructure ready for AI/ML workloads.
  • Examine processes and systems to optimize, consolidate, and analyze diverse data sets, including structured, semi-structured, and unstructured data.
  • Create the technical documentation of solutions utilizing standards, templates, and procedures.
  • Design scalable data pipelines that feed predictive and generative AI models, as well as process monitoring tools.
  • Lead the development and maintenance of enterprise data models and reference architectures, with an emphasis on clean, well-structured data that AI systems can reliably consume.
  • Implement cloud-native data infrastructure (AWS, Azure) including data lakes, feature stores, and model serving layers.
  • Collaborate with data and system owners, data scientists, and AI users to understand data requirements and ensure the architecture supports both current and future AI use cases.
  • Participate in architecture reviews, documenting design decisions and flagging potential risks or gaps.
  • Learn and apply governance and data integrity standards in a GxP environment.
  • Independently manage small project-related assignments, ensuring on-time delivery.

Qualifications

  • Knowledge of data modeling, database design, and data pipeline development.
  • Hands-on experience with, or strong academic/project exposure to, the architecture design and implementation of data pipelines in Azure and AWS. Experience with Databricks is a plus.
  • Experience with AI/ML data concepts such as feature engineering, data versioning, model training pipelines, or vector databases is a strong plus.
  • Understanding of integration patterns and APIs for connecting disparate data sources.
  • Curiosity about how modern data approaches (data mesh, data fabric, lakehouse architectures) can support AI at scale.
  • Strong communication skills and a willingness to collaborate across technical and operational teams.
  • Experience with version control software (SVN, Git, etc.).
  • Quality-focused with strong attention to detail.
  • Staff: 10+ years of relevant experience.
  • Senior Staff: 12+ years of relevant experience.
  • Experience in biotech, pharmaceutical, or other life sciences industries preferred.
  • Experience with cloud platforms (AWS, Azure), workflow orchestration tools (Airflow, Luigi, Prefect, or similar), containerization technologies, and scientific data management systems, as well as experience using GenAI to enhance one's own work.

Education

  • BA/BS in Computer Science, Bioinformatics, or related field.