Role Summary
The Director, Head of Research Data Integration & Analytics leads the strategic development and implementation of robust, FAIR-aligned data and analytics systems, together with advanced AI/ML and predictive modelling solutions, across VIDRU. This role conceives and delivers the digital roadmap for the entire research lifecycle, transforming raw experimental outputs into standardized, analysis-ready data assets and embedding cutting-edge AI methodologies into daily operations by refining high-impact use cases. It establishes and maintains world-class data standards, governance, and quality to accelerate infectious disease research and vaccine development. The role also leads a team across multiple sites, including Cambridge, MA; Rixensart, Belgium; Siena, Italy; and Upper Providence.
Responsibilities
- Provides strategic vision and leadership for research data integration and advanced analytics initiatives within VIDRU Data Sciences, with a focus on accelerating infectious disease research through FAIR data and AI/ML, and on transforming raw experimental outputs into analysis-ready data.
- Leads and manages a multidisciplinary team of data scientists, data architects, and scientific software/research engineers, fostering a culture of high performance, scientific innovation, and continuous professional development.
- Directs the design, development, and implementation of robust, scalable integrated data systems and automated, product-grade data processing and integration pipelines to consolidate and harmonize diverse bio-clinical datasets, including multi-omics, preclinical, translational, and early clinical data.
- Establishes and enforces world-class data standards, quality control processes, and governance frameworks based on FAIR principles to ensure data integrity, reliability, and reusability across VIDRU research initiatives, in collaboration with Research Technologies.
- Drives the development and application of advanced analytical methodologies, including deep learning, biomedical computer vision, and predictive modelling, to extract deep biological and clinical insights from integrated datasets, while promoting collaboration and aligning with technology providers on emerging innovations.
- Collaborates with DPLs, VDLs, PILs, TPLs, and clinical sciences teams, as well as lab scientists within Discovery Technologies, to understand data needs and deliver integrated datasets and scalable analytical workflows.
- Partners with experimental scientists to optimize VIDRU data flows, ensuring high-quality data generation aligned with FAIR principles from the outset of experiments.
- Drives innovation in research data integration and predictive analytics by partnering with AIML, Research Tech, and R&D Tech organizations, scaling successful research pipelines into reusable assets using product-grade software development practices.
- Ensures data and analytical deliverables meet high standards of scientific excellence, quality, security, and timeliness, translating complex data into reproducible, reliable, and actionable insights.
- Communicates complex data landscapes, integration strategies, and analytical findings to internal and external stakeholders, bridging biology, data science, and IT engineering, and mentors scientists in using LLMs and other digital tools.
- Contributes to the VIDRU Data Science strategy and objectives, aligning with the Head of VIDRU Data Sciences and the overall R&D strategy, while maintaining digital fluency with RTech and the R&D Digital Network.
Qualifications
- Required: PhD or equivalent in Data Science, Computer Science, Bioinformatics, Computational Biology, Statistics, Engineering, or related field, with a strong focus on data systems and advanced analytics.
- Required: Eight to ten years of relevant scientific experience, including four years of direct/matrix people management and international leadership responsibilities.
- Required: Proven ability to translate theoretical knowledge into solutions for practical R&D problems and to lead cross-functional teams while serving as a global reference in the function.
- Preferred: Demonstrated proficiency in designing and implementing robust, product-grade data systems and integration pipelines for diverse bio-clinical datasets, including on cloud platforms (e.g., GCP, Azure); enforcing FAIR data governance and data quality standards; applying advanced AI/ML techniques to clinical and molecular data; managing multi-omics data (genomics, transcriptomics, proteomics), including file formats such as FASTQ, BAM, and VCF; and using version control and automated testing for scientific software.
- Preferred: Strong programming, data analytics, and modelling skills; excellent line management and business understanding of the pharmaceutical industry; knowledge of molecular design vendors and solutions.
Education
- PhD or equivalent experience in Data Science, Computer Science, Bioinformatics, Computational Biology, Statistics, Engineering, or related field.