Role Summary
Sr. Director, Data Science, Patient Identification: Lead a data science function focused on identifying undiagnosed rare-disease patients and targeting healthcare providers. Develop AI/ML and statistical approaches using real-world data to detect disease patterns, shape data strategies, and drive decisions across the rare disease portfolio. This role involves strategic leadership, cross-functional collaboration, and delivering data-driven patient outcomes. Location: 3 days per week in the San Francisco/Palo Alto area.
Responsibilities
- Spearhead a high-performing data science function focused on patient identification and provider targeting
- Identify, source, and integrate data assets to find rare-disease patients and their treating HCPs; define vision, priorities, and success metrics across multiple programs
- Architect scalable analytical solutions using real-world data (claims, EHR, genomics, lab data, imaging, registries)
- Define the roadmap for AI/ML innovation with production-grade reliability
- Foster a collaborative, mission-driven culture that delivers enterprise-wide data and data science impact
- Design predictive models and patient-finding tools using real-world data; apply NLP and LLM techniques to unstructured EMR data
- Pioneer methodologies in AI/ML for patient identification and run experiments to compare approaches
- Build frameworks for model monitoring, retraining, and evaluation in real-world deployments
- Deploy supervised and unsupervised models for patient finding, diagnostic acceleration, and disease progression; translate insights into actionable field strategies
- Develop robust data pipelines, governance, and scalable model-serving infrastructure
- Evaluate and integrate third-party data to enhance model accuracy and reach
- Collaborate with external vendors and internal teams to operationalize analytics across the portfolio
- Promote reproducibility, version control, and MLOps best practices
- Partner with Commercial, Medical Affairs, and Computational Genomics to integrate insights into decision-making
- Engage with key opinion leaders and data partners to identify early signals for models
- Establish program KPIs, dashboards, and reporting to track performance and improve model accuracy
- Ensure HIPAA, privacy, and ethical data governance compliance
- Manage external vendors and partnerships to expand analytics capabilities
Qualifications
- Required: 10+ years of experience in data science or analytics within biotech/pharma; 3+ years in a leadership role
- Required: Expertise in real-world data analytics, patient identification, and segmentation across multiple therapeutic areas; experience with large-scale real-world data (claims, EMR/EHR, genomics, registries, or wearables)
- Required: Experience developing and deploying sophisticated ML/statistical models using large-scale health data; strong Python, R, SQL, TensorFlow, and PyTorch skills; knowledge of feature engineering, model explainability, and ML pipeline automation
- Required: Proven success translating analytics into actionable strategies that drive measurable patient or business outcomes
- Required: Bachelor's degree in data science, computer science, statistics, or related quantitative field
- Required: Experience in rare disease analytics or patient-finding programs supporting commercial launches or diagnostic initiatives
- Preferred: Advanced degree (PhD, MS, MPH) in data science, biostatistics, computer science, or related field
- Preferred: Familiarity with generative AI, LLMs, or graph-based learning in healthcare/biomedical data
Skills
- Real-world data analytics
- ML/Statistical modeling at scale (Python, R, SQL, TensorFlow, PyTorch)
- NLP and LLM techniques for unstructured clinical data
- Model monitoring, retraining, and MLOps
- Feature engineering, model explainability, and ML pipeline automation
- Data governance, data pipelines, and integration of heterogeneous data sources
Education
- Bachelor's degree in data science, computer science, statistics, or related quantitative field
- Preferred: PhD, MS, or MPH in relevant field