Role Summary
The Principal Data Scientist provides expertise in modeling and data analytics to support various projects, including predicting disease onset using non-traditional data and approaches. The role involves employing advanced analytical and computational methods to drive data-centric, evidence-based product development.
Responsibilities
- Provide scientific and technical leadership in machine learning and AI.
- Act as a subject matter expert in machine learning and statistical modeling, and mentor junior colleagues.
- Champion advanced analytics results to non-technical audiences.
- Employ cutting-edge analytical approaches to drive digital, data, and pharmaceutical product development.
- Develop computational and statistical methodologies for advanced analytics.
- Work closely with other disciplines across the organization to deliver cutting-edge analyses that answer key business questions.
- Apply capabilities spanning machine learning, statistics, mathematics, modeling, simulation, text-mining/NLP, and data-mining.
- Collaborate with internal and external data scientists to scope and execute advanced analytics projects.
- Support the implementation of patient support strategies alongside other teams.
Qualifications
- Required: Master's degree in Statistics, Data Science, Mathematics, Physics, Economics, Operations Research, Engineering, or a related quantitative field.
- Required: 7+ years of work experience in using analytics to solve product or business problems, coding (e.g., Python, R, SQL), querying databases, or statistical analysis; alternatively, a PhD.
- Required: Experience with advanced ML techniques (neural networks/deep learning, reinforcement learning, SVM, PCA, etc.).
- Required: Ability to interact with large-scale data structures (e.g., HDFS, SQL, NoSQL).
- Required: Experience with big data analytics platforms or high-level ML libraries (e.g., H2O, SageMaker, Databricks, Keras, PyTorch, TensorFlow, Theano, DSSTNE).
- Required: Experience with federated analytics (e.g., FlowerAI).
- Required: Ability to prototype analyses and algorithms in high-level languages, using tools such as GitHub, containers, and Jupyter notebooks.
- Required: Exposure to NLP technologies and analyses.
- Required: Knowledge of data visualization technologies (e.g., ggplot2, Shiny, Plotly, D3, Tableau, Spotfire).
- Required: Excellent command of the English language (spoken and written).
- Preferred: 7 years of work experience using analytics to solve product or business problems.
- Preferred: Experience with biomedical data types, population health data, real-world data, or novel data streams relevant to the pharmaceutical industry.
Skills
- Machine learning and AI leadership
- Statistical modeling and data analytics
- NLP and text mining
- Data visualization and communication to non-technical audiences
- Programming in Python, R, SQL; experience with Jupyter notebooks
- Experience with distributed data systems and big data tools
- Familiarity with federated analytics and modern ML libraries
Education
- Master's degree in a quantitative field (PhD also considered)