Role Summary
The Principal Data Scientist provides expertise in modeling and data analytics to support a range of projects, including predicting disease onset using non-traditional data and approaches. The role applies advanced analytical and computational methods to drive data-centric, evidence-based product development.
Responsibilities
- Provide scientific and technical leadership in machine learning and AI.
- Act as a subject matter expert in machine learning and statistical modeling, and mentor junior colleagues.
- Champion advanced analytics results to non-technical audiences.
- Employ cutting-edge analytical approaches to drive digital, data, and pharmaceutical product development.
- Develop computational and statistical methodologies for advanced analytics.
- Work closely with other disciplines across Sanofi to deliver cutting-edge analysis that answers key business questions.
- Apply a broad array of capabilities spanning machine learning, statistics, mathematics, modeling, simulation, text-mining/NLP, and data-mining.
- Collaborate with internal and external data scientists to scope and execute advanced analytics projects.
- Support the implementation of patient support strategies alongside other Sanofi teams.
Qualifications
- Master's degree in Statistics, Data Science, Mathematics, Physics, Economics, Operations Research, Engineering, or a related quantitative field.
- 7+ years of work experience using analytics to solve product or business problems, coding (e.g., Python, R, SQL), querying databases, or statistical analysis; or a PhD.
- Experience with advanced ML techniques (neural networks/deep learning, reinforcement learning, SVM, PCA, etc.).
- Ability to interact with large-scale data structures (e.g., HDFS, SQL, NoSQL).
- Experience with big data analytics platforms or high-level ML libraries (e.g., H2O, SageMaker, Databricks, Keras, PyTorch, TensorFlow, Theano, DSSTNE).
- Experience with federated analytics (e.g., FlowerAI).
- Ability to prototype analyses and algorithms in high-level languages, using common development tools (e.g., GitHub, containers, Jupyter notebooks).
- Exposure to NLP technologies and analyses.
- Knowledge of data visualization technologies (e.g., ggplot2, Shiny, Plotly, D3, Tableau, Spotfire).
- Excellent command of English (spoken and written).
Skills
- Machine learning & AI leadership
- Statistical modeling
- Data visualization
- Python, R, SQL programming
- Big data platforms and ML libraries
- NLP and text mining
Education
- Master's degree (or PhD) in a quantitative field as listed in Qualifications.