Role Summary
Principal Data Scientist – AI Context Architect (Semantic & Context Engineering) at Amgen. Focuses on semantic modeling, context engineering, and AI-first data science to enable high-performing ML, RL-informed approaches, and generative AI systems through well-architected context.
Responsibilities
- Semantic architecture & AI-first context modeling: Define enterprise-grade semantic representations for healthcare/life-sciences concepts and specify how relationships and interactions are represented for AI consumption.
- Create and maintain semantic schemas / ontologies / knowledge-graph models describing entities, attributes, constraints, and linkages—optimized for analytics and AI reasoning.
- Establish context engineering standards: how data is shaped into prompts, tools, memory, retrieval indices, and structured outputs so models behave consistently across use cases.
- Feature engineering & model performance: Lead feature engineering strategy tied to model performance, including feature definition, transformations, leakage prevention, stability monitoring, and explainability.
- Perform exploratory data analysis on complex, high-dimensional datasets to identify predictive signals and context variables that improve model robustness and generalization.
- Context-aware ML, GenAI, and reinforcement learning–informed approaches: Build and evaluate context-aware ML/GenAI solutions, and integrate semantic layers with retrieval, tools, and structured outputs.
- Apply reinforcement learning concepts to improve decisioning, ranking, orchestration, and system behavior, without overfitting to short-term metrics.
- Prototype and benchmark algorithms (classical ML, deep learning, LLM-based reasoning) and advise on scalability and production readiness.
- Retrieval, knowledge, and governance foundations: Architect retrieval and memory patterns (RAG, vector stores, knowledge graphs, session memory) and define data quality and semantic quality gates that impact downstream model reliability.
- Cross-functional leadership: Translate domain needs into semantic and AI roadmaps, aligning stakeholders on definitions, metrics, and tradeoffs; mentor and guide teams on context engineering and feature excellence.
Qualifications
- Basic Qualifications:
- Doctorate degree and 2 years of experience in Data Science, Computer Science, Statistics, Applied Math, or a related field
- Or Master’s degree and 4 years of experience in Data Science, Computer Science, Statistics, Applied Math, or a related field
- Or Bachelor’s degree and 6 years of experience in Data Science, Computer Science, Statistics, Applied Math, or a related field
- Or Associate’s degree and 10 years of experience in Data Science, Computer Science, Statistics, Applied Math, or a related field
- Or High school diploma / GED and 12 years of experience in Data Science, Computer Science, Statistics, Applied Math, or a related field
- Preferred Qualifications:
- 10–12+ years applying data science in enterprise environments with demonstrated principal-level influence (or equivalent depth of expertise).
- Deep expertise in semantic modeling: ontologies, taxonomies, entity resolution, knowledge graphs, metadata and data contracts—built for operational use.
- Strong understanding of machine learning fundamentals and performance drivers, especially feature engineering and evaluation rigor.
- Practical experience implementing RAG / retrieval / vector search / knowledge graph solutions with clear governance patterns.
- Working knowledge of reinforcement learning concepts and how they apply to ranking, orchestration, personalization, or decision systems (even without “pure RL” production experience).
- Proficiency in Python (and strong comfort with modern data/ML stacks); ability to collaborate effectively with engineering teams on production concerns.
- Exceptional stakeholder management: can drive alignment on relationships and metrics, and communicate tradeoffs clearly.
Skills
- Soft Skills: Excellent analytical and troubleshooting skills; strong verbal and written communication; ability to work effectively with global, virtual teams; initiative and self-motivation; ability to manage multiple priorities; team-oriented with a focus on achieving goals; quick learner, organized, detail-oriented; strong presentation and public speaking skills.
- Certifications: Cloud/AI certifications (AWS/Azure/GCP) are a plus.
- Good-to-Have Skills: Experience in biotech/pharma and healthcare concepts; familiarity with agentic/tool-using LLM patterns, prompt management, and structured outputs; experience with feature stores, ML observability, and robust evaluation tooling; publications or thought leadership in semantic AI / knowledge systems / enterprise GenAI.
Education
- Degree requirements as listed under Basic Qualifications (doctorate, master’s, bachelor’s, associate’s, or high school diploma / GED).