Principal Data Scientist - AI Context Architect (Semantic & Context Engineering)

Amgen
Remote friendly (United States)
United States
IT

Role Summary

The Principal Data Scientist – AI Context Architect (Semantic & Context Engineering) at Amgen focuses on semantic modeling, context engineering, and AI-first data science. The role uses well-architected context to enable high-performing ML, RL-informed approaches, and generative AI systems.

Responsibilities

  • Semantic architecture & AI-first context modeling: Define enterprise-grade semantic representations for healthcare/life-sciences concepts and specify how relationships and interactions are represented for AI consumption.
  • Create and maintain semantic schemas / ontologies / knowledge-graph models describing entities, attributes, constraints, and linkages—optimized for analytics and AI reasoning.
  • Establish context engineering standards: how data is shaped into prompts, tools, memory, retrieval indices, and structured outputs so models behave consistently across use cases.
  • Feature engineering & model performance: Lead feature engineering strategy tied to model performance, including feature definition, transformations, leakage prevention, stability monitoring, and explainability.
  • Perform exploratory data analysis on complex, high-dimensional datasets to identify predictive signals and context variables that improve model robustness and generalization.
  • Context-aware ML, GenAI, and reinforcement learning–informed approaches: Build and evaluate context-aware ML/GenAI solutions, and integrate semantic layers with retrieval, tools, and structured outputs.
  • Apply reinforcement learning concepts to improve decisioning, ranking, orchestration, and system behavior, without overfitting to short-term metrics.
  • Prototype and benchmark algorithms (classical ML, deep learning, LLM-based reasoning) and advise on scalability and production readiness.
  • Retrieval, knowledge, and governance foundations: Architect retrieval and memory patterns (RAG, vector stores, knowledge graphs, session memory) and define data quality and semantic quality gates that impact downstream model reliability.
  • Cross-functional leadership: Translate domain needs into semantic + AI roadmaps, aligning stakeholders on definitions, metrics, and tradeoffs; mentor and guide teams on context engineering and feature excellence.

Qualifications

  • Basic Qualifications:
    • Doctorate degree and 2 years of experience in Data Science, Computer Science, Statistics, Applied Math, or a related field
    • Or Master’s degree and 4 years of experience in Data Science, Computer Science, Statistics, Applied Math, or a related field
    • Or Bachelor’s degree and 6 years of experience in Data Science, Computer Science, Statistics, Applied Math, or a related field
    • Or Associate’s degree and 10 years of experience in Data Science, Computer Science, Statistics, Applied Math, or a related field
    • Or High school diploma / GED and 12 years of experience in Data Science, Computer Science, Statistics, Applied Math, or a related field
  • Preferred Qualifications:
    • 10–12+ years applying data science in enterprise environments with demonstrated principal-level influence (or equivalent depth of expertise).
    • Deep expertise in semantic modeling: ontologies, taxonomies, entity resolution, knowledge graphs, metadata, and data contracts, all built for operational use.
    • Strong understanding of machine learning fundamentals and performance drivers, especially feature engineering and evaluation rigor.
    • Practical experience implementing RAG / retrieval / vector search / knowledge graph solutions with clear governance patterns.
    • Working knowledge of reinforcement learning concepts and how they apply to ranking, orchestration, personalization, or decision systems (even if not in a “pure RL” production setting).
    • Proficiency in Python (and strong comfort with modern data/ML stacks); ability to collaborate effectively with engineering teams on production concerns.
    • Exceptional stakeholder management: ability to drive alignment on relationships and metrics and to communicate tradeoffs clearly.

Skills

  • Soft Skills: Excellent analytical and troubleshooting skills; strong verbal and written communication; ability to work effectively with global, virtual teams; initiative and self-motivation; ability to manage multiple priorities; team-oriented with a focus on achieving goals; quick learner, organized, detail-oriented; strong presentation and public speaking skills.
  • Certifications: Cloud/AI certifications (AWS/Azure/GCP) are a plus.
  • Good-to-Have Skills: Experience in biotech/pharma and healthcare concepts; familiarity with agentic/tool-using LLM patterns, prompt management, and structured outputs; experience with feature stores, ML observability, and robust evaluation tooling; publications or thought leadership in semantic AI / knowledge systems / enterprise GenAI.

Education

  • Degree requirements as listed under Basic Qualifications (doctorate, master’s, bachelor’s, associate’s, or high school diploma / GED)