Principal Scientist, Data Science – R&D DSDH - Therapeutics Development & Supply (TDS)

Johnson & Johnson
5 months ago
Remote friendly (Malvern, PA)
United States
IT
Position Summary
- Design, build, and optimize data capture, processing, and storage solutions enabling advanced analytics, digital process transformation, and AI/ML applications across the development-to-supply continuum for Therapeutics Development & Supply (TDS).
- Hands-on technical contribution across Process Development, Manufacturing, Supply Chain, Quality, and Digital/Data Science teams to deliver AI-ready data pipelines and data products.

Key Responsibilities
- Data Engineering & Pipeline Development
  - Design, build, and maintain scalable data pipelines to acquire, integrate, and manage TDS data from diverse sources/systems (e.g., lab systems, MES, clinical supply, quality systems, external partners).
  - Create and optimize data flows for structured and unstructured data using Python, R, SQL, cloud services, and modern engineering tools.
  - Develop and maintain TDS-specific data repositories with enterprise-level data models.
  - Ensure AI/ML readiness via structured, versioned, traceable data semantically aligned to enterprise standards.
- Data Product & Architecture Partnership
  - Translate business needs into data products and engineering requirements in partnership with data scientists, domain experts, and digital technology teams.
  - Partner with ontology/knowledge graph teams to implement semantic models and future-proof architectures.
- Quality, Compliance & Performance
  - Implement data quality/performance standards; define KPIs for accuracy, completeness, and consistency.
  - Apply data versioning and lineage tracking for compliance, traceability, and audit readiness.
  - Follow software development best practices (code versioning, DevOps, documentation).
- Cross-Functional Collaboration
  - Engage stakeholders to understand requirements, design solutions, and drive adoption.
  - Manage multiple concurrent projects and deliver maximum business value.

Qualifications
Required
- Degree in Engineering, Data Science, Life Sciences, Computer Science, or a related field (advanced degree preferred).
- 3+ years of data engineering experience (data modeling and database design), preferably in scientific, manufacturing, or healthcare environments.
- Proficiency with Python, R, SQL, and cloud-based architectures (e.g., AWS, Snowflake, Redshift).
- Experience with NoSQL and graph databases.
- Strong analytical/problem-solving and stakeholder-management skills; ability to translate discussions into actionable requirements.
- Ability to drive multiple projects with strong organizational skills and adaptability.

Preferred
- Experience in regulated or standards-driven data environments (e.g., CDISC, HL7, FHIR, OMOP, DICOM, manufacturing/quality standards).
- Familiarity with high-dimensional data (e.g., imaging, sensor data).
- Experience with data engineering principles that connect to or feed MLOps and model deployment workflows.
- Knowledge of manufacturing systems (MES), laboratory information systems, or industrial data systems.
- Exposure to knowledge graph or ontology-driven architectures.

Required Skills
- Advanced Analytics, Data Analysis, Data Quality, Data Reporting, Data Science, Data Visualization, Digital Fluency, Critical Thinking, Strategic Thinking, Technical Credibility, Process Improvements, Workflow Analysis, Organizing, Coaching, Data Privacy Standards, Econometric Models, Data Savvy.

Preferred Skills
- (Same list as “Required Skills” as provided in the posting.)

Application Instructions
- Candidates interested in US-based locations, please apply to: R-069212