Senior Principal Software Engineer - R&D Tech

GSK
Remote friendly (Collegeville, PA)
United States
IT

Role Summary

This role calls for a Senior Principal Software Engineer with broad expertise across software development, data engineering, cloud architecture, and AI/ML technologies. It is a hands-on technical role in which you'll spend the majority of your time writing code, building data pipelines, architecting cloud-native solutions, and integrating AI/ML capabilities into production applications. You'll be a versatile engineer who can work across the full stack, understand data flows, use cloud services effectively, and apply AI/ML techniques to solve real-world problems.

Responsibilities

  • Write production-grade code for full-stack applications using Python and modern frontend frameworks
  • Build and maintain scalable REST APIs and microservices architectures
  • Design application architectures and implement technical solutions
  • Develop user interfaces and data visualization components
  • Write comprehensive tests and ensure code quality
  • Debug and optimize application performance
  • Design and architect cloud-native applications and solutions on Azure
  • Leverage Azure services including App Services, Azure Functions, AKS, Storage, Data Factory, and Cosmos DB
  • Implement scalable, resilient, and cost-effective cloud architectures
  • Optimize cloud resource utilization and performance
  • Design for high availability, disaster recovery, and security
  • Implement cloud security best practices and governance
  • Build and maintain data pipelines for large-scale data processing
  • Implement ETL/ELT processes for diverse data sources
  • Optimize data workflows and processing performance
  • Design and implement data models and schemas
  • Work with structured and unstructured data at scale
  • Integrate AI/ML models and APIs into production applications
  • Build GenAI applications using LLMs and frameworks like LangChain
  • Implement RAG (Retrieval Augmented Generation) architectures
  • Work with vector databases for semantic search capabilities
  • Apply prompt engineering techniques for optimal LLM performance
  • Understand and implement basic NLP tasks (text classification, entity extraction, embeddings)
  • Collaborate with data scientists to productionize ML models
  • Evaluate and integrate new AI/ML technologies
  • Write SQL queries for data analysis and application needs
  • Design and optimize database schemas for both relational and NoSQL databases
  • Tune query performance and implement indexing strategies
  • Implement data access patterns and ORM frameworks
  • Implement Infrastructure as Code and CI/CD pipelines
  • Containerize applications and orchestrate deployments with Docker and Kubernetes
  • Implement monitoring, logging, and alerting solutions
  • Automate deployment and operational processes
  • Ensure application scalability and reliability
  • Work closely with data scientists, engineers, and product owners across R&D
  • Participate in code reviews and knowledge sharing
  • Contribute to technical discussions and solution designs
  • Identify innovations and architect solutions
  • Evaluate and integrate new technologies

Qualifications

  • Required: Bachelor's degree in Computer Science or equivalent relevant industry experience
  • Required: Significant hands-on software development experience with demonstrated progression in technical complexity
  • Required: Expert-level Python programming with extensive production application development experience
  • Required: Strong full-stack development experience with modern frameworks:
    • Backend: Python (FastAPI, Flask, Django)
    • Frontend: React, Next.js, TypeScript, or similar modern frameworks
  • Required: Cloud services experience, preferably Azure (App Services, Functions, Storage, or equivalent cloud services)
  • Required: Strong SQL skills: Writing complex queries, data modeling, and optimization
  • Required: Data engineering fundamentals: Building data pipelines and working with large datasets
  • Required: Understanding of AI/ML concepts and practical experience:
    • Familiarity with LLMs and GenAI applications
    • Basic understanding of how to integrate AI/ML APIs into applications
    • Knowledge of prompt engineering basics
    • Understanding of RAG architectures or willingness to learn quickly
  • Required: Experience building production-grade applications: Scalable, maintainable, well-tested code
  • Required: Understanding of software architecture: Design patterns, microservices, distributed systems, cloud-native architectures
  • Required: Version control with Git and collaborative development workflows
  • Required: DevOps practices: CI/CD pipelines, containerization basics
  • Required: Agile development practices and iterative development
  • Required: Excellent problem-solving and debugging skills
  • Required: Strong communication and collaboration skills
  • Required: Ability to quickly learn and adapt to new technologies
  • Highly desired: Azure cloud platform expertise: Deep knowledge of Azure services (App Services, Azure Functions, AKS, Storage Accounts, Azure Data Factory, Cosmos DB, Azure SQL, Key Vault, Application Insights)
  • Highly desired: Cloud architecture and design: Designing scalable, secure, and cost-effective cloud solutions
  • Highly desired: Databricks and Apache Spark for large-scale data processing
  • Highly desired: Hands-on experience with GenAI platforms: OpenAI, Azure OpenAI, LangChain, or similar frameworks
  • Highly desired: Experience building RAG applications with chunking, vectorization, and retrieval strategies
  • Highly desired: Vector databases: pgvector, Pinecone, Weaviate, or similar
  • Highly desired: DevOps maturity: Infrastructure as Code (Terraform, Bicep, ARM templates), advanced CI/CD
  • Highly desired: Containerization and orchestration: Docker and Kubernetes (AKS)
  • Highly desired: Database expertise: PostgreSQL, SQL Server, Azure SQL with performance tuning
  • Highly desired: Cloud security: Identity management, RBAC, network security, encryption
  • Highly desired: Azure DevOps or GitHub Actions for CI/CD pipelines
  • Highly desired: Experience with REST API design and microservices patterns
  • Preferred: Azure certifications (Azure Solutions Architect, Azure Developer, Azure Data Engineer)
  • Preferred: Advanced AI/ML knowledge:
    • Experience with ML frameworks (TensorFlow, PyTorch, Hugging Face)
    • Understanding of model training and evaluation
    • Knowledge of NLP techniques beyond basic text processing
    • Experience with multi-agent systems or advanced RAG patterns
  • Preferred: MLOps knowledge: Model deployment, versioning, monitoring, A/B testing
  • Preferred: Azure AI services: Document Intelligence, Cognitive Search, Azure AI Studio, Azure Machine Learning
  • Preferred: Search technologies: Azure Search, Sinequa, Elasticsearch, Lucene-based systems
  • Preferred: Advanced Spark optimization and performance tuning
  • Preferred: Real-time data processing and streaming architectures (Kafka, Azure Event Hubs)
  • Preferred: Pharmaceutical, healthcare, or regulated industry experience
  • Preferred: Experience with compliance requirements: HIPAA, GxP, 21 CFR Part 11
  • Preferred: Experience with data visualization libraries (D3.js, Plotly, Chart.js)
  • Preferred: Software security best practices and secure coding
  • Preferred: FinOps practices: Cloud cost optimization and management
  • Preferred: Experience mentoring junior engineers

Education

  • Bachelor's degree in Computer Science or equivalent relevant industry experience