Data Scientist - Deep Learning (Hybrid)

Caris Life Sciences
Remote friendly (Irving, TX)
United States
Clinical Research and Development

Role Summary

Caris Life Sciences is seeking a creative, driven, and technically strong Data Scientist - Deep Learning to join our Computational Pathology team. This role focuses on developing large-scale, generalizable machine learning models that learn rich representations from complex, high-dimensional data to support translational research and biomarker discovery. The successful candidate will play a central role in shaping Caris’s next-generation AI capabilities by designing scalable training pipelines, advancing representation learning approaches, and collaborating closely with scientific and clinical experts. This position is ideal for individuals with a strong background in deep learning, transformer-based architectures, and computational pathology who are excited about building foundation-level modeling frameworks rather than task-specific solutions.

Responsibilities

  • Design, train, and evaluate foundation-style machine learning models that learn robust and reusable representations from large-scale datasets.
  • Develop and maintain scalable model training infrastructure using PyTorch and distributed training paradigms (e.g., multi-GPU and multi-node setups).
  • Train and adapt transformer-based architectures for representation learning across diverse data sources.
  • Apply self-supervised, weakly supervised, and representation learning techniques to leverage partially labeled or unlabeled data.
  • Build flexible modeling frameworks capable of integrating multiple data sources and heterogeneous signals.
  • Collaborate with pathologists, scientists, and engineers to ensure models are biologically meaningful and aligned with translational research goals.
  • Process, curate, and analyze large, complex datasets using efficient and reproducible workflows.
  • Support exploratory analyses, downstream modeling, and internal research initiatives using learned representations.
  • Contribute to internal technical documentation, research outputs, and long-term modeling strategy.
  • Follow best practices in software engineering, experiment tracking, and collaborative model development.

Qualifications

  • Required: PhD in Computer Science, Data Science, Computational Biology, Bioinformatics, Engineering, Mathematics, or a related quantitative field with exposure to biological or medical data.
  • Required: 0–4 years of experience applying machine learning or deep learning in research or industry settings (postdoctoral experience acceptable).
  • Required: Strong understanding of deep learning model training, optimization, and evaluation.
  • Required: Hands-on experience with transformer-based models, including both language-focused and vision-focused architectures.
  • Required: Proficiency in Python and PyTorch.
  • Required: Hands-on experience with distributed training (e.g., PyTorch DDP, multi-GPU or multi-node workflows).
  • Required: Experience working in Linux environments and using Git for version control.
  • Required: Ability to work with large datasets and complex data pipelines.
  • Required: Strong written and verbal communication skills.
  • Preferred: Background in computational pathology or experience working with large-scale imaging data.
  • Preferred: Experience training large representation models or foundation models.
  • Preferred: Familiarity with self-supervised and representation learning techniques, such as contrastive learning, DINO-style approaches, or related methods.
  • Preferred: Experience working with multiple data sources in unified modeling frameworks.
  • Preferred: Experience with cloud-based machine learning environments (e.g., AWS, SageMaker), including distributed training workflows.
  • Preferred: Strong engineering mindset with attention to reproducibility, scalability, and model robustness.
  • Preferred: Background in biomedical, translational, or applied research environments.

Additional Requirements

  • Physical Demands: This position requires extended periods of computer-based work, along with collaboration with subject matter experts and business partners in person or via remote conferencing.
  • Travel: Periodic travel may be required.