Software Engineer - Data Engineer

BenchSci
1 month ago
Full-time
On-site
Toronto, Ontario
IT

You Will:

  • Collaborate with Machine Learning engineers, Full-stack engineers, and the Science team to solve complex document mining challenges, helping us capture and model additional scientific experiments
  • Scale data pipelines to allow our data to go from research to platform quickly and reliably
  • Work with sources that contain both semi-structured and unstructured data
  • Use your experience to help define and apply best practices for a broad platform of technologies in a cloud-based environment
  • Architect and maintain robust data pipelines that ingest diverse sources and utilize LLMs for high-fidelity entity extraction into structured formats
  • Implement evaluation frameworks to monitor the accuracy, drift, and hallucination rates of extraction models within the production pipeline
  • Lead or consult on the authoring of engineering design proposals following the unified Platform Stream roadmap at BenchSci
  • Leverage a deep understanding of the business context and the team’s goals to make independent technical decisions in the face of open-ended requirements
  • Proactively identify new opportunities (from both internal and external sources) and advocate for and implement improvements to the current state of projects
  • Respond to operational issues with urgency, drive that urgency within your own team, and own resolution within your sphere of responsibility
  • Challenge the status quo and propose newer technologies or ways of working

You Have:

  • A degree in Computer Science/Engineering or a related field within science
  • 3+ years of experience working as a software developer in industry
  • Proficiency with Python
  • Proficiency with SQL
  • Experience using LLMs for structured data extraction
  • Experience with event-driven architecture with Pub/Sub
  • A track record of building high-quality, maintainable code
  • Experience with cloud computing (for example: GCP, Azure, AWS)

Nice To Have:

  • ML/Data science exposure
  • Experience with Auth0 and Terraform
  • Experience with data warehouse solutions like BigQuery, and databases including AlloyDB and Spanner
  • Experience with agent-driven development and AI-based tools like Cursor or Claude Code
  • Experience building Conversational AI solutions