Sleuth Insights

Staff Data/ML Engineer

About us

Sleuth is building a modern, agentic, and intelligent decision-making platform for the biopharma and life sciences industry. We’re using AI to automate workflows and deliver crucial insights and bespoke reports that answer our users’ critical questions about their investments.

You should join us, because:

  • Traction: we’ve generated outsized demand for a startup of our size, and have already signed deals with some of the world’s leading biopharma companies.
  • Talent density: we’re growing thoughtfully and only work with incredibly smart, driven people.
  • Velocity: to meet the demand we’ve generated, we ship fast. You’ll learn a lot and constantly take on new challenges.
  • Frontier technology & product: we’re developing cutting-edge AI systems. You’ll truly be building the future of how biopharma and life sciences companies generate insights to power their business.

About the role

We are looking for an experienced Data and ML Engineer to lead the development and operation of our data platform and machine learning infrastructure. In this role, you’ll work with data scientists and engineers on the team to develop and operate robust data pipelines, manage and optimize data stores (relational, vector, and graph databases), and develop the infrastructure and tooling that supports model training, fine-tuning, and deployment. This is a unique opportunity to go both broad and deep across data engineering and ML infrastructure to enable a modern GenAI system that serves the biopharma and life sciences industry.

What you'll do

  • Work with other members of the team to design, develop, and operate data and AI solutions as part of our products, with a particular focus on scaling the most comprehensive and accurate biopharma intelligence knowledge base.
  • Fine-tune and/or evaluate models for accuracy, safety, and relevance.
  • Apply techniques like few-shot learning, prompt chaining, and retrieval-augmented generation (RAG) to enhance knowledge retrieval.
  • Leverage data integration tools, or develop custom ones if needed, to connect to public and private data repositories.
  • Design, develop, and operate a scalable data platform that ingests, processes, and serves a large amount of structured and unstructured public and private data.
  • Develop and deploy a user feedback collection and data annotation mechanism.
  • Develop and deploy CI/CD for model and data pipelines, continuous testing, observability, and monitoring systems to guarantee the integrity and resilience of our data platform.
  • Establish data governance and related controls to ensure the confidentiality, integrity, and availability of our data.
  • Set data/ML engineering best practices.

What we're looking for

  • Experience: 10+ years of data and AI/ML engineering experience in the industry.
  • Infrastructure: experience with cloud infrastructure and tools (AWS and/or GCP).
  • Stack: expert-level proficiency in Python and Spark.
  • Data stores: experience with relational databases (specifically PostgreSQL), graph databases (e.g., Neo4j, Amazon Neptune), and vector databases (e.g., Pinecone, Chroma).
  • AI/ML: understanding of modern AI applications and experience in fine-tuning and evaluating AI models and model-based solutions with techniques like RAG, prompt chaining, or agentic workflows using frameworks and protocols such as LangChain, LangGraph, LangSmith, AutoGen, MCP, etc.
  • DataOps: experience in data platform architecture, data pipelines, and commercial and open source ETL/ELT tools.
  • MLOps: experience in model deployment and serving, CI/CD pipelines, containerization, workflow orchestration (e.g., Airflow, Nextflow), and AI/ML platforms (e.g., Vertex AI).
  • Compliance: familiarity with SOC 2 or regulated software development environments.
  • Education: BS, MS, or PhD in Computer Science, Engineering, Math, or related scientific field. Additional hands-on certificates are great to have.
  • Domain Knowledge: familiarity with biopharma, biotech, or life sciences environments is a plus.

What we offer

  • Competitive compensation, healthcare benefits with generous employer contribution, and flexible hybrid/remote work setup.
  • Hands-on experience at the frontier of AI, building agentic systems that don’t just support but transform how insights are generated and applied.
  • An opportunity to directly partner with leading biopharma companies and see your work shape how this industry makes billion-dollar decisions using our software.

The pay range for this role is:

180,000 - 205,000 USD per year (Remote (San Francisco, California, US))

Engineering

Remote (San Francisco, California, US)

Hybrid (Los Angeles, California, US)

Remote (Seattle, Washington, US)
