About Baton Health
At Baton Health, we’re rethinking healthcare credentialing from the ground up. Today’s process is slow, expensive, and riddled with manual checks—verifying a clinician’s background often means gathering the same information from hundreds of sources, again and again. We believe there’s a better way.
We’re building the Universal Primary Source: a single, reliable platform that federates data from thousands of primary sources. No more chasing paperwork. Just clean, trustworthy data—fast.
In just a short time, we’ve developed:
- A high-performance ingestion system that pulls data from across the healthcare ecosystem
- A schema normalization engine that reconciles even the messiest formats
- A robust, scalable data model that’s built to handle complexity and return results in milliseconds
Our stack is modern by design—Python, dbt, Snowflake, Prefect, Postgres—because the right tools matter. We’re moving quickly, questioning assumptions, and bringing credentialing into the 21st century.
If you’re excited by big, meaningful challenges and want to build systems that actually improve healthcare, we’d love to talk.
Why This Role Matters
Credentialing is one of the most manual, fragmented processes in healthcare—and we’re building the infrastructure to fix it. As a Senior Data Engineer, your work will directly influence our ability to deliver trustworthy, real-time insights to customers, while also pushing forward innovations in healthcare data automation.
What You’ll Do
- Design, build, and maintain scalable, observable data pipelines that power our core product and enable real-time, automated credentialing.
- Improve data quality and consistency by refining entity resolution systems, record deduplication, and reconciliation across a high volume of disparate data sources.
- Develop modular, testable data models in a Snowflake + dbt environment, ensuring transparency, traceability, and auditability at every step.
- Create and manage orchestration workflows (e.g., Prefect) to coordinate ingestion, transformation, and downstream processing.
- Collaborate with Product and Engineering teams to bring data-driven features to life—supporting customer workflows and unlocking new business opportunities.
- Contribute to data enrichment and anomaly detection efforts, including opportunities to apply AI/ML techniques where appropriate.
- Ensure strong data governance practices, including maintaining source fidelity, audit trails, and chain of custody for sensitive data.
- Support external integrations through tools like Snowflake Marketplace, Fivetran, and custom APIs.
What You’ll Need
- 5+ years as a Data Engineer focused on data ingestion, extraction, modeling, and transformation using modern data stacks.
- Deep experience with dbt, Snowflake (or a similar cloud warehouse), and Fivetran or an equivalent ETL tool.
- Familiarity with reverse ETL tools (e.g., Hightouch) is a plus.
- Proficiency with orchestration tools such as Prefect.
- Experience working alongside data science or AI teams is a plus.
Skills
- Manages Complexity: Can analyze and make sense of large, diverse, and sometimes contradictory data sets to solve tough problems.
- Drives Results: Persists through challenges, consistently delivers outcomes, and exceeds goals.
- Communicates & Collaborates: Adjusts communication style for different audiences, encourages open dialogue, and promotes team alignment.
The pay range for this role is:
130,000 - 180,000 USD per year (Remote; New York, US)