Lead Data Engineer

Established in 2015, Create Music Group is a leading music and entertainment company. The company operates as a record label, distribution company, and entertainment network that generates over 15 billion music streams each month on DSPs. Named #2 on the Inc. 5000 list of the fastest-growing private companies in America in 2020, the company has grown rapidly by leveraging its owned IP with its media and technology platform. The company works with superstar artists, major and independent record labels, and global media brands. It operates a number of companies, including Label Engine, one of the largest independent music distribution platforms in the world, with over 75,000 artists and 5,000 label clients, and Flighthouse, a digital entertainment brand focused on Gen Z, which has more than 300 million followers across social media. Create Music Group is based in Hollywood, CA and has 400 employees worldwide.


Job Summary

The Lead Data Engineer will play a central role in the buildout of CMG's next-generation data platform. This is a high-ownership role on a small, senior team, working directly with the SVP of Data & AI to design and implement a scalable lakehouse architecture on Google Cloud Storage (GCS) and Databricks, spanning bronze, silver, and gold layers. The role emphasizes domain-driven design, data contracts, and proactive communication with both internal stakeholders and external vendors.


Responsibilities

  • Lead the technical design and implementation of CMG's Medallion 2.0 lakehouse architecture — bronze ingestion, silver transformation, and gold domain layers — built on GCS and Databricks (Delta Lake), with clear data contracts at each boundary
  • Design and manage data pipelines using Astro (Airflow), PySpark, and Delta Live Tables, ensuring reliability and scalability across ingestion and transformation layers
  • Govern the lakehouse using Databricks Unity Catalog — managing access controls, data lineage, and schema enforcement across domains
  • Apply domain-driven design principles to partition and model data domains (e.g., royalty, asset, artist, distribution)
  • Collaborate with the analytics team to ensure the gold layer reflects real business needs, reducing the need for downstream workarounds
  • Coordinate with external vendors (e.g., DataArt) and internal stakeholders across DevOps, product, and analytics
  • Proactively identify architectural risks, data quality issues, and dependency blockers with proposed resolutions
  • Maintain clear, impact-first documentation and status updates for both technical and non-technical stakeholders
  • Other duties as assigned

Qualifications 

  • 4+ years of data engineering experience, with at least 1–2 years focused on data platform or lakehouse architecture
  • Hands-on experience with Databricks — including Delta Lake, PySpark, and ideally Unity Catalog
  • Experience with GCS or equivalent cloud object storage as a lakehouse foundation layer
  • Hands-on experience with domain-driven design applied to data modeling
  • Strong command of SQL and at least one transformation framework (dbt preferred)
  • Experience with medallion or lakehouse architectures (bronze/silver/gold or equivalent)
  • Familiarity with GCP-native tooling (Pub/Sub, Dataflow, or Dataplex) is a plus
  • Excellent written communication — able to write design docs non-engineers can understand and status updates executives can act on
  • Demonstrated ability to work independently in ambiguous environments
  • Track record of flagging risks early with proposed solutions

Nice to have: Experience in music/media/entertainment data; familiarity with data contracts or schema validation (Unity Catalog, Great Expectations, dbt tests); experience with external dev vendors


Pay Scale

  • $120,000 - $150,000 CAD per year
  • The final compensation within this range will be determined based on the candidate’s experience, skills, and overall fit for the role.

Data & AI

Canada
