Lead Architect - Data and AI Engineering (Location: INDIA, Hybrid)

About Aubrant 


Aubrant Digital is a leader in multi-shore custom application development. We are passionate about solving our clients' business problems through consultative teamwork, innovative software, and proven processes. We have served more than 50 clients and delivered hundreds of high-quality custom enterprise applications. Our clients value us as integral team members who get the job done on time and on spec, and we are proud of our high client retention rate and staff turnover under 2%. With offices in New Jersey, Boston, Costa Rica, and Eastern Europe, we execute the full software lifecycle, from architecture and design through development, QA, and application maintenance and support. Our company culture emphasizes client service, trust-based relationships, and innovation.

Position Overview 

The Lead Architect owns the end-to-end delivery of the data and AI engineering work on a flagship multi-year enterprise data transformation program. The program is building a unified, governed data foundation on Azure across multiple business domains, with real-time CDC ingestion, master data management, and AI-ready analytics. Retrieval, AI, and agentic workloads are built on top of that foundation. This is a builder-leader role: you act as the technical bridge between the customer's senior technology leadership and the Aubrant delivery team, write up modeling decisions, get hands-on with Databricks and pipeline code while leading a data and AI engineering team, and pressure-test the team's QA approach yourself. Beyond client delivery, the role serves as an internal technical leader and coach across Aubrant's Data & AI Studio, raising the bar on AI engineering and agentic patterns and mentoring engineers.


Delivery Ownership & Execution 

  • Own end-to-end delivery of the data transformation against agreed architecture, requirements, and schedule 
  • Translate the architecture and Unified Data Model into an executable plan: source onboarding, ingestion patterns, ELT design, serving patterns, and quality gates 
  • Drive sprint planning, milestone tracking, and execution across the program's phased delivery 
  • Identify risks, dependencies, and blockers early; drive resolution and manage scope and timeline commitments 

Customer & Stakeholder Engagement 

  • Act as the day-to-day technical point of contact for customer leadership and engineering on progress, blockers, decisions, and solution alternatives 
  • Run technical working sessions, design reviews, and walkthroughs that move decisions forward 
  • Translate business context into technical implications, and technical complexity into clear leadership-ready summaries 

Architecture, Modeling & Engineering 

  • Hold a working understanding of the full target tech stack and validate that implementation choices stay consistent with the reference architecture 
  • Lead and contribute to data modeling across the core enterprise domains; review modeling work for identity, SCD, CDC, PII, and survivorship correctness 
  • Build production-grade ETL/ELT pipelines on Azure Databricks (PySpark, Spark SQL) with Delta Lake: ingestion, conformance, survivorship, and quality test layers 
  • Configure and extend Airbyte connectors for CDC ingestion and integrate API-based sources across SaaS, ERP, HRIS, and operational systems 
  • Apply Aubrant Workbench accelerators to compress build time and ensure consistency 
  • Lead the AI and agentic engineering patterns that sit on top of the data platform: retrieval pipelines, vector indexes, embedding generation, feature stores, and evaluation harnesses for LLM-backed and agentic workloads
  • Partner with AI Engineers to operationalize models and agents in production: MLOps lifecycle, prompt and eval versioning, observability, safety and cost guardrails, and clear handoffs between data, model, and application layers

Infrastructure, DevOps & Quality 

  • Partner with the cloud and DevOps team on what the data team needs from the platform: workspace topology, network and identity, secret management, observability, and cost guardrails 
  • Ensure CI/CD pipelines for data assets are in place and used: unit and integration tests, lineage validation, environment promotion, automated deployment, and infrastructure-as-code discipline 
  • Define the QA approach: data quality rules, test data strategy, regression testing, reconciliation against sources, and acceptance criteria for golden records 
  • Direct and review QA work; hold the line on quality gates between the Bronze, Silver, and Gold tiers and across the Dev, Test, and Prod environments

Leadership & Coordination 

  • Lead and coordinate a cross-functional pod that includes:
      • Data Architects
      • AI / Agentic Engineers
      • Data Modeler
      • Senior Cloud Engineer
      • Data Engineers
      • QA Engineers
  • Support Agile ceremonies and backlog prioritization, and remove blockers
  • Mentor Studio members and codify reusable patterns, across both data engineering and AI / agentic engineering, into the Studio knowledge base and the Aubrant Workbench

Key Qualifications 

Experience 

  • 12+ years in data engineering and data platform delivery, with 5+ years in a Technical Lead or equivalent role on customer-facing engagements 
  • Multiple end-to-end deliveries of enterprise-scale data platforms, with a track record of delivering against architecture, schedule, and quality 

Required Technical Skills 

  • Azure Databricks (PySpark, Spark SQL), Delta Lake, the Medallion architecture, and ADLS Gen2: hands-on production experience 
  • Data modeling: conceptual, logical, and physical, including SCD strategy, CDC patterns, PII classification, and survivorship 
  • CDC and ingestion: production experience with Airbyte, Fivetran, Azure Data Factory, or equivalent, plus API-based source onboarding 
  • At least one of Azure Synapse, Cosmos DB, or Azure SQL Managed Instance for serving patterns 
  • CI/CD for data assets and infrastructure-as-code (Terraform, Bicep, or ARM) 
  • QA approach design and data quality engineering for enterprise data platforms 
  • Hands-on production experience with at least one agentic framework (LangGraph, LangChain, Semantic Kernel, or equivalent) and one major LLM provider (Azure OpenAI, Anthropic, or comparable), including tool use, multi-step orchestration, and structured outputs
  • Production patterns for retrieval-augmented generation (RAG): chunking and embedding strategies, vector stores (Azure AI Search, pgvector, or equivalent), hybrid retrieval, and evaluation harnesses for LLM and agent quality

Leadership & Communication 

  • Customer-facing presence: able to run a technical conversation with a VP of Technology and walk out with a decision 
  • Strong written technical communication: design memos, decision logs, and runbooks 
  • Demonstrated ability to mentor engineers and grow technical capability in a team 

Preferred Qualifications 

  • Databricks Certified Data Engineer Professional or Microsoft Azure Data Engineer Associate / Solutions Architect Expert 
  • Microsoft Purview or comparable governance and catalog tooling (Collibra, Atlan, Unity Catalog) 
  • MLflow lifecycle experience or GenAI / LLM integration patterns in production 
  • Exposure to regulated, compliance-heavy industries (HIPAA, SOC 2, GDPR, PCI DSS) 
  • Bachelor's or Master's degree in Computer Science, Engineering, or related field 

Built on Azure Databricks, Delta Lake, ADLS Gen2, Airbyte, Microsoft Purview, Azure Synapse, Cosmos DB, MLflow, and Power BI.

Delivery

India
