Pythian

Senior Data Engineer

Why Pythian:


At Pythian, we are experts in strategic database and analytics services, driving digital transformation and operational excellence. Founded in 1997, Pythian is a multinational company that began by ensuring the reliability and performance of mission-critical databases, quickly earning a reputation for solving tough data challenges. We were there when the industry moved from on-premises to cloud environments, and as enterprises sought more from their data, we expanded our competencies to include advanced analytics.


Today, we empower organizations to embrace transformation and leverage advanced technologies, including AI, to stay competitive. We deliver innovative solutions that meet each client’s data goals and have built strong partnerships with Google Cloud, AWS, Microsoft, Oracle, SAP, and Snowflake. The powerful combination of our extensive expertise in data and cloud and our ability to stay on top of the latest bleeding-edge technologies makes us the perfect partner to help mid-sized and large businesses transform to stay ahead in today’s rapidly changing digital economy.


Why you:


As a Senior Data Engineer, you will collaborate with a globally distributed team of architects, engineers, and consultants to design and deliver impactful solutions for enterprise data platforms, primarily focused on cloud technologies. Your role will involve producing outcomes for real-world customer projects, contributing to software artifacts, and driving automation in data platform implementations and migrations.


If this is you, and you wonder what it would be like to work at Pythian, reach out to us and find out!  


Intrigued to see what a life is like at Pythian? Check out #pythianlife on LinkedIn and follow @loveyourdata on Instagram!


Not the right job for you? Check out what other great jobs Pythian has open around the world! Pythian Careers


What you will be doing:


  • Design and develop end-to-end cloud-based solutions with a strong emphasis on data applications and infrastructure.
  • Lead discovery and design sessions with customers to gather requirements and translate functional needs into detailed designs.
  • Create and contribute to technical design documents and other project-related documentation.
  • Work with stakeholders to identify technical and business requirements, and apply best practices and standards to achieve successful project outcomes.
  • Regularly demonstrate proficiency in established practices and standards for cloud solutions.
  • Write high-performance, reliable, and maintainable code.
  • Develop test automation frameworks and associated tooling to ensure project success.
  • Handle complex and diverse cloud-based projects, including tasks such as collecting, managing, analyzing, and visualizing very large datasets.
  • Build efficient and scalable data pipelines for batch and real-time use cases across various source and target systems.
  • Optimize ETL/ELT pipelines, troubleshoot pipeline issues, and enhance observability dashboards.
  • Execute data pipeline-specific DevOps activities, such as IaC provisioning, implementing data security, and automation.
  • Analyze potential issues, perform root cause analyses, and resolve technical challenges.
  • Review bug descriptions, functional requirements, and design documents to ensure comprehensive test plans and cases.
  • Tune the performance of batch and real-time data processing pipelines.
  • Ensure security best practices are followed when working on internal and customer-facing cloud data platforms.
  • Build foundational CI/CD pipelines for all infrastructure components, data pipelines, and custom data applications.
  • Develop observability and data quality solutions for data platforms, including ML and AI applications.
  • Act as a trusted advisor for customers, addressing technical queries and providing support.
  • Engage in thought leadership activities such as whitepaper authoring, conference presentations, and podcasting.
  • Suggest and implement ways to improve project progress and efficiency.
  • Participate in pre-sales activities when required.


What we need from you:


Behavioral Expectations:

  • Demonstrate professional and respectful conduct in all interactions with customers, peers, and stakeholders.
  • Manage time effectively and attend all internal and customer meetings punctually and prepared.
  • Adhere strictly to organizational processes, including accurate and timely completion of timesheets.
  • Communicate promptly, clearly, and responsibly through email, messaging tools, and meetings.
  • Take ownership of commitments and deliverables without requiring repeated follow-ups.


Technical Expectations (Must-Haves):

  • Experience in implementing complex data architecture, data modeling, data design, and persistence (e.g., warehousing, data marts, data lakes).
  • Proficiency in a programming language such as Python, Java, Go, or Scala.
  • Experience with big data cloud technologies like Microsoft Fabric, Databricks, EMR, Athena, Glue, BigQuery, Dataproc, and Dataflow.
  • Ideally, you will have strong hands-on experience with Google Cloud Platform data technologies—BigQuery and Dataflow—and with executing PySpark and SparkSQL code on Dataproc.
  • Solid understanding of Spark (PySpark or SparkSQL), including using the DataFrame API and analyzing and performance-tuning Spark queries.
  • Strong experience in data orchestration using Apache Airflow.
  • Ability to develop frameworks and solutions that acquire, process, monitor, and extract value from large datasets.
  • Highly proficient in SQL.
  • Strong experience using code repositories such as GitHub, with demonstrable GitOps best practices.
  • Good knowledge of popular database and data warehouse technologies and concepts from Google, Amazon, or Microsoft (cloud and conventional RDBMS), such as BigQuery, Redshift, Azure SQL Data Warehouse, and Snowflake.
  • Knowledge of how to design distributed systems and the trade-offs involved, along with software engineering best practices for development, networking, source control, automated deployment pipelines such as Jenkins, and DevOps tools such as Terraform.
  • Strong knowledge of CI/CD tools and frameworks such as Jenkins and GitLab for implementing DevOps pipelines.
  • Proficiency in using GenAI tools for productivity (e.g., Copilot).


Technical Expectations (Desired):

  • Strong knowledge of data orchestration solutions such as Oozie, Luigi, or Talend.
  • Strong knowledge of dbt (Data Build Tool) or Dataform.
  • Experience in Snowflake.
  • Experience with Apache Iceberg, Hudi, and query engines like Presto (Trino).
  • Knowledge of data catalogs (AWS Glue, Google DataPlex) and data governance or data quality solutions (e.g., Great Expectations) is an added advantage.
  • Experience in performing DevOps activities such as IaC using Terraform, provisioning infrastructure in GCP/AWS/Azure, defining data security layers, etc.
  • Experience designing microservice architectures and REST API gateways is a plus.
  • Knowledge of MLOps frameworks and orchestration pipelines such as Kubeflow or TFX is a plus.
  • Certification in GCP, Azure, AWS, Snowflake, or Databricks.


What you will receive:

  • Love your career: Competitive total rewards package. Blog during work hours; take a day off and volunteer for your favorite charity.
  • Love your work/life balance: Work remotely from your home with full flexibility; there’s no daily commute to an office! All you need is a stable internet connection.
  • Love your coworkers: Collaborate with some of the best and brightest in the industry!
  • Love your development: Hone your skills or learn new ones with our substantial training allowance; participate in professional development days, attend training, become certified, whatever you like! 
  • Love your workspace: We give you all the equipment you need to work from home including a laptop with your choice of OS, and an annual budget to personalize your work environment!  
  • Love yourself: Pythian cares about the health and well-being of our team. You will have an annual wellness budget to make yourself a priority (use it on gym memberships, massages, fitness and more). Additionally, you will receive a generous amount of paid vacation and sick days, as well as a day off to volunteer for your favorite charity.


Hiring Disclaimer

  • The successful applicant will be required to complete a background check.
  • Accommodations are available upon request for candidates taking part in any aspect of the selection process.


AI Disclaimer

Pythian may utilize Enterprise Generative Artificial Intelligence (AI) tools or features throughout its hiring process. These tools help us manage high volumes of applications efficiently and may be employed to review applications, analyze resumes, and assist with other recruitment steps. While Pythian uses AI in its hiring process, it does not substitute for human judgment. Our Talent Acquisition Team reviews all AI-generated recommendations, and the system is subject to regular bias audits to ensure fairness and compliance with all applicable employment and human rights laws. All final hiring decisions are made by, and remain the responsibility of, human decision-makers. By applying for this position, you consent to Pythian’s use of these AI tools in the evaluation of your application. You have the right to request a human review of any solely AI-driven decision or to request an accommodation. Should you require further details regarding the processing of your data, please reach out to us.

Professional Services

Remote (India)
