Job Description
Senior Data Engineer (Databricks & Cloud Data Platforms)
About the role
You will play a key role in designing and implementing next-generation data platforms. You will move beyond traditional ETL by focusing on the Semantic Layer, using Python to define complex business logic, metrics, and data models as code. You will bridge the gap between raw data and business consumption, ensuring consistent, governed data delivery via Databricks.
Your responsibilities
Semantic Layer & Data Modeling (Python-first)
Design and implement the Data/Semantic Layer using Python frameworks to define business metrics, dimensions, and KPIs as code.
Build the \"Gold\" or \"Serving\" layer in Databricks using PySpark, embedding complex business logic directly into the data pipeline rather than downstream BI tools.
Implement Metrics-as-Code practices to ensure consistent definitions across different analytical tools (Power BI, custom apps, data science models).
Translate functional business requirements into Python-based transformation logic for the final serving layer.
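For illustration, here is a minimal sketch of the Metrics-as-Code approach described above, using PySpark on Databricks. All table, column, and metric names (silver.orders, net_revenue, and so on) are hypothetical assumptions, not a prescribed implementation:

```python
# Minimal "metrics as code" sketch: business metrics are declared once as
# named PySpark expressions and reused wherever the Gold layer is built.
# All table, column, and metric names here are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Single source of truth for metric definitions.
METRICS = {
    "net_revenue": F.sum(F.col("amount") - F.col("discount")),
    "order_count": F.countDistinct("order_id"),
    "avg_order_value": F.avg(F.col("amount") - F.col("discount")),
}

def build_gold_metrics(silver_table, dimensions):
    """Aggregate a Silver table into a Gold metrics table using the shared definitions."""
    df = spark.table(silver_table)
    aggs = [expr.alias(name) for name, expr in METRICS.items()]
    return df.groupBy(*dimensions).agg(*aggs)

# Example: daily revenue metrics per country, materialised in the Gold layer.
gold = build_gold_metrics("silver.orders", ["order_date", "country"])
gold.write.format("delta").mode("overwrite").saveAsTable("gold.revenue_metrics")
```

Because Power BI, custom apps, and data science models all read from the resulting Gold table, the metric definitions live in exactly one place.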
Data Pipelines & Databricks
Design, build, and maintain scalable batch and streaming pipelines using Databricks Workflows and Apache Spark.
Develop robust ETL/ELT processes to ingest data from diverse sources (IoT, ERPs, APIs) and move it through the Medallion Architecture (Bronze → Silver → Gold), as illustrated in the sketch after this list.
Optimize Python/Spark code for performance, partition management, and cost efficiency.
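As a rough illustration of the Medallion flow mentioned above, a minimal Bronze-to-Silver hop on Databricks might look like the following; the paths, table names, and columns are assumptions for the sake of the example:

```python
# Hypothetical Bronze -> Silver hop in a Medallion pipeline.
# Paths, table names, and the column set are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze: land raw source data as-is, with ingestion metadata.
raw = (
    spark.read.json("/mnt/raw/erp/orders/")
    .withColumn("_ingested_at", F.current_timestamp())
)
raw.write.format("delta").mode("append").saveAsTable("bronze.erp_orders")

# Silver: typed, deduplicated, quality-filtered records.
silver = (
    spark.table("bronze.erp_orders")
    .filter(F.col("order_id").isNotNull())
    .withColumn("order_date", F.to_date("order_ts"))
    .dropDuplicates(["order_id"])
)
silver.write.format("delta").mode("overwrite").saveAsTable("silver.orders")
```

The same pattern extends to the Gold layer, where the semantic logic from the previous section is applied.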
Architecture & Governance
Contribute to the design of Lakehouse architectures, ensuring the semantic layer supports both self-service BI and advanced analytics.
Implement data governance and quality checks within the Python pipeline code (see the sketch after this list).
Ensure the semantic layer aligns with data security standards (Row-Level Security, masking) and access controls.
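As one example of the in-pipeline quality checks mentioned above, a simple Python gate might look like this; the table and column names are assumptions:

```python
# Hypothetical data quality gate executed inside the pipeline code.
# Table and column names are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

def assert_no_nulls(df, columns):
    """Raise an error (failing the pipeline run) if any listed column contains nulls."""
    for c in columns:
        null_count = df.filter(F.col(c).isNull()).count()
        if null_count > 0:
            raise ValueError(f"Quality check failed: {null_count} null values in '{c}'")
    return df

silver = spark.table("silver.orders")
assert_no_nulls(silver, ["order_id", "customer_id", "amount"])
```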
Collaboration
Work closely with Business Analysts to understand metric definitions and codify them in Python.
Collaborate with Data Scientists to expose semantic features for Machine Learning models.
Mentor junior engineers in data modeling.
Your background
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
Minimum 5 years of professional experience in Data Engineering.
Strong expertise in Python for data manipulation and modeling (pandas, PySpark).
Proven experience building Semantic/Serving layers programmatically (not just drag-and-drop BI modeling).
Experience with Databricks and the Delta Lake ecosystem.
Familiarity with \"Data-as-Code\" or \"Metrics-as-Code\" concepts (e.g., using dbt with Python models, or custom Python semantic frameworks).
Strong understanding of dimensional modeling (Star Schema) and how to implement it via Spark/Python.
Technical skills
Languages: Python (Advanced), SQL (Strong).
Core Platform: Azure Databricks, Apache Spark (PySpark), Delta Lake.
Semantic/Modeling: Building serving layers in PySpark, dbt (Python models), or Python-based metric layers.
Data Architecture: Medallion Architecture (Bronze/Silver/Gold), Dimensional Modeling.
DevOps: CI/CD for data pipelines (Azure DevOps/GitHub Actions), Git flow.
Concepts: ACID transactions, Time Travel, Unity Catalog, Governance.
Ready to Apply?
Don't miss this opportunity! Apply now and join our team.
Job Details
Posted Date: February 19, 2026
Job Type: Technology
Location: India
Company: Adastra