Job Description
About the Position:
We are seeking an experienced Data Engineer responsible for building, optimizing, and operating reliable data pipelines using StreamSets, Azure Data Factory, and SQL across batch and near real-time workloads.
- Role: Data Engineer
- Location: Pune
- Experience: 5 to 7 years
- Job Type: Full-Time Employment
What You'll Do:
- Design, build, and maintain scalable data ingestion and transformation pipelines using StreamSets (DC/TF) and Azure Data Factory (pipelines, datasets, linked services, triggers).
- Implement robust ETL/ELT logic using SQL, including window functions, CTEs, query optimization, and indexing/partitioning strategies.
- Develop reusable pipeline components with strong error handling, checkpointing, monitoring, and alerting to ensure reliability and observability.
- Integrate data pipelines with key Azure services such as ADLS Gen2, Azure SQL/Managed Instance, Key Vault, and Azure DevOps.
- Ensure data quality and governance through validations, deduplication, schema evolution handling, and enforcement of data standards.
- Implement security best practices, including RBAC, secrets management, and PII handling.
- Optimize cost and performance through pipeline concurrency, parallelization, mapping data flows, and efficient scheduling.
- Collaborate with Analytics/BI and Product teams to onboard new data sources, define SLAs, and operationalize data solutions.
- Create and maintain documentation for data flows, lineage, operational runbooks, and troubleshooting guides.
Expertise You'll Bring:
- 4 to 6 years of hands-on experience in data engineering, building production-grade pipelines using StreamSets and Azure Data Factory.
- Strong SQL expertise across at least one major RDBMS (Azure SQL, SQL Server, PostgreSQL, etc.).
- Solid understanding of data warehousing concepts, including staging layers, dimensional modeling, and Slowly Changing Dimensions (SCD).
- Practical experience with the Azure ecosystem, including ADLS Gen2, Key Vault, and CI/CD using Azure DevOps.
- Proven skills in pipeline monitoring, alerting, and troubleshooting in production environments.
- Exposure to Delta Lake, Databricks, or PySpark.
- Familiarity with data quality frameworks (e.g., Great Expectations) or data lineage/governance tools (e.g., Azure Purview).
- Basic scripting experience in Python or PowerShell.
- Experience working with streaming or messaging platforms such as Kafka or Azure Event Hub.
Benefits:
- Competitive salary and benefits package
- Culture focused on talent development with quarterly growth opportunities and company-sponsored higher education and certifications
- Opportunity to work with cutting-edge technologies
- Employee engagement initiatives such as project parties, flexible work hours, and Long Service awards
- Annual health check-ups
- Insurance coverage: group term life, personal accident, and Mediclaim hospitalization for self, spouse, two children, and parents
Values-Driven, People-Centric & Inclusive Work Environment:
Persistent is dedicated to fostering diversity and inclusion in the workplace. We invite applications from all qualified individuals, including those with disabilities, regardless of gender or gender identity. We welcome candidates from all backgrounds.
- We support hybrid work and flexible hours to fit diverse lifestyles.
- Our office is accessibility-friendly, with ergonomic setups and assistive technologies to support employees with physical disabilities.
- If you are a person with disabilities and have specific requirements, please inform us during the application process or at any time during your employment.
Let's unleash your full potential at Persistent.
"Persistent is an Equal Opportunity Employer and prohibits discrimination and harassment of any kind."