Job Description – Senior Data Engineer
Experience Required
5–7 years
Employment Type
Full-time / Contract (as applicable)
Location
Bangalore / Hybrid / Onsite (as applicable)
Role Overview
The Senior Data Engineer will be responsible for stabilizing, standardizing, and optimizing an AWS-based enterprise data platform. The role focuses on data ingestion, ETL reliability, platform performance, security, and governance, ensuring scalable and high-quality data pipelines that support analytics, dashboards, and future Digital Twin and advanced analytics initiatives.
Key Responsibilities
1. Data Platform Stabilization & Discovery
Audit and rationalize AWS S3 buckets, including data classification, usage analysis, and lifecycle management
Analyze and document API ingestion pipelines and schemas supporting upstream integrations
Review and enhance AWS Glue ETL jobs with standardized error handling, logging, retries, and version control
Reactivate, configure, and manage development and staging environments to enable safe deployments and testing
2. Data Architecture & Engineering
Design, implement, and document Amazon Redshift schemas aligned to the Bronze–Silver–Gold data architecture
Establish and maintain data lineage and dependency mapping across ingestion, transformation, and consumption layers
Build and maintain scalable, modular ETL pipelines supporting batch and near-real-time processing
3. Performance Optimization & Scalability
Optimize S3 → Athena → Redshift pipelines using partitioning strategies, file formats, compression, and caching
Implement and tune Redshift performance features, including distribution styles, sort keys, and materialized views
Monitor and improve pipeline throughput, query performance, and resource utilization
4. Data Quality, Monitoring & Reliability
Implement data quality validation layers in collaboration with analytics and data science teams
Configure Amazon CloudWatch monitoring and alerting for Glue jobs, Redshift clusters, and data pipeline failures
Ensure high availability, fault tolerance, and operational reliability of data workflows
5. Security, Governance & Compliance
Design and implement Role-Based Access Control (RBAC) across AWS services and data assets
Establish centralized access logging, auditing, and periodic access reviews
Define and implement backup, recovery, and data retention strategies
Support data ownership, stewardship, and governance frameworks
6. Documentation & Enablement
Maintain comprehensive technical documentation, including ETL workflows, schemas, lineage, and operational runbooks
Contribute to enterprise data cataloging and metadata management using AWS Glue Data Catalog and Confluence
Support training and knowledge-transfer sessions for internal engineering and analytics teams
Required Skills & Qualifications
Technical Skills
Strong experience with AWS data services, including:
Amazon S3
AWS Glue
Amazon Athena
Amazon Redshift
Amazon CloudWatch
Advanced proficiency in SQL and data warehouse performance tuning
Strong Python scripting experience for data analysis, validation, anomaly detection, and automation of analytical workflows
Hands-on experience designing and maintaining ETL / ELT pipelines
Strong understanding of data modeling, partitioning, and data warehousing best practices
Engineering & Platform Skills
Experience implementing logging, monitoring, and alerting frameworks
Familiarity with version control, CI/CD, and deployment best practices
Ability to troubleshoot complex data pipeline and performance issues
Professional & Collaboration Skills
Strong ownership and accountability mindset
Ability to collaborate effectively with data scientists, analysts, and business stakeholders
Clear communication and strong technical documentation skills
Preferred / Nice to Have
Exposure to Digital Twin, IoT, or large-scale analytics platforms
Experience in data platform stabilization, modernization, or cloud migration programs
Background in consulting, systems integration (SI), or enterprise data delivery environments