Job Description
Job Role: AI Engineer (Data Pipelines & RAG)
Job Type: Full-time
Work Mode: Remote (6-day work week)
We are looking for a hands-on AI/Data Engineer (4–7 years of experience) to build and scale the data pipelines that power GenAI and agentic applications. You'll architect data models, build ETL/ELT workflows, and integrate pipelines with RAG-based systems in a fast-paced startup environment.
What You’ll Do
Data Pipelines & Modeling
- Build scalable ETL/ELT pipelines (batch & streaming) using Python + Spark
- Automate ingestion from databases, APIs, files, SharePoint, and other document sources
- Process & structure unstructured files (PDFs, tables, charts, drawings, etc.)
- Own chunking, indexing & embedding strategies for RAG/LLM use cases
- Design logical & physical data models, schema mappings & data dictionaries
GenAI & RAG Integration
- Feed real-time data into LLM prompts
- Build retrieval workflows for downstream agent/RAG systems in the real estate/construction domain
Observability & Governance
- Implement monitoring, alerting & logging for pipeline reliability
- Apply IAM, Unity Catalog, and other data privacy/security controls
CI/CD & Automation
- Build CI/CD workflows with GitHub Actions, Azure DevOps, or CircleCI
- Build reproducible infrastructure using Terraform / ARM templates
- Orchestrate pipelines with Prefect or Airflow
What You’ll Need
- 5+ years in data engineering, with 1–2 years working on pipelines for unstructured data & RAG systems
- Strong Python and SQL; experience with dlt, DuckDB, and DVC
- Azure cloud experience in production
- Experience with chunking/indexing strategies for RAG
- Strong command of Git and CI/CD workflows
- Familiarity with Prefect or equivalent
- Good to have: MLflow, Docker/Kubernetes, computer vision, agentic AI concepts, and data governance/privacy frameworks (GDPR)
Job Details
Posted Date: November 28, 2025
Job Type: Technology
Location: India
Company: BeGig
Ready to Apply?
Don't miss this opportunity! Apply now and join our team.