Job Description
Company:
SuMeera Solutions (CompliEase – AI-powered Compliance SaaS)
Location:
Remote (India) | Team base in Chhattisgarh & across India
Employment:
Full-time | ESOPs available
About SuMeera Solutions
We're building CompliEase, an AI-driven SaaS that converts complex regulations into structured rules, APIs, and dashboards. You'll work on a modern, multi-tenant platform powering real-time compliance checks, automated data extraction from regulatory sources, and intelligent document processing using AWS Bedrock.
Role Overview
We're seeking a Backend Data Engineer with strong Python skills to build our data ingestion, processing, and AI/ML pipelines. You'll extract regulatory data from diverse sources (CSV/XLS files, PDFs, web scraping), orchestrate ETL workflows, leverage AWS Bedrock for intelligent document processing, and build APIs using FastAPI or Node.js to serve processed data. Light front-end work for admin and data-review interfaces is also expected.
What You'll Do
Data Engineering & ETL
Build Python pipelines to ingest data from CSV/XLS files and load into PostgreSQL with proper validation, transformations, and error handling.
Design and implement web scraping solutions to extract regulatory data from government and industry websites (BeautifulSoup, Scrapy, or similar).
Download and process OSHA compliance PDFs, store in AWS S3, and manage document versioning and metadata.
Develop ETL workflows with proper logging, monitoring, and retry mechanisms.
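A minimal sketch of the kind of ingestion pipeline described above: per-row validation, error collection instead of aborting the whole load, and an upsert into the database. The table and column names are hypothetical, and sqlite3 stands in for PostgreSQL so the example is self-contained; production code would use psycopg2 or SQLAlchemy against RDS.

```python
# Illustrative CSV-to-database ingestion sketch. sqlite3 stands in for
# PostgreSQL; schema and field names are hypothetical.
import csv
import io
import sqlite3

def validate_row(row):
    """Basic validation: required fields present, limit is numeric."""
    if not row.get("rule_id") or not row.get("title"):
        raise ValueError(f"missing required field: {row}")
    row["limit_ppm"] = float(row["limit_ppm"])  # raises ValueError on bad data
    return row

def ingest_csv(conn, csv_text):
    """Load validated rows; collect per-row errors instead of aborting."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rules "
        "(rule_id TEXT PRIMARY KEY, title TEXT, limit_ppm REAL)"
    )
    loaded, errors = 0, []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            r = validate_row(row)
            conn.execute(
                "INSERT OR REPLACE INTO rules VALUES (:rule_id, :title, :limit_ppm)", r
            )
            loaded += 1
        except (ValueError, KeyError) as exc:
            errors.append(str(exc))  # in production: structured logging / dead-letter
    conn.commit()
    return loaded, errors

conn = sqlite3.connect(":memory:")
data = "rule_id,title,limit_ppm\nOSHA-1910,Air contaminants,50\nBAD,,x\n"
loaded, errs = ingest_csv(conn, data)
print(loaded, len(errs))  # 1 good row loaded, 1 rejected
```

The bad row is quarantined rather than failing the batch, which is what "proper validation and error handling" usually means in practice for regulatory feeds.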
AI/ML & Document Processing
Integrate AWS Bedrock LLMs to extract structured compliance data from PDF documents (regulations, standards, guidelines).
Build prompt engineering workflows to parse unstructured regulatory text into structured database schemas.
Implement validation and quality checks on AI-extracted data before loading into PostgreSQL.
Optimize LLM calls for cost and performance (batching, caching, context management).
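To illustrate the validation step above: before AI-extracted data reaches PostgreSQL, it should be checked against the expected schema. The field names below are hypothetical, and the canned JSON stands in for a real Bedrock response; the point is rejecting malformed LLM output rather than loading it blindly.

```python
# Hedged sketch: schema-checking LLM-extracted compliance data before
# the database load. Field names are illustrative, not a real schema.
import json

REQUIRED = {"citation": str, "requirement": str, "effective_year": int}

def validate_extraction(raw_json):
    """Parse an LLM response and reject missing or mistyped fields."""
    record = json.loads(raw_json)
    for field, typ in REQUIRED.items():
        if not isinstance(record.get(field), typ):
            raise ValueError(f"bad or missing field {field!r}: {record.get(field)!r}")
    return record

good = ('{"citation": "29 CFR 1910.134", '
        '"requirement": "Respirator fit testing", "effective_year": 1998}')
record = validate_extraction(good)
print(record["citation"])  # 29 CFR 1910.134
```

In a real pipeline the raw response, the validation verdict, and the final record would each be logged so failed extractions can be re-prompted or routed to human review.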
Database & API Development
Design PostgreSQL schemas for compliance rules, regulatory mappings, and extracted document data.
Write efficient SQL queries, manage indexes, and optimize database performance.
Build REST APIs using FastAPI or Node.js to expose processed compliance data.
Implement basic front-end interfaces for data review, validation, and admin workflows.
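As a sketch of the schema-design work above: compliance rules mapped to their source regulations, with an index on the common lookup path. sqlite3 again stands in for PostgreSQL, and the tables are hypothetical, not the real CompliEase schema.

```python
# Illustrative schema for compliance rules and regulatory mappings.
# sqlite3 stands in for PostgreSQL; names are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE regulations (
    reg_id   INTEGER PRIMARY KEY,
    citation TEXT NOT NULL UNIQUE,   -- e.g. '29 CFR 1910.134'
    title    TEXT NOT NULL
);
CREATE TABLE rules (
    rule_id  INTEGER PRIMARY KEY,
    reg_id   INTEGER NOT NULL REFERENCES regulations(reg_id),
    industry TEXT NOT NULL,
    text     TEXT NOT NULL
);
-- Index the frequent lookup path: all rules for a given industry.
CREATE INDEX idx_rules_industry ON rules(industry);
""")
conn.execute("INSERT INTO regulations VALUES "
             "(1, '29 CFR 1910.134', 'Respiratory Protection')")
conn.execute("INSERT INTO rules VALUES "
             "(1, 1, 'manufacturing', 'Fit testing required annually')")

rows = conn.execute("""
    SELECT r.citation, ru.text
    FROM rules ru JOIN regulations r USING (reg_id)
    WHERE ru.industry = ?
""", ("manufacturing",)).fetchall()
print(rows[0][0])  # 29 CFR 1910.134
```

A FastAPI route would then serve exactly this query result as JSON; the API layer stays thin while the schema and indexes do the work.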
AWS & Infrastructure
Work with AWS services: S3 (document storage), Bedrock (LLM integration), RDS/PostgreSQL, Lambda (optional), and CloudWatch (monitoring).
Manage secrets using AWS Secrets Manager or Parameter Store.
Deploy and maintain data pipelines in production environments.
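Production pipelines calling S3 or Bedrock typically wrap those calls in retry-with-backoff. A minimal stdlib sketch of that wrapper (the helper name and the flaky function are illustrative, not an existing library API):

```python
# Minimal retry-with-exponential-backoff helper, the kind of wrapper
# S3/Bedrock calls are usually run through in production pipelines.
import time

def with_retries(fn, attempts=3, base_delay=0.01):
    """Call fn(); on failure wait base_delay * 2**n, then retry."""
    for n in range(attempts):
        try:
            return fn()
        except Exception:
            if n == attempts - 1:
                raise  # out of attempts: surface the error to monitoring
            time.sleep(base_delay * (2 ** n))

calls = {"n": 0}
def flaky():
    """Simulates a service that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = with_retries(flaky)
print(result)  # ok, after two transient failures
```

In real deployments boto3's built-in retry configuration covers much of this, but pipeline-level retries with logging remain useful for multi-step workflows.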
Must-Have Skills
2–4 years of professional experience in backend development and data engineering.
Strong Python skills: pandas, file I/O, data validation, error handling, async programming.
Web scraping: BeautifulSoup, Scrapy, Selenium, or similar; handling dynamic content, pagination, rate limiting.
PostgreSQL: schema design, SQL queries, indexes, transactions, database optimization.
AWS experience: S3, Bedrock (or similar AI/ML services), RDS, basic IAM and security.
ETL/data pipelines: building reliable data ingestion and transformation workflows.
API development: FastAPI or Node.js/Express for building RESTful services.
Git, Docker basics, and familiarity with CI/CD concepts.
Good-to-Have
Experience with AWS Bedrock, SageMaker, or other LLM platforms.
PDF processing libraries (pypdf/PyPDF2, pdfplumber) or AWS Textract.
Front-end basics: React, HTML/CSS, JavaScript/TypeScript for simple admin UIs.
Workflow orchestration tools (Airflow, Prefect, Step Functions).
Experience with OCR, document classification, or NLP pipelines.
Understanding of regulatory compliance domains (OSHA, EPA, safety standards).
Testing frameworks (pytest), data quality validation, and monitoring/observability.
Work Setup & Timing
Remote within India (stable internet & professional workspace required).
Flexible hours with 2–3 hrs overlap with US Central Time for stand-ups/reviews (typically 7:30–10:30 AM IST during US daylight saving time, or 8:30–11:30 AM IST during standard time).
What We Offer
Compensation: Up to ₹5 lakh per annum (based on experience) + ESOPs
High ownership in building critical data infrastructure and AI/ML capabilities
Work on cutting-edge AI compliance automation with real business impact
Mission-driven product serving enterprises across multiple industries
Education
Bachelor's in CS/IT/Data Science or equivalent practical experience.
How to Apply
Email your resume (and GitHub/portfolio if available) to Contact@SuMeeraSolutions.com, and also apply via LinkedIn, answering all the mandatory questions.