Job Description
AI Engineer (RAG & Agent Systems) - MERN + Python
Experience Required: 2-3 years
Location: Ahmedabad (work from office)
Salary range: Up to Rs 100k per month
Positions: 01
About us:
Saakh is an AI-first B2B fintech startup that equips SMEs with business intelligence to:
1) Take data-driven lending decisions
2) Automate payment collections
3) Prevent late payments
4) Delegate recoveries of stuck payments
5) Build a Digital Reputation Identity for businesses
About the Role
We’re hiring an AI Engineer who can design, build, and ship production-grade Retrieval-Augmented Generation (RAG) systems and AI Agents end-to-end. You will work across the stack, using Python for data/ML services and MERN (MongoDB, Express.js, React, Node.js) for the product surface, to deliver secure, observable, and scalable intelligent features. A key part of the role is building and operating MCP (Model Context Protocol) servers and tooling connectors that expose internal data and actions safely to agents.
You’re hands‑on with vector databases, LLM orchestration, evaluation, and monitoring. You write clean APIs, ship thoughtful UIs, and automate deployment with CI/CD.
What You’ll Do
Design & implement RAG pipelines: ingestion/ETL, chunking & metadata, embeddings, hybrid search (keyword + vector), reranking, context caching, freshness & re-indexing.
Build multi-step AI agents: tool-use, function calling, planning/state machines (e.g., LangGraph/semantic-kernel/AutoGen equivalents), error recovery, memory design.
Develop MCP servers & connectors: define tools/schemas, auth, rate-limits, multi-tenant isolation, and logging to safely expose internal systems to agents.
Own MERN product surfaces: React frontends (chat/task UIs), Node/Express APIs, WebSockets/Streaming for token-level updates, and MongoDB persistence.
Engineer Python microservices: retrieval, orchestration, evaluation, and batch jobs; package as containers; expose fast, typed APIs.
Vector DB operations: design indexes, choose distance metrics/ANN algorithms (e.g., HNSW/IVF), tune recall/latency; manage Pinecone/Weaviate/Qdrant/Milvus/pgvector.
LLMOps, evaluation & observability: establish offline/online evals (RAGAS/TruLens/DeepEval), guardrails, tracing (OpenTelemetry/Langfuse), metrics, and A/B tests.
Security & governance: prompt-injection defenses, PII redaction, data scoping/RBAC, audit logs, rate limiting, content filters, policy-as-code.
Performance & cost: caching, batching, streaming, model selection, autoscaling, token/latency budgets, and cost attribution.
Work cross-functionally with Product, Data, and Infra to prioritize use cases and ship reliable, testable features on a predictable cadence.
Required Qualifications
2–3 years of software engineering (or equivalent depth), with production experience in both:
Python: APIs, data/ML services, packaging, testing with pytest
MERN: MongoDB, Express.js, React, Node.js (TypeScript preferred)
RAG & Agents shipped to production: retrieval pipelines, embeddings, hybrid search, reranking, function/tool calling, and multi-step workflows.
Vector databases: one or more of Pinecone, Weaviate, Qdrant, Milvus, pgvector, covering schema design, index tuning, and ops.
MCP servers: built/maintained Model Context Protocol servers or equivalent agent-tool bridges; experience defining tools, auth, isolation, and telemetry.
Cloud & DevOps: Docker, Kubernetes or serverless (Cloud Run/Lambda), CI/CD (GitHub Actions/GitLab), infrastructure as code (Terraform/Pulumi).
Testing & quality: unit/integration tests, load testing, contract tests for tool/agent interfaces, data quality checks for corpora.
Security mindset: data governance, secrets management, least privilege, dependency hygiene.
Tech Stack You’ll Touch
Backend: Python (FastAPI), Node.js/TypeScript (Express/Nest)
Frontend: React (Vite/Next.js), WebSockets/Server-Sent Events
Data/RAG: MongoDB, Postgres, S3/GCS; vector DBs (Pinecone/Weaviate/Qdrant/Milvus/pgvector)
Agents/Orchestration: LangChain/LangGraph, semantic-kernel, custom state machines; MCP servers & tools
Infra: Docker, Kubernetes/Cloud Run, Terraform, GitHub Actions; Redis for cache/queues
Observability & Eval: OpenTelemetry, Langfuse, RAGAS/TruLens/DeepEval, Prometheus/Grafana
Are you one of us?
Mail your resume to ag@saakh.in
Visit us:
Website –