
Senior ML Engineer - GenAI & ML Systems

📍 India


Job Description

Senior ML Engineer – GenAI & Agentic ML Systems

About the Role

We are seeking a highly experienced Senior ML Engineer – GenAI & ML Systems to lead the design, architecture, and implementation of advanced agentic AI systems within our next-generation supply chain platforms.

This role is hands-on and execution-focused. You will design, build, deploy, and maintain large-scale multi-agent systems capable of reasoning, planning, and executing complex workflows in dynamic, non-deterministic environments. You will also own production concerns, including context management, knowledge orchestration, evaluation, observability, and system reliability.

This position is ideal for a strong ML Engineer or Software Engineer with deep practical exposure to GenAI, data science, and modern ML systems, who is comfortable working end-to-end, from architecture through production deployment. Experience in life sciences supply chain or other regulated environments is a strong plus.

Key Responsibilities

- Architect, implement, and operate large-scale agentic AI / GenAI systems that automate and coordinate complex supply chain workflows.
- Design and build multi-agent systems, including agent coordination, planning, tool execution, long-term memory, feedback loops, and supervision.
- Develop and maintain advanced context and knowledge management systems, including:
  - RAG and Advanced RAG pipelines
  - Hybrid retrieval, reranking, grounding, and citation strategies
  - Context window optimization and long-horizon task reliability
- Own the technical strategy for reliability and evaluation of non-deterministic AI systems, including:
  - Agent evaluation frameworks
  - Simulation-based testing
  - Regression testing for probabilistic outputs
  - Validation of agent decisions and outcomes
- Fine-tune and optimize LLMs/SLMs for domain performance, latency, cost efficiency, and task specialization (strong plus).
- Design and deploy scalable backend services using Python and Java, ensuring production-grade performance, security, and observability.
- Implement AI observability and feedback loops, including agent tracing, prompt/tool auditing, quality metrics, and continuous improvement pipelines.
- Apply and experiment with reinforcement learning or iterative improvement techniques within GenAI or agentic workflows where appropriate.
- Collaborate closely with product, data science, and domain experts to translate real-world supply chain requirements into intelligent automation solutions.
- Guide system architecture across distributed services, event-driven systems, and real-time data pipelines using cloud-native patterns.
- Mentor engineers, influence technical direction, and establish best practices for agentic AI and ML systems across teams.

Required Qualifications

- 6+ years of experience building and operating cloud-native SaaS systems on AWS, GCP, or Azure (minimum 5 years with AWS).
- Strong ML Engineer / Software Engineer background with deep practical exposure to data science and GenAI systems.
- Expert-level, hands-on experience designing, deploying, and maintaining large multi-agent systems in production.
- Proven experience with advanced RAG and context management, including memory, state handling, tool grounding, and long-running workflows.
- 6+ years of hands-on Python experience delivering production-grade systems.
- Practical experience evaluating, monitoring, and improving non-deterministic AI behavior in real-world deployments.
- Hands-on experience with agent frameworks such as LangGraph, AutoGen, CrewAI, Semantic Kernel, or equivalent.
- Solid understanding of distributed systems, microservices, and production reliability best practices.

Big Plus / Preferred Qualifications

- Hands-on experience fine-tuning LLMs or SLMs for domain-specific tasks (training, evaluation, deployment).
- Experience designing and deploying agentic systems in supply chain domains (logistics, manufacturing, planning, procurement).
- Strong grasp of knowledge organization techniques, including RAG, Advanced RAG, hybrid search, and reranking.
- Experience applying reinforcement learning, reward modeling, or iterative optimization in GenAI workflows.
- Familiarity with Java and JavaScript/ECMAScript.
- Experience deploying AI solutions in regulated or enterprise environments with governance, security, and compliance requirements.
- Knowledge of life sciences supply chain or regulated industry ecosystems.

Who You Are

- A hands-on technical leader who moves seamlessly between architecture and implementation.
- A builder who values practical, production-ready solutions over prototypes.
- Comfortable designing systems with probabilistic and emergent behavior.
- Passionate about building GenAI systems that are reliable, observable, explainable, and scalable.
- A clear communicator who can align stakeholders and drive execution across teams.

Ready to Apply?

Don't miss this opportunity! Apply now and join our team.

Job Details

Posted Date: February 26, 2026
Job Type: Construction
Location: India
Company: TraceLink
