Job Description
Responsive is seeking a Machine Learning Engineer with hands-on experience in modern NLP systems, large language models (LLMs), and applied machine learning. This role focuses on building, fine-tuning, evaluating, and deploying ML and LLM-based solutions that operate at scale and directly impact customer-facing products.
The ideal candidate combines strong Python engineering skills with practical experience in LLMs, embeddings, retrieval systems, and model evaluation, and is comfortable working with both structured and unstructured data in production environments.
Key Responsibilities
Machine Learning & NLP Development
Design, develop, and deploy machine learning and NLP models to enhance Responsive's AI-powered products.
Work extensively with structured and unstructured datasets, including text-heavy enterprise content such as RFPs, RFIs, and questionnaires.
Develop and optimize NLP pipelines for tasks such as information extraction, classification, summarization, semantic search, and question answering.
Large Language Models (LLMs)
Work with proprietary and open-source LLMs (e.g., OpenAI models, Llama, or Mistral) for real-world use cases.
Fine-tune pretrained language models using full fine-tuning and parameter-efficient methods (e.g., LoRA, QLoRA, adapters, prefix/prompt tuning) based on dataset size, cost, latency, and deployment constraints (see the sketch below).
Design and implement strategies to mitigate bias, overfitting, and catastrophic forgetting during fine-tuning.
Evaluate and compare fine-tuned models against baselines such as frozen backbones with linear probes.
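To illustrate the kind of parameter-efficient fine-tuning this role involves, below is a minimal sketch that attaches LoRA adapters to a small causal LM with Hugging Face peft. The base model (facebook/opt-350m), target modules, and hyperparameters are illustrative assumptions, not a prescribed setup.

```python
# Minimal sketch: attach LoRA adapters to a small causal LM with Hugging Face peft.
# Model name and hyperparameters are illustrative placeholders only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base_id = "facebook/opt-350m"  # small open model chosen only for illustration
model = AutoModelForCausalLM.from_pretrained(base_id)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                 # low-rank dimension; trades adapter capacity against size
    lora_alpha=16,       # scaling applied to the low-rank updates
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in OPT-style blocks
)

# Wrap the frozen backbone; only the injected low-rank matrices are trainable.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports trainable vs. total parameter counts
# From here a standard transformers Trainer (or a custom loop) would train the adapters.
```

Stopping at adapter attachment keeps the sketch short; the same pattern extends to QLoRA by loading the base model in 4-bit before wrapping it.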
Retrieval & Embeddings
Build and maintain embedding-based systems for semantic search, clustering, and similarity matching.
Design and implement Retrieval-Augmented Generation (RAG) architectures, including document chunking, vector indexing, retrieval strategies, and prompt orchestration (see the sketch below).
Optimize retrieval quality using appropriate metrics and ablation studies.
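As an illustration of the retrieval side of a RAG pipeline, the sketch below chunks documents, embeds them with a sentence-transformers model, retrieves the top-scoring chunks by cosine similarity, and assembles a grounded prompt. The model name (all-MiniLM-L6-v2), the fixed-size character chunking, and the example documents are assumptions for illustration only; production systems would use a vector store and tuned chunking and retrieval strategies.

```python
# Minimal RAG-style retrieval sketch: naive chunking, embedding, cosine-similarity
# search, and prompt assembly. Documents and chunk size are illustrative placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk(text: str, size: int = 400) -> list[str]:
    """Split text into fixed-size character windows (stand-in for real chunking)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

docs = [
    "Our SSO integration supports SAML 2.0 and OIDC ...",
    "Data at rest is encrypted with AES-256; keys are rotated quarterly ...",
]
chunks = [c for d in docs for c in chunk(d)]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
chunk_vecs = encoder.encode(chunks, normalize_embeddings=True)   # unit-length vectors

query = "How is customer data encrypted at rest?"
query_vec = encoder.encode([query], normalize_embeddings=True)[0]

# With normalized vectors, the dot product equals cosine similarity.
scores = chunk_vecs @ query_vec
top_k = np.argsort(scores)[::-1][:3]
context = "\n---\n".join(chunks[i] for i in top_k)

# The retrieved context is then placed into the generation prompt.
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
print(prompt)
```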
Evaluation & Reliability
Define rigorous evaluation plans for ML and LLM systems, including dataset splits, offline metrics, human-in-the-loop evaluation, and production monitoring (see the sketch below).
Continuously monitor model performance, data drift, and failure modes in production.
Diagnose LLM performance issues (e.g., hallucinations, poor grounding, latency) and apply corrective strategies such as prompt engineering, retrieval improvements, or alternative modeling approaches.
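As a simple illustration of offline evaluation, the sketch below scores model answers against references with exact match and token-level F1. The eval_set and predict() stub are hypothetical placeholders standing in for a real held-out split and the deployed system; in practice these metrics sit alongside human review and production monitoring.

```python
# Minimal offline-evaluation sketch: exact match and token-level F1 over a
# hypothetical held-out set. predict() is a stand-in for the deployed pipeline.
from collections import Counter

def exact_match(pred: str, ref: str) -> float:
    return float(pred.strip().lower() == ref.strip().lower())

def token_f1(pred: str, ref: str) -> float:
    p, r = pred.lower().split(), ref.lower().split()
    common = Counter(p) & Counter(r)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(r)
    return 2 * precision * recall / (precision + recall)

eval_set = [  # hypothetical examples; a real split would be much larger
    {"question": "What encryption is used at rest?", "reference": "AES-256"},
]

def predict(question: str) -> str:
    return "AES-256"  # stand-in for the model / RAG pipeline under evaluation

ems, f1s = [], []
for ex in eval_set:
    pred = predict(ex["question"])
    ems.append(exact_match(pred, ex["reference"]))
    f1s.append(token_f1(pred, ex["reference"]))

print(f"exact match: {sum(ems)/len(ems):.2f}  token F1: {sum(f1s)/len(f1s):.2f}")
```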
Engineering & Collaboration
Develop and maintain a production-grade ML codebase using Python, Git, and Linux-based environments.
Collaborate closely with product, engineering, and data teams to translate business problems into scalable ML solutions.
Contribute to architecture discussions around model hosting (self-hosted vs. cloud-managed), hardware requirements, and inference optimization.
Education
Bachelor's degree in a quantitative discipline such as Engineering, Computer Science, Information Technology, Statistics, or a related field.
Experience
3–5 years of hands-on experience in Machine Learning or Applied NLP.
Demonstrated experience building and deploying ML or NLP models in production environments.
Knowledge, Skills & Abilities
Required
Strong proficiency in Python for ML and data processing.
Solid understanding of core machine learning and deep learning concepts and algorithms.
Hands-on experience with NLP techniques and modern language models.
Experience using embeddings for NLP tasks such as semantic search or clustering.
Familiarity with Linux environments and Git-based version control.
Strong problem-solving, analytical, and debugging skills.
Good foundation in mathematics, including linear algebra, probability, and optimization.
Preferred / Nice to Have
Experience fine-tuning open-source LLMs and deploying them in production.
Experience designing and operating RAG-based systems.
Familiarity with LLM function calling / tool invocation patterns.
Experience evaluating LLMs using both automated metrics and human feedback.
Exposure to model hosting considerations, including GPU/CPU trade-offs and inference optimization.