Job Description
Hiring: AI-Augmented Test Engineer
Sengol.ai – India
Send your profile to careers@sengol.ai
Compensation: Equity (based on experience, TBD)
Experience: 5-6 years (required)
Location: Remote/Hybrid – Bengaluru, India
Type: Founding Team | Full-Time | Part-Time (conflict-free, highly committed professionals)
Do not apply if you do not have experience with AI-assisted coding.
About the Role
Sengol.ai is building an AI-native risk intelligence platform. As a Founding Engineer, you will work directly with the founders to test and certify the quality of the product.
We are seeking engineers who embrace AI-augmented testing, iterate quickly, and are ready to tackle complex, evolving challenges using AI-assisted coding workflows.
Interviews will focus on your ability to work with AI-driven development environments (e.g., Claude, Cursor), your test approach and methods in depth, and end-to-end integration testing.
You will work closely with AI researchers, product managers, and software engineers to continuously refine model behavior, enhance evaluation frameworks, and contribute to safer and more reliable AI-driven products.
Key Responsibilities
- Evaluate LLM and agentic AI outputs for correctness, coherence, safety, relevance, and adherence to product requirements.
- Design and run test plans, including manual evaluations, scripted tests, and automated QA workflows for AI output quality.
- Create and refine evaluation rubrics for model accuracy, hallucination detection, robustness, bias, and user experience.
- Identify issues, patterns, and model failure modes, and clearly document them for engineering and research teams.
- Develop synthetic test datasets and scenarios covering edge cases, adversarial prompts, and real-world user queries.
- Collaborate with model and product teams to iterate on improvements, define acceptance criteria, and validate model updates.
- Run regression tests on model versions to ensure expected improvements and detect any performance degradation.
- Contribute to tooling and automation, including prompt evaluation scripts, output comparison tools, and internal dashboards (a minimal sketch of such a script follows this list).
- Verify agentic workflows and tool-using AI behavior, ensuring correct reasoning, action calls, tool usage, and task sequencing.
- Participate in safety testing, including red-teaming tasks to probe for unsafe or unintended AI behaviors.
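To make the tooling and automation expectations concrete, here is a minimal sketch of the kind of prompt-evaluation script this role involves. It is illustrative only: the `call_model` stub, the rubric fields, and the sample test case are assumptions for the sketch, not Sengol.ai internals; in a real harness the stub would be replaced by an actual LLM client.

```python
"""Minimal prompt-evaluation harness (illustrative sketch only)."""

from dataclasses import dataclass, field


@dataclass
class TestCase:
    prompt: str
    must_contain: list[str]  # phrases a correct answer should include
    must_not_contain: list[str] = field(default_factory=list)  # e.g. known-wrong answers


def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM client call."""
    return "Paris is the capital of France."


def evaluate(case: TestCase) -> dict:
    """Score one case: required phrases present, forbidden phrases absent."""
    output = call_model(case.prompt)
    missing = [s for s in case.must_contain if s.lower() not in output.lower()]
    forbidden = [s for s in case.must_not_contain if s.lower() in output.lower()]
    return {
        "prompt": case.prompt,
        "output": output,
        "passed": not missing and not forbidden,
        "missing": missing,
        "forbidden": forbidden,
    }


if __name__ == "__main__":
    suite = [
        TestCase(
            prompt="What is the capital of France?",
            must_contain=["Paris"],
            must_not_contain=["Lyon"],  # a plausible-but-wrong answer
        ),
    ]
    results = [evaluate(c) for c in suite]
    passed = sum(r["passed"] for r in results)
    print(f"{passed}/{len(results)} cases passed")
    for r in results:
        if not r["passed"]:
            print("FAIL:", r["prompt"], "missing:", r["missing"], "forbidden:", r["forbidden"])
```

In practice, a harness like this would be run against each model version as a regression suite and its pass rates fed into internal dashboards.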
Required Qualifications
- Bachelor’s degree in Computer Science, Engineering, Mathematics, or a related field, or equivalent practical experience.
- Experience with testing complex software systems, preferably involving ML or automation.
- Strong analytical and problem-solving skills; ability to identify subtle issues in AI outputs.
- Familiarity with Python and ability to write scripts for data manipulation, test automation, or evaluation tasks.
- Understanding of AI/ML basics, LLM behavior, prompt engineering concepts, and evaluation methodologies.
- Excellent documentation skills and ability to communicate findings clearly to technical teams.
Preferred Qualifications
- Experience testing or evaluating LLMs, generative AI, or agent-based systems.
- Knowledge of automated evaluation frameworks, prompt-based testing, or RLHF-style methodologies.
- Background in safety evaluation, bias detection, or trust & safety for AI systems.
- Experience with API-based tools, reasoning agents, or tool-calling AI architectures.
- Ability to think like an adversary and design robust edge-case scenarios.