Job Description
Engagement: Hourly contract (independent contractor)
Location: Remote
Role Overview
We are seeking a hands-on Quality Engineering Lead to define, own, and execute the end-to-end quality strategy for full-stack and LLM-powered products. This role blends technical depth with leadership, ensuring robust automation, release readiness, and measurable quality standards across distributed systems.
You will drive automation frameworks, evaluation pipelines, and CI/CD quality gates while mentoring QA engineers and partnering cross-functionally with Engineering, Product, and Research teams.
Key Responsibilities
Own quality strategy for complex, distributed systems; define quality metrics such as SLOs, defect escape rate, DORA metrics, and coverage benchmarks
Design and implement automation frameworks and evaluation pipelines using Python and Node.js
Drive testing across all layers: E2E, frontend/UI, backend/API, and performance (load and stress testing)
Build and operate quality systems for LLM-powered products, including AI agents and multi-step agent workflows; MCP tools, servers, and integrations; and LLM-as-a-Judge automated evaluation harnesses
Define golden datasets, regression prompts, judge calibration strategies, and validation processes for probabilistic/non-deterministic behavior
Integrate automated quality systems into CI/CD pipelines with release gates, orchestration, and dashboards
Own release readiness processes and quality sign-off
Lead root-cause analysis, defect triage, and continuous improvement initiatives
Mentor QA engineers and SDETs; establish clear ownership boundaries and quality standards
Collaborate closely with Engineering, Product, and Research stakeholders
Required Qualifications
7+ years of experience in Quality Engineering, SDET, or Software Engineering roles
2–3+ years leading quality initiatives or owning quality strategy for complex systems
Strong development skills in Python and Node.js for automation and evaluation tooling
Proven experience defining end-to-end quality strategy for full-stack applications
Experience testing across E2E, UI, backend/API, and performance layers
2+ years building automated evaluation systems for LLM-powered products, including agents, workflows, MCP components, and LLM-as-a-Judge pipelines
Experience working with golden datasets, regression prompts, and judge calibration
Experience integrating QA systems into CI/CD pipelines with release gates
Experience mentoring QA/SDET teams
Excellent written and verbal communication skills in English
Engagement Terms
Contractor engagement (no medical benefits or paid leave)
Minimum 20 hours per week, at least 4 hours per day
4 hours of overlap with PST working hours required
Duration: 3 months (extendable based on performance and engagement)
Ready to Apply?
Don't miss this opportunity! Apply now and join our team.
Job Details
Posted Date: February 24, 2026
Job Type: Construction
Location: Indonesia
Company: Keystone Recruitment