Why I Built This Playbook
Quality engineering teams are under pressure from two directions at once.
First, they still need the basics:
- requirements mapped to tests
- reliable UI and API automation
- change-aware regression scope
- useful release-quality reporting
Second, they are now expected to adopt AI in a way that is practical, explainable, and actually maintainable.
That is why I built the AI Quality Engineering Playbook.
The goal was not to create a one-click “AI test generator.” The goal was to create a quality engineering system that can evolve in stages:
- start with deterministic value
- add live integrations
- add retrieval and embeddings
- add AI-assisted quality workflows
- move toward a true AI-native QE platform
Repo:
- AI Quality Engineering Playbook: github.com/amiya-pattnaik/ai-quality-engineering-playbook
The V1 Principle: Be Useful Before Being Intelligent
The first version was intentionally deterministic.
V1 focuses on:
- Jira-style requirement input from scenario JSON
- local Gherkin ingestion
- functional test case generation before automation
- Playwright or WebdriverIO UI generation
- Playwright or lightweight REST API generation
- traceability, impacted scope, and execution reporting
That decision matters.
Many AI testing efforts begin with orchestration, prompts, agents, and retrieval before teams even have a stable artifact flow. In practice, that creates complexity before trust.
V1 avoids that trap. It gives a team something concrete:
- a runnable CLI
- generated test assets
- report outputs
- a known contract
- a stable baseline for future evolution
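To make the deterministic path concrete, here is a minimal sketch of turning a requirement scenario into functional test cases before any automation is generated. The JSON shape, the `FunctionalCase` type, and the `generate_test_cases` helper are all assumptions made for illustration; the playbook's actual contract may look different.

```python
import json
from dataclasses import dataclass

# Hypothetical scenario shape; the playbook's real JSON contract may differ.
SCENARIO_JSON = """
{
  "id": "QE-101",
  "title": "User login",
  "acceptance_criteria": [
    "Valid credentials open the dashboard",
    "Invalid password shows an error message"
  ]
}
"""


@dataclass(frozen=True)
class FunctionalCase:
    case_id: str
    requirement_id: str  # keeps the requirement-to-test trace explicit
    title: str


def generate_test_cases(scenario: dict) -> list[FunctionalCase]:
    """Emit one test case per acceptance criterion, in input order."""
    return [
        FunctionalCase(
            case_id=f"{scenario['id']}-TC{index:02d}",
            requirement_id=scenario["id"],
            title=f"Verify: {criterion}",
        )
        for index, criterion in enumerate(scenario["acceptance_criteria"], start=1)
    ]


scenario = json.loads(SCENARIO_JSON)
cases = generate_test_cases(scenario)
for case in cases:
    print(case.case_id, "->", case.requirement_id, "|", case.title)
```

Because there is no randomness and no model call, the same scenario always produces the same case IDs, which is what makes the output usable as a baseline contract.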
The V2 Principle: Add AI in Layers
Once the deterministic path was stable, the next step was to introduce the architecture that a real AI-native quality platform needs.
The V2 branch now includes:
- live-capable Jira, GitHub, and SonarQube connector scaffolds
- environment-driven configuration
- retrieval indexing and context selection
- hybrid retrieval using keyword scoring plus embeddings
- embedding providers for local, OpenAI, and Ollama paths
- AI assist mode for functional test expansion
- an agent workflow shell for requirement, change, and quality analysis
This is the key idea: the V2 branch does not replace V1. It extends it.
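The hybrid retrieval item in the list above can be sketched as a weighted blend of a keyword score and an embedding similarity. This is only an illustration of the general technique, not the branch's actual scoring code: the `alpha` weight, the function names, and the toy three-dimensional vectors standing in for real embeddings are all assumptions.

```python
import math


def keyword_score(query: str, doc: str) -> float:
    """Fraction of query terms that also appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_rank(query, query_vec, docs, alpha=0.5):
    """Rank (text, vector) pairs by alpha * keyword + (1 - alpha) * cosine."""
    scored = [
        (alpha * keyword_score(query, text) + (1 - alpha) * cosine(query_vec, vec), text)
        for text, vec in docs
    ]
    return [text for _, text in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Toy vectors stand in for real embeddings (assumption for illustration).
docs = [
    ("login page rejects invalid password", [0.9, 0.1, 0.0]),
    ("checkout applies discount code", [0.0, 0.2, 0.9]),
    ("user authentication session timeout", [0.8, 0.3, 0.1]),
]
ranked = hybrid_rank("login password", [1.0, 0.0, 0.0], docs)
for text in ranked:
    print(text)
```

In this toy data the authentication document shares no keywords with the query, yet it still ranks above the checkout document because its vector is close to the query vector. That is the gap the embedding half of the score is meant to cover.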
Why This Approach Is More Practical
Most teams do not need “AI for everything” on day one.
They need a path that answers questions like:
- what should be tested from the requirement?
- what can be generated safely?
- what changed in the code?
- what tests are impacted?
- what quality signals matter before release?
That is why the playbook is structured as an engineering progression instead of a demo script.
Requirements + Specs
|
+--> Deterministic functional test generation
|
+--> Automation generation
|
+--> Change-aware impact analysis
|
+--> Retrieval and embeddings
|
+--> AI-assisted expansion and decision support
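The change-aware impact step in the flow above can be sketched as a lookup from changed files to the tests that cover them. The coverage map, file paths, and `impacted_tests` helper below are hypothetical; in practice the map would come from traceability data and the changed files from a diff.

```python
# Hypothetical traceability map: test file -> source files it exercises.
TEST_COVERAGE = {
    "tests/login.spec": {"src/auth/login.py", "src/auth/session.py"},
    "tests/checkout.spec": {"src/cart/checkout.py"},
    "tests/profile.spec": {"src/auth/session.py", "src/user/profile.py"},
}


def impacted_tests(changed_files, coverage):
    """Return, in sorted order, every test that touches a changed file."""
    changed = set(changed_files)
    return sorted(test for test, sources in coverage.items() if sources & changed)


impacted = impacted_tests(["src/auth/session.py"], TEST_COVERAGE)
print(impacted)
```

A change to the shared session module selects both the login and profile tests and leaves checkout out of the regression scope.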
Retrieval and Embeddings: Local, Ollama, and OpenAI
One design choice I cared about was flexibility.
The retrieval layer now supports:
- local
  - fully offline
  - deterministic local vectors
  - no external dependency
- ollama
  - local-runtime embeddings
  - offline-capable after local setup
  - useful for privacy-conscious or self-hosted workflows
- openai
  - provider-backed embeddings
  - useful when teams want a managed cloud path
That gives the playbook a practical operating range:
- local experiments
- enterprise-friendly private runtime
- managed cloud retrieval
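One way to picture the provider choice is a small registry keyed by provider name, with a hash-based embedder as the offline default. This is a sketch under stated assumptions: the `EMBEDDING_PROVIDER` variable name, the 16-dimension hashing scheme, and the function names are invented for illustration, and the Ollama and OpenAI entries are left as stubs instead of guessing at real client calls.

```python
import hashlib
import math
import os


def local_embed(text: str, dims: int = 16) -> list[float]:
    """Hash each token into a bucket so the same text always yields the same vector."""
    vec = [0.0] * dims
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def remote_stub(name: str):
    """Placeholder for a provider whose client wiring is outside this sketch."""
    def embed(text: str) -> list[float]:
        raise NotImplementedError(f"{name} embedding call is not wired in this sketch")
    return embed


PROVIDERS = {
    "local": local_embed,
    "ollama": remote_stub("ollama"),
    "openai": remote_stub("openai"),
}


def get_embedder(provider=None):
    """Pick a provider by argument, else by the (hypothetical) EMBEDDING_PROVIDER variable."""
    name = (provider or os.environ.get("EMBEDDING_PROVIDER", "local")).lower()
    if name not in PROVIDERS:
        raise ValueError(f"unknown embedding provider: {name}")
    return PROVIDERS[name]


embed = get_embedder("local")
vector = embed("login with valid credentials")
print(len(vector))
```

Hashed vectors carry no semantic meaning, so they only match on shared tokens. Their value is that retrieval stays runnable and repeatable with no network and no model installed.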
Why This Matters for Quality Engineering
I do not think the future of testing is “generate scripts from prompts.”
I think the future is a connected quality system that can:
- understand requirements
- preserve traceability
- reason about changes
- prioritize coverage
- expand test ideas safely
- support release decisions with evidence
That is the real shift from test generation to AI-native quality engineering.
What Still Comes Next
The current V2 branch is a foundation, not the end state.
Still ahead:
- external vector database integration
- deeper RAG over larger project artifacts
- stronger live provider-backed LLM generation
- richer GitHub change analysis
- fuller Restlyn-style API capabilities
- true multi-agent QE execution flows
Closing Thought
The biggest lesson from building this playbook is simple:
AI in quality engineering should be introduced as an engineering system, not as a disconnected experiment.
That means:
- stable inputs
- deterministic fallbacks
- clear retrieval behavior
- traceable outputs
- incremental architecture
Once those exist, GenAI, RAG, and agentic workflows become useful in a way that teams can actually adopt.
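The "deterministic fallbacks" item can be sketched as a wrapper that always computes the deterministic baseline first and treats the AI step as optional. The function names, the mode labels, and the `ai_assist` callable are assumptions for illustration, not the playbook's interface.

```python
from typing import Callable, Optional


def deterministic_ideas(requirement: str) -> list[str]:
    """Baseline ideas that need no model and never vary between runs."""
    return [
        f"Verify happy path: {requirement}",
        f"Verify invalid input: {requirement}",
    ]


def expand_test_ideas(
    requirement: str,
    ai_assist: Optional[Callable[[str], list[str]]] = None,
) -> tuple[list[str], str]:
    """Return (ideas, mode); the mode records which path produced the ideas."""
    baseline = deterministic_ideas(requirement)
    if ai_assist is None:
        return baseline, "deterministic"
    try:
        extra = ai_assist(requirement)
    except Exception:
        # Any provider failure falls back to the baseline instead of failing the run.
        return baseline, "deterministic-fallback"
    merged = baseline + [idea for idea in extra if idea not in baseline]
    return merged, "ai-assisted"


def failing_assist(requirement: str) -> list[str]:
    """Simulates an unavailable provider."""
    raise TimeoutError("provider unavailable")


ideas, mode = expand_test_ideas("User login", ai_assist=failing_assist)
print(mode, len(ideas))
```

Returning the mode alongside the ideas keeps the output traceable: a report can show whether a given test idea came from the deterministic path or from AI assistance.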