What Is Agentic AI (and Why It Helps Engineering)

Agentic AI orchestrates multiple specialized agents that reason over your real signals and pass context forward, rather than answering a series of one-off prompts. The benefits for engineering teams:

  • Context continuity: Each step (metrics, discovery, engineering, quality, platform) inherits prior findings, so outputs stay grounded.
  • Signal-driven: Agents consume repo signals, metrics, and external links (CI/Sonar/Fortify) to avoid hallucinated guidance.
  • Actionable artifacts: Plans, guardrails, and tests are generated as Markdown/HTML plus executable Playwright specs—things you can drop into CI.
  • Modular and extensible: Swap models (mock/OpenAI), add tools (log readers, CI APIs), or new agents without rewriting the flow.

How Agentic AI Works

The core mechanism of an agentic system typically involves several key components (a minimal code sketch of the resulting loop follows the list):

  • Goal Definition: The user or system provides a high-level objective (e.g., “Resolve the user bug report”).
  • Planning/Reasoning: Using a Large Language Model (LLM) or similar reasoning engine, the agent breaks the goal down into a series of smaller, manageable steps.
  • Action Execution: The agent uses tools, APIs, or code execution environments to interact with the external environment (e.g., searching documentation, writing code, running tests).
  • Observation/Feedback: The agent observes the results of its actions and compares them against the goal state.
  • Reflection/Iteration: If the goal isn’t met, the agent reflects on its observations and revises its plan, continuing the cycle until completion or failure. This iterative process allows the AI to self-correct and handle unforeseen circumstances, making it more robust and capable than non-agentic AI in complex, real-world tasks.
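
As an illustration, that loop fits in a few lines of Node.js. This is a minimal sketch, not code from the playbook repo; planNextStep, tools, and isGoalMet are hypothetical stand-ins for whatever reasoning engine and tool set you wire up:

// Minimal agent-loop sketch (hypothetical helpers, not from the playbook repo)
async function runAgent(goal, { planNextStep, tools, isGoalMet, maxSteps = 5 }) {
  const observations = [];
  for (let i = 0; i < maxSteps; i++) {
    // Planning/Reasoning: ask the model for the next step, given the goal and history
    const step = await planNextStep(goal, observations);
    // Action Execution: run the tool the plan names (search docs, write code, run tests)
    const result = await tools[step.tool](step.args);
    // Observation/Feedback: record what happened
    observations.push({ step, result });
    // Reflection/Iteration: stop when the goal is met, otherwise replan and loop
    if (await isGoalMet(goal, observations)) return { done: true, observations };
  }
  return { done: false, observations };
}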

Why Agentic AI Here

We built a small agentic workflow to show how agentic AI can assist software engineering, quality engineering, and platform engineering teams end to end. Instead of a single prompt, a chain of agents passes context forward: metrics → discovery → engineering → quality → platform → test design → summary. The demo runs locally with a mock model, or with OpenAI if you drop in a key, and it ships reports plus auto-generated Playwright tests.

What’s in the Playbook

  • Scenario-first: A JSON scenario defines the problem, stack, constraints, and signals (repo links, CI/Sonar/Fortify, etc.); a sample shape is sketched after this list.
  • Agent chain: Each agent consumes prior outputs and emits its slice (risks, plans, guardrails, tests).
  • Signals-aware: Metrics and external signals are injected so the agents reason over real inputs, not generic lore.
  • Test generation: Optional --run-tests auto-generates REST API / UI (Playwright) specs and runs them; results land in the report.
  • Offline-friendly: Mock model by default; flip to OpenAI via config/model.json or OPENAI_API_KEY.
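
For a sense of the shape, a scenario might look roughly like this. The field values below are invented for illustration; the key names follow the extension notes later in this post, and the real structure is in scenarios/banking-app.json:

{
  "goal": "Reduce login failures in the demo banking app",
  "techStack": ["Node.js", "Express", "Playwright"],
  "constraints": ["No database schema changes", "Keep p95 login latency under 500 ms"],
  "inputs": {
    "repoSignals": ["flaky login E2E suite", "rising 401 rate on /api/account"]
  }
}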

How the Flow Runs

  1. Load scenario + signals: scenarios/banking-app.json sets goal/tech/constraints; optional metrics (data/metrics.json) and links (config/signals.json) are attached.
  2. Chain agents: Metrics → Discovery → Engineering → Quality → Platform → TestDesigner → Summary. Each step gets prior outputs plus signals (see the sketch after this list).
  3. Model calls: Mock responses by default; OpenAI if configured.
  4. Render: Markdown (and optional HTML) report in reports/, with metrics status, plans, risks, and generated tests.
  5. (Optional) Tests: --run-tests generates Playwright specs in tests/generated/ and runs them; results are stitched into the report.
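
In code terms, the chain is just a sequence of calls in which each agent sees everything produced so far. This is a simplified sketch, not the repo's exact API; assume agents is an ordered array of objects shaped like { name, run(context, model) }:

// Simplified chain sketch: each agent receives the scenario, signals, and all prior outputs
async function runChain(scenario, signals, model, agents) {
  // agents run in order: Metrics → Discovery → Engineering → Quality → Platform → TestDesigner → Summary
  const context = { scenario, signals, outputs: {} };
  for (const agent of agents) {
    // Each agent builds its prompt from the shared context and calls model.generate()
    context.outputs[agent.name] = await agent.run(context, model);
  }
  return context.outputs; // rendered into the Markdown/HTML report
}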

Demo Web App

  • Static banking UI in web/; run npm run web, then open http://localhost:3000 (the login credentials are shown on the card).
  • Use it as context when running the agent chain; with --run-tests, the Playwright specs hit the UI (login/dashboard) and the /api/account endpoint (an example spec is sketched below).
  • Repo: github.com/amiya-pattnaik/agentic-engineering-playbook
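
With --run-tests, the generated specs land in tests/generated/. A hand-written equivalent gives a feel for what they contain; the selectors, credentials, and assertions below are placeholders for whatever the demo UI actually exposes:

// Illustrative Playwright spec; the real generated specs live in tests/generated/
const { test, expect } = require('@playwright/test');

test('login reaches the dashboard', async ({ page }) => {
  await page.goto('http://localhost:3000');
  await page.fill('#username', 'demo');       // placeholder selector and credentials
  await page.fill('#password', 'demo123');    // placeholder selector and credentials
  await page.click('button[type="submit"]');
  await expect(page.locator('h1')).toContainText('Dashboard'); // placeholder assertion
});

test('/api/account returns account data', async ({ request }) => {
  const res = await request.get('http://localhost:3000/api/account');
  expect(res.ok()).toBeTruthy();
});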

Why This Matters for Teams

  • Product-aligned outputs: Plans and risks stay anchored to your metrics and signals, not generic advice.
  • Guardrails baked in: Platform and quality agents add policies (CI/CD, security, budgets) alongside engineering steps.
  • Automation-friendly: Tests and reports are artifacts you can drop into CI; mock mode keeps it offline for demos.
  • Extensible: Add scenarios for your services, wire real tools (logs, CI APIs, SLOs), and swap the model client without changing the flow.

Quickstart (clone + run)

# clone and install
git clone https://github.com/amiya-pattnaik/agentic-engineering-playbook.git
cd agentic-engineering-playbook
npm install

# mock model, Markdown report
node src/run.js scenarios/banking-app.json --metrics data/metrics.json

# Markdown + HTML report
node src/run.js scenarios/banking-app.json --metrics data/metrics.json --html

# Add external signals + auto-generated Playwright tests
node src/run.js scenarios/banking-app.json --metrics data/metrics.json --signals config/signals.json --html --run-tests

# Shortcut for metrics + html + tests
npm run demo:tests

Use OpenAI instead of the mock model:

cp config/model.example.json config/model.json   # add your key
node src/run.js scenarios/banking-app.json --metrics data/metrics.json
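
If you prefer not to keep a key in a config file, the environment-variable route mentioned earlier works too (the key value below is a placeholder):

# alternative: supply the key via OPENAI_API_KEY instead of config/model.json
export OPENAI_API_KEY=sk-your-key-here
node src/run.js scenarios/banking-app.json --metrics data/metrics.json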

How to Extend

  • Add more scenarios under scenarios/*.json with goal, constraints, techStack, and inputs.repoSignals.
  • Teach agents to read real repo files or CI logs by hooking into src/tools.js (a sample tool is sketched after this list).
  • Point signals at your GitHub/Sonar/Fortify endpoints in config/signals.json.
  • Run the generated tests: install the browser once with npx playwright install chromium; after that, --run-tests or npm run demo:tests will generate and execute specs.
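
As a sketch of that src/tools.js hook, a CI-log reader against the GitHub Actions API could look roughly like this. The option names and export shape are illustrative (match whatever src/tools.js already does), and global fetch assumes Node 18+:

// Hypothetical tool for src/tools.js: summarize a CI run so an agent can reason over it
async function readCiRun({ repo, runId, token }) {
  const res = await fetch(`https://api.github.com/repos/${repo}/actions/runs/${runId}`, {
    headers: {
      Authorization: `Bearer ${token}`,
      Accept: 'application/vnd.github+json',
      'User-Agent': 'agentic-playbook-demo',
    },
  });
  if (!res.ok) throw new Error(`CI API returned ${res.status}`);
  const run = await res.json();
  // Keep only the fields an agent needs, to stay within the prompt budget
  return { status: run.status, conclusion: run.conclusion, url: run.html_url };
}

module.exports = { readCiRun };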

Notes on other LLMs (Claude, Gemini, etc.)

The flow is model-agnostic: models.js exposes a generate() method and selects which model implementation to use. To add another provider:

  • Implement a new model class (e.g., ClaudeModel, GeminiModel) that mirrors OpenAIModel: take a key and model name, call the provider’s chat endpoint, and return text (see the sketch below).
  • Update selectModel() to check ANTHROPIC_API_KEY or GEMINI_API_KEY (or config/model.json) before falling back to mock.
  • Add provider-specific settings (max tokens, safety filters) to config/model.json.

Agents stay unchanged—they just call generate().
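
As an illustration, an Anthropic-backed class could mirror that contract as follows. This is a sketch, not the repo's code: the request/response fields follow Anthropic's Messages API at the time of writing (verify against the current docs), and the class shape simply mirrors the OpenAIModel description above. Global fetch again assumes Node 18+:

// Sketch of a Claude-backed model implementing the generate() contract described above
class ClaudeModel {
  constructor(apiKey, model = 'claude-3-5-sonnet-latest') {
    this.apiKey = apiKey;
    this.model = model;
  }

  async generate(prompt) {
    const res = await fetch('https://api.anthropic.com/v1/messages', {
      method: 'POST',
      headers: {
        'x-api-key': this.apiKey,
        'anthropic-version': '2023-06-01',
        'content-type': 'application/json',
      },
      body: JSON.stringify({
        model: this.model,
        max_tokens: 1024,
        messages: [{ role: 'user', content: prompt }],
      }),
    });
    if (!res.ok) throw new Error(`Anthropic API returned ${res.status}`);
    const data = await res.json();
    return data.content[0].text; // plain text back to the agents
  }
}

// In selectModel(): prefer ANTHROPIC_API_KEY when present, before falling back to mock, e.g.
// if (process.env.ANTHROPIC_API_KEY) return new ClaudeModel(process.env.ANTHROPIC_API_KEY);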

Closing Thought

Agentic AI doesn’t have to be abstract. A small, auditable chain—fed by your signals and capped with tests—shows how AI can assist engineering, QA, and platform teams without hand-waving. Start with mock mode, layer in your real signals, then graduate to your preferred model when you’re ready.