This tutorial expands on the playbook from “Designing Reliable AI Hand-offs” with a runnable Node.js project. You will stand up a lightweight moderation service that routes risky decisions to a human reviewer while logging telemetry for future model tuning.
What we will build
- An Express-style API (Fastify) that accepts content submissions.
- A mock AI classifier to flag submissions for review.
- A SQLite-backed review queue with a minimal CLI so you can role-play the reviewer.
- A telemetry sink that records every decision—accepted, edited, or rejected.
The goal is to give you a scaffolding you can extend with real model endpoints, ticketing systems, or messaging queues.
Prerequisites
- Node.js 20+
- pnpm or npm
- SQLite (bundled with macOS/Linux; Windows users can install from sqlite.org)
Project setup
mkdir hitl-moderation && cd hitl-moderation
pnpm init
pnpm add fastify @fastify/sensible better-sqlite3 pino pino-pretty nanoid zod
pnpm add -D typescript tsx @types/node @types/better-sqlite3
Initialize TypeScript and basic scripts:
pnpm exec tsc --init --rootDir src --outDir dist --moduleResolution node --module esnext --target es2022 --esModuleInterop
Update the package.json scripts:
"scripts": {
"dev": "tsx watch src/server.ts",
"reviewer": "tsx src/reviewer-cli.ts"
}
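The sources use ESM import syntax throughout. tsx handles this transparently, but if you later compile with tsc and run the emitted dist/ files with plain node, package.json should also declare the package as a module:
"type": "module"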
Create the directory structure:
mkdir -p src/lib src/routes src/store
Data access layer
src/store/db.ts
import Database from 'better-sqlite3';
export const db = new Database('moderation.db');
db.pragma('journal_mode = WAL');
db.exec(`
CREATE TABLE IF NOT EXISTS submissions (
id TEXT PRIMARY KEY,
payload TEXT NOT NULL,
ai_decision TEXT NOT NULL,
ai_confidence REAL NOT NULL,
status TEXT NOT NULL DEFAULT 'pending',
created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS reviews (
id TEXT PRIMARY KEY,
submission_id TEXT NOT NULL,
reviewer TEXT NOT NULL,
disposition TEXT NOT NULL,
notes TEXT,
created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (submission_id) REFERENCES submissions(id)
);
`);
src/store/submissions.ts
import { db } from './db.js';
import { nanoid } from 'nanoid';
const insertStmt = db.prepare(`
INSERT INTO submissions (id, payload, ai_decision, ai_confidence, status)
VALUES (@id, @payload, @aiDecision, @aiConfidence, @status)
`);
export const createSubmission = ({ payload, aiDecision, aiConfidence }: {
payload: Record<string, unknown>;
aiDecision: string;
aiConfidence: number;
}) => {
const id = nanoid();
insertStmt.run({
id,
payload: JSON.stringify(payload),
aiDecision,
aiConfidence,
    // Low confidence or an explicit escalation waits for a human; everything else auto-completes.
    status: aiConfidence < 0.75 || aiDecision === 'escalate' ? 'pending' : 'auto',
});
return id;
};
export const listPending = () => db.prepare(`
SELECT id, payload, ai_decision as aiDecision, ai_confidence as aiConfidence, status
FROM submissions WHERE status = 'pending' ORDER BY created_at ASC
`).all();
export const markCompleted = (id: string, disposition: string) => db.prepare(`
UPDATE submissions SET status = @disposition WHERE id = @id
`).run({ id, disposition });
src/store/reviews.ts
import { db } from './db.js';
import { nanoid } from 'nanoid';
const insertReview = db.prepare(`
INSERT INTO reviews (id, submission_id, reviewer, disposition, notes)
VALUES (@id, @submissionId, @reviewer, @disposition, @notes)
`);
export const recordReview = ({
submissionId,
reviewer,
disposition,
notes,
}: {
submissionId: string;
reviewer: string;
disposition: string;
notes?: string;
}) => {
const id = nanoid();
  // better-sqlite3 rejects undefined bindings, so store missing notes as NULL.
  insertReview.run({ id, submissionId, reviewer, disposition, notes: notes ?? null });
return id;
};
Mock AI classifier
src/lib/moderator.ts
const riskyKeywords = ['payment', 'ssn', 'credit card', 'password'];
type ModerationDecision = {
decision: 'auto' | 'escalate';
confidence: number;
};
export const moderate = (text: string): ModerationDecision => {
const normalized = text.toLowerCase();
const hit = riskyKeywords.find((keyword) => normalized.includes(keyword));
if (hit) {
return { decision: 'escalate', confidence: 0.6 };
}
return { decision: 'auto', confidence: 0.92 };
};
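Before wiring the classifier into a route, it is worth a quick sanity check. Here is a minimal smoke test using Node's built-in test runner; the file name src/lib/moderator.test.ts is a hypothetical choice and is not required by the rest of the tutorial:
import { test } from 'node:test';
import assert from 'node:assert/strict';
import { moderate } from './moderator.js';

test('escalates text containing a risky keyword', () => {
  const result = moderate('Customer emailed their credit card number');
  assert.equal(result.decision, 'escalate');
});

test('auto-approves benign text', () => {
  assert.equal(moderate('Please update the shipping address').decision, 'auto');
});
Run it with, for example, pnpm exec tsx --test src/lib/moderator.test.ts.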
Telemetry helper
src/lib/telemetry.ts
import pino from 'pino';
export const telemetry = pino({ transport: { target: 'pino-pretty' } });
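The pretty transport is ideal for watching the loop locally. If you also want telemetry persisted for the model-tuning loop mentioned in the intro, one possible variant fans the same events out to a JSON-lines file via pino's built-in pino/file target; the telemetry.log path here is an arbitrary choice:
import pino from 'pino';

// Variant: pretty console output for development, plus raw JSON lines
// appended to telemetry.log for later analysis.
export const telemetry = pino({
  transport: {
    targets: [
      { target: 'pino-pretty' },
      { target: 'pino/file', options: { destination: 'telemetry.log' } },
    ],
  },
});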
Fastify route
src/routes/moderation.ts
import { type FastifyInstance } from 'fastify';
import { z } from 'zod';
import { moderate } from '../lib/moderator.js';
import { createSubmission } from '../store/submissions.js';
import { telemetry } from '../lib/telemetry.js';
const submissionSchema = z.object({
summary: z.string().min(4),
content: z.string().min(8),
submittedBy: z.string().min(2),
});
export default async function moderationRoutes(app: FastifyInstance) {
app.post('/api/moderate', async (request, reply) => {
const parsed = submissionSchema.safeParse(request.body);
if (!parsed.success) {
return reply.badRequest(parsed.error.issues.map((issue) => issue.message).join(', '));
}
const { summary, content, submittedBy } = parsed.data;
const result = moderate(`${summary} ${content}`);
const submissionId = createSubmission({
payload: { summary, content, submittedBy },
aiDecision: result.decision,
aiConfidence: result.confidence,
});
telemetry.info({ event: 'submission.created', submissionId, result });
return reply.send({ submissionId, result });
});
}
Server bootstrap
src/server.ts
import Fastify from 'fastify';
import sensible from '@fastify/sensible';
import moderationRoutes from './routes/moderation.js';
import './store/db.js'; // ensures tables exist
const app = Fastify({ logger: true });
app.register(sensible);
app.register(moderationRoutes);
const start = async () => {
try {
await app.listen({ port: 3000 });
app.log.info('Moderation API running at http://localhost:3000');
} catch (err) {
app.log.error(err);
process.exit(1);
}
};
start();
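Optionally, you can close Fastify cleanly (flushing in-flight requests and logs) when the process is interrupted. A small addition at the bottom of src/server.ts, assuming you want this hardening:
// Optional: shut down cleanly on Ctrl+C or a container stop signal.
for (const signal of ['SIGINT', 'SIGTERM'] as const) {
  process.on(signal, async () => {
    await app.close();
    process.exit(0);
  });
}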
Reviewer CLI
src/reviewer-cli.ts
import readline from 'node:readline/promises';
import { stdin as input, stdout as output } from 'node:process';
import { listPending, markCompleted } from './store/submissions.js';
import { recordReview } from './store/reviews.js';
import { telemetry } from './lib/telemetry.js';
const rl = readline.createInterface({ input, output });
const prompt = async (question: string) => rl.question(`${question}: `);
const main = async () => {
const items = listPending();
if (!items.length) {
console.log('No pending submissions.');
rl.close();
return;
}
for (const item of items) {
console.log('\nSubmission:', item.id);
const payload = JSON.parse(item.payload as string);
console.log(JSON.stringify(payload, null, 2));
console.log(`AI decision: ${item.aiDecision} (${(item.aiConfidence * 100).toFixed(1)}%)`);
const disposition = await prompt('Disposition (approve/flag/reject)');
const notes = await prompt('Reviewer notes');
const reviewer = await prompt('Reviewer name');
recordReview({
submissionId: item.id,
reviewer,
disposition,
notes,
});
markCompleted(item.id, disposition);
telemetry.info({
event: 'review.completed',
submissionId: item.id,
disposition,
reviewer,
});
}
rl.close();
};
main().catch((error) => {
console.error(error);
rl.close();
});
Try it out
- Start the API:
pnpm dev
- Submit content (API client, curl, or HTTPie):
curl -X POST http://localhost:3000/api/moderate \
-H 'Content-Type: application/json' \
-d '{
"summary": "Support ticket",
"content": "Customer emailed their credit card number for a refund.",
"submittedBy": "agent-44"
}'
The response shows the AI decision and submission ID. Anything risky remains pending.
- Run the reviewer CLI in another terminal:
pnpm reviewer
- Record the human decision, then inspect telemetry output in your API terminal. You will see structured logs documenting both the AI suggestion and the reviewer override.
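Because reviews and submissions share one SQLite file, you can also audit how often humans override the AI straight from the database. A sketch, assuming a hypothetical src/lib/scorecard.ts:
import { db } from '../store/db.js';

// Counts each (AI decision, human disposition) pair, e.g. how many
// 'escalate' suggestions ended up approved versus rejected.
type AgreementRow = { aiDecision: string; disposition: string; total: number };

export const agreementReport = (): AgreementRow[] => db.prepare(`
  SELECT s.ai_decision AS aiDecision, r.disposition, COUNT(*) AS total
  FROM reviews r
  JOIN submissions s ON s.id = r.submission_id
  GROUP BY s.ai_decision, r.disposition
`).all() as AgreementRow[];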
Next steps
- Replace the keyword-based classifier with a hosted LLM or in-house moderation endpoint (see the sketch after this list).
- Swap SQLite for your queue of choice: Postgres, DynamoDB, or RabbitMQ.
- Push telemetry to Kafka or a warehouse, then build dashboards for the scorecard metrics.
- Add authentication, rate limiting, and a React or Next.js reviewer console.
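For the first of those steps, here is one hedged sketch of what swapping moderate() for a hosted endpoint might look like. The URL, the MODERATION_URL environment variable, and the response shape are placeholders, not a real provider's API:
// Hypothetical async replacement for src/lib/moderator.ts. The endpoint,
// MODERATION_URL, and response shape are placeholders; adapt them to your provider.
type ModerationDecision = {
  decision: 'auto' | 'escalate';
  confidence: number;
};

export const moderate = async (text: string): Promise<ModerationDecision> => {
  const res = await fetch(process.env.MODERATION_URL ?? 'http://localhost:8080/classify', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }),
  });
  if (!res.ok) {
    // Fail closed: when the model is unreachable, route to a human.
    return { decision: 'escalate', confidence: 0 };
  }
  return (await res.json()) as ModerationDecision;
};
Since this version is async, the route handler would need to call await moderate(...).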
By layering this Node.js loop beneath your HITL workflow, you transform the conceptual scorecard into a concrete platform that combines AI assistance, human judgment, and observability. Iterate on the data and UI, keep feedback cycles tight, and your users will feel the difference.