---
name: audit-documentation
description: Run a multi-stage parallel documentation audit with 18 specialized agents
---
# Multi-Stage Parallel Documentation Audit

**Version:** 2.0
**Total Agents:** 18 parallel agents across 5 stages + 1 synthesis stage
## Overview
This audit uses parallel agent execution across 6 stages to comprehensively analyze documentation quality, accuracy, and lifecycle status. Each stage produces JSONL output that feeds into the final synthesis.
**Output Directory:** `docs/audits/single-session/documentation/audit-[YYYY-MM-DD]/`
## Pre-Audit Setup

### Step 0: Episodic Memory Search (Session #128)
Before running the documentation audit, search for context from past sessions:

```javascript
// Search for past documentation audit findings
mcp__plugin_episodic-memory_episodic-memory__search({
  query: ["documentation audit", "stale docs", "broken links"],
  limit: 5,
});

// Search for doc structure decisions
mcp__plugin_episodic-memory_episodic-memory__search({
  query: ["DOCUMENTATION_STANDARDS", "tier", "lifecycle"],
  limit: 5,
});
```
**Why this matters:**
- Compare against previous doc health metrics
- Identify recurring documentation gaps
- Track which docs were flagged for updates before
- Prevent re-flagging known orphans or intentional gaps
### Step 1: Create Output Directory

```bash
AUDIT_DIR="docs/audits/single-session/documentation/audit-$(date +%Y-%m-%d)"
mkdir -p "$AUDIT_DIR"
echo "Audit output: $AUDIT_DIR"
```
### Step 2: Load False Positives Database

Read `docs/audits/FALSE_POSITIVES.jsonl` and note patterns to exclude from findings (filter by category: `documentation`).
### Step 3: Check Thresholds

Run `npm run review:check` - proceed regardless of the result (the user invoked this audit intentionally).
## Stage 1: Inventory & Baseline (3 Parallel Agents)

Launch these 3 agents in parallel:
### Agent 1A: Document Inventory
Task: Build complete document catalog
Count all .md files by directory and tier:
- Root level: ROADMAP.md, README.md, etc.
- docs/: by subdirectory
- .claude/: skills, plans
Extract metadata from each:
- Version number (if present)
- Last Updated date (if present)
- Status field (if present)
- Word count
Output: ${AUDIT_DIR}/stage-1-inventory.md
Format: Markdown summary with counts and file list
### Agent 1B: Baseline Metrics

Task: Capture current state via existing tools

```bash
# Run these commands and capture output:
npm run docs:check > ${AUDIT_DIR}/baseline-docs-check.txt 2>&1
npm run docs:sync-check > ${AUDIT_DIR}/baseline-sync-check.txt 2>&1
npm run format:check -- docs/ > ${AUDIT_DIR}/baseline-format-check.txt 2>&1

# Check DOCUMENTATION_INDEX.md for orphans
grep -c "orphan" docs/DOCUMENTATION_INDEX.md || echo "0"
```
Output: ${AUDIT_DIR}/stage-1-baselines.md
### Agent 1C: Link Extraction
Task: Build link graph for later stages
Extract from all .md files:
1. Internal links: [text](path.md) -> list with source file:line
2. External URLs: https://... -> list with source file:line
3. Anchor links: #section -> list with source file:line
Output: ${AUDIT_DIR}/stage-1-links.json
Schema:

```json
{
  "internal": [{ "source": "file.md", "line": 1, "target": "other.md", "text": "..." }],
  "external": [{ "source": "file.md", "line": 1, "url": "https://...", "text": "..." }],
  "anchors": [{ "source": "file.md", "line": 1, "anchor": "#section", "text": "..." }]
}
```
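A minimal Node sketch of this extraction, assuming a simple inline-link regex is sufficient (`extractLinks` is a hypothetical helper; the real audit may use any equivalent):

```javascript
// Sketch: classify markdown links into the internal/external/anchor buckets
// of the schema above. Handles inline [text](target) links only.
function extractLinks(text, source) {
  const out = { internal: [], external: [], anchors: [] };
  const linkRe = /\[([^\]]*)\]\(([^)\s]+)\)/g;
  text.split("\n").forEach((lineText, i) => {
    let m;
    while ((m = linkRe.exec(lineText)) !== null) {
      const [, label, target] = m;
      const entry = { source, line: i + 1, text: label };
      if (/^https?:\/\//.test(target)) {
        out.external.push({ ...entry, url: target });
      } else if (target.startsWith("#")) {
        out.anchors.push({ ...entry, anchor: target });
      } else {
        out.internal.push({ ...entry, target });
      }
    }
  });
  return out;
}
```

A production pass would also need reference-style links (`[text][ref]`) and should skip links that appear inside fenced code blocks.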
### Stage 1 Completion Audit

Before proceeding to Stage 2, verify:

- `stage-1-inventory.md` exists and is non-empty
- `stage-1-baselines.md` exists with metrics
- `stage-1-links.json` exists and is valid JSON
- Display summary: "Stage 1 Complete: X docs, Y internal links, Z external URLs"
## Stage 2: Link Validation (4 Parallel Agents)

Launch these 4 agents in parallel using Stage 1 outputs:
### Agent 2A: Internal Link Checker
Task: Verify internal .md links resolve
For each internal link from stage-1-links.json:
1. Check target file exists
2. If link has anchor (#section), verify heading exists in target
3. Detect circular references (A→B→C→A)
Output: ${AUDIT_DIR}/stage-2-internal-links.jsonl
JSONL schema per finding (JSONL_SCHEMA_STANDARD.md format):

```json
{
  "category": "documentation",
  "title": "Broken internal link to target.md",
  "fingerprint": "documentation::source.md::broken-link-target",
  "severity": "S1|S2",
  "effort": "E0",
  "confidence": 90,
  "files": ["source.md:123"],
  "why_it_matters": "Broken links frustrate readers and indicate stale documentation",
  "suggested_fix": "Update link to correct path or remove if target no longer exists",
  "acceptance_tests": ["Link resolves correctly", "No 404 when clicking"],
  "evidence": ["target: path.md", "resolved: /full/path.md"]
}
```
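Circular-reference detection (step 3) can be sketched as a depth-first walk over the internal link graph. `findCycles` is a hypothetical helper; the naive DFS below re-reports each cycle once per member and is only suitable for small doc graphs:

```javascript
// Sketch: find link chains that return to their starting document (A→B→C→A).
function findCycles(edges) {
  const graph = new Map();
  for (const { source, target } of edges) {
    if (!graph.has(source)) graph.set(source, []);
    graph.get(source).push(target);
  }
  const cycles = [];
  const visit = (node, stack) => {
    const idx = stack.indexOf(node);
    if (idx !== -1) {
      // Node already on the current path: record the closed loop.
      cycles.push([...stack.slice(idx), node]);
      return;
    }
    for (const next of graph.get(node) || []) visit(next, [...stack, node]);
  };
  for (const start of graph.keys()) visit(start, []);
  return cycles;
}
```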
### Agent 2B: External URL Checker

Task: HTTP HEAD requests to external URLs

```bash
# Use the external-link checking script
npm run docs:external-links -- --output ${AUDIT_DIR}/stage-2-external-links.jsonl
```
Or manually check each URL from stage-1-links.json with:
- 10-second timeout
- Rate limiting (100ms between same domain)
- Cache results
- Flag: 404, 403, 5xx, timeouts, redirects
Output: ${AUDIT_DIR}/stage-2-external-links.jsonl
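For the manual path, the timeout and rate-limit rules above could be sketched like this (assumes Node 18+ with global `fetch`; `checkUrl` and `shouldFlag` are hypothetical helpers, and caching is left out):

```javascript
// Sketch: HEAD-check a URL with a 10 s timeout and a 100 ms gap between
// requests to the same domain.
const lastHit = new Map(); // domain -> timestamp of last request

function shouldFlag(status) {
  // Flag 404, 403, 5xx, and redirects per the rules above.
  return status === 404 || status === 403 || status >= 500 ||
         (status >= 300 && status < 400);
}

async function checkUrl(url) {
  const domain = new URL(url).hostname;
  const wait = 100 - (Date.now() - (lastHit.get(domain) ?? 0));
  if (wait > 0) await new Promise((r) => setTimeout(r, wait));
  lastHit.set(domain, Date.now());
  try {
    const res = await fetch(url, {
      method: "HEAD",
      redirect: "manual",
      signal: AbortSignal.timeout(10_000),
    });
    return { url, status: res.status, flagged: shouldFlag(res.status) };
  } catch (err) {
    // Timeouts and network errors are always findings.
    return { url, status: 0, flagged: true, error: String(err) };
  }
}
```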
### Agent 2C: Cross-Reference Validator
Task: Verify references to project artifacts
Check documentation references:
1. ROADMAP item references (P1.2, Phase 3, etc.) - do they exist?
2. PR/Issue references (#123) - format valid?
3. SESSION_CONTEXT references - files mentioned exist?
4. Skill/hook path references - paths valid?
Output: ${AUDIT_DIR}/stage-2-cross-refs.jsonl
### Agent 2D: Orphan & Connectivity
Task: Find disconnected documents
From stage-1-links.json, identify:
1. Docs with zero inbound links (orphans)
2. Docs with only broken outbound links
3. Isolated clusters (group of docs only linking to each other)
Exclude from orphan detection:
- README.md (entry point)
- Root-level canonical docs
- Archive docs
Output: ${AUDIT_DIR}/stage-2-orphans.jsonl
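Inbound-link counting over stage-1-links.json might look like the following sketch (`findOrphans` is hypothetical; link targets are assumed pre-normalized to repo-relative paths, and the exclusion list is illustrative, not exhaustive):

```javascript
// Sketch: a doc is an orphan if no internal link targets it and it is not
// on the exclusion list above.
function findOrphans(allDocs, internalLinks) {
  const excluded = (doc) =>
    doc === "README.md" || doc.startsWith("docs/archive/");
  const inbound = new Set(internalLinks.map((l) => l.target));
  return allDocs.filter((doc) => !inbound.has(doc) && !excluded(doc));
}
```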
### Stage 2 Completion Audit

Before proceeding to Stage 3, verify:

- All 4 JSONL files exist
- Run schema validation: `node scripts/debt/validate-schema.js ${AUDIT_DIR}/stage-2-*.jsonl`
- Display summary: "Stage 2 Complete: X link issues found"
## Stage 3: Content Quality (4 Parallel Agents)

Launch these 4 agents in parallel:
### Agent 3A: Accuracy Checker

Task: Verify content matches codebase

```bash
# Use the content-accuracy script
node scripts/check-content-accuracy.js --output ${AUDIT_DIR}/stage-3-accuracy.jsonl
```
Checks:
- Version numbers match package.json
- File paths mentioned exist
- npm script references valid
- Code snippet syntax (basic validation)
Output: ${AUDIT_DIR}/stage-3-accuracy.jsonl
### Agent 3B: Completeness Checker
Task: Check for missing/incomplete content
For each document, check:
1. Required sections present per tier:
- Tier 1: Purpose, Version History
- Tier 2: Purpose, Version History, AI Instructions
- Tier 3+: Purpose, Status, Version History
2. No TODO/TBD/FIXME placeholders
3. No empty sections (heading with no content)
4. No stub documents (< 100 words, excluding code blocks)
Output: ${AUDIT_DIR}/stage-3-completeness.jsonl
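Stub detection (check 4) can be sketched by stripping fenced code blocks before counting words (`countProseWords` and `isStub` are hypothetical helpers):

```javascript
// Sketch: word count excluding fenced code blocks, for the < 100-word
// stub-document rule above.
function countProseWords(markdown) {
  const withoutCode = markdown.replace(/```[\s\S]*?```/g, "");
  return withoutCode.split(/\s+/).filter(Boolean).length;
}

function isStub(markdown) {
  return countProseWords(markdown) < 100;
}
```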
### Agent 3C: Coherence Checker
Task: Check terminology and duplication
Analyze across all documents:
1. Terminology inconsistency:
- "skill" vs "command" vs "slash command"
- "agent" vs "subagent" vs "worker"
- Collect all term usages, flag inconsistencies
2. Duplicate content:
- Exact match: identical content blocks (>50 words)
- Fuzzy match: 80%+ similarity (same topic, minor rewording)
3. Contradictory information (conflicting guidance for same task)
Output: ${AUDIT_DIR}/stage-3-coherence.jsonl
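One way to approximate the "80%+ similarity" fuzzy-match rule is Jaccard similarity over word shingles; the helpers below are hypothetical and a threshold of 0.8 stands in for the rule above:

```javascript
// Sketch: fuzzy duplicate detection via Jaccard similarity of 3-word shingles.
function shingles(text, n = 3) {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  const set = new Set();
  for (let i = 0; i <= words.length - n; i++) {
    set.add(words.slice(i, i + n).join(" "));
  }
  return set;
}

function similarity(a, b) {
  const sa = shingles(a), sb = shingles(b);
  if (sa.size === 0 && sb.size === 0) return 1;
  let inter = 0;
  for (const s of sa) if (sb.has(s)) inter++;
  // |A ∩ B| / |A ∪ B|
  return inter / (sa.size + sb.size - inter);
}
```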
### Agent 3D: Freshness Checker

Task: Check for stale content

```bash
# Use the placement/staleness script
npm run docs:placement -- --output ${AUDIT_DIR}/stage-3-freshness.jsonl
```
Tier-specific staleness thresholds:
- Tier 1 (Canonical): >60 days
- Tier 2 (Foundation): >90 days
- Tier 3+ (Planning/Reference/Guides): >120 days
Additional checks:
- Outdated version references
- Deprecated terminology still used
Output: ${AUDIT_DIR}/stage-3-freshness.jsonl
### Stage 3 Completion Audit
Before proceeding to Stage 4, verify:
- All 4 JSONL files exist
- Schema validation passes
- Display summary: "Stage 3 Complete: X content quality issues"
## Stage 4: Format & Structure (3 Parallel Agents)

Launch these 3 agents in parallel:
### Agent 4A: Markdown Lint

Task: Run markdownlint on all docs

```bash
# Note: docs:lint should lint all markdown locations:
# "*.md" "docs/**/*.md" ".claude/**/*.md"
npm run docs:lint > ${AUDIT_DIR}/markdownlint-raw.txt 2>&1

# Parse output into JSONL findings
# Each markdownlint violation becomes a finding
```
Convert violations to JSONL format in ${AUDIT_DIR}/stage-4-markdownlint.jsonl
### Agent 4B: Prettier Compliance

Task: Check Prettier formatting

```bash
npm run format:check -- docs/ > ${AUDIT_DIR}/prettier-raw.txt 2>&1

# Parse output for files that need formatting
```
Convert violations to JSONL format in ${AUDIT_DIR}/stage-4-prettier.jsonl
### Agent 4C: Structure Standards
Task: Check document structure conventions
For each document, verify:
1. Frontmatter present and valid (for skill docs)
2. Required headers per tier
3. Version history format (table with Version|Date|Description)
4. Table formatting consistency (aligned pipes)
5. Code block language tags (all fenced code blocks specify a language)
6. Heading uniqueness (no duplicate headings in same doc)
Output: ${AUDIT_DIR}/stage-4-structure.jsonl
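Check 5 (language tags on fenced blocks) can be sketched as a line scan that tracks fence state (`findUntaggedFences` is a hypothetical helper; it ignores tilde fences and indentation):

```javascript
// Sketch: return 1-based line numbers of opening fences with no language tag.
function findUntaggedFences(markdown) {
  const findings = [];
  let inFence = false;
  markdown.split("\n").forEach((line, i) => {
    const m = line.match(/^```(\S*)/);
    if (!m) return;
    // Only an *opening* fence needs a tag; a bare ``` may close a block.
    if (!inFence && m[1] === "") findings.push(i + 1);
    inFence = !inFence;
  });
  return findings;
}
```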
### Stage 4 Completion Audit
Before proceeding to Stage 5, verify:
- All 3 JSONL files exist
- Schema validation passes
- Display summary: "Stage 4 Complete: X format issues"
## Stage 5: Placement & Lifecycle (4 Parallel Agents)
Launch Agents 5A, 5B, 5C in parallel, then 5D sequentially after 5B completes:
### Agent 5A: Location Validator
Task: Check documents in correct directories
Verify placement rules:
- Plans → docs/plans/ or .planning/
- Archives → docs/archive/
- Templates → docs/templates/
- Audits → docs/audits/
- Tier 1 → root level
- Tier 2 → docs/ or root
Output: ${AUDIT_DIR}/stage-5-location.jsonl
### Agent 5B: Archive Candidate Finder (Surface-Level)
Task: Quick scan for archive candidates
Surface-level detection:
1. Completed plans not archived (status: completed)
2. Session handoffs > 30 days old
3. Old audit results (> 60 days, likely in MASTER_DEBT.jsonl already)
4. Plans not referenced in current ROADMAP.md
Output: ${AUDIT_DIR}/stage-5-archive-candidates-raw.jsonl
### Agent 5C: Cleanup Candidate Finder
Task: Find files that should be deleted/merged
Identify:
1. Exact duplicate files (same content hash)
2. Near-empty files (< 50 words)
3. Draft files > 60 days old
4. Temp/test files (names starting with temp, test, scratch)
5. Merge candidates (fragmented docs on same topic)
Output: ${AUDIT_DIR}/stage-5-cleanup-candidates.jsonl
### Agent 5D: Deep Lifecycle Analysis (Runs After 5B)

**Sequential dependency:** Read 5B output first
Task: Detailed analysis of archive candidates
For each candidate from stage-5-archive-candidates-raw.jsonl:
1. Read the actual document content
2. Determine original purpose
3. Assess current status:
- Purpose met? (completed successfully)
- Overtaken? (superseded by other work)
- Deprecated? (no longer relevant)
4. Check if content was consumed:
- Audit findings → in MASTER_DEBT.jsonl?
- Plan outcomes → documented elsewhere?
5. Provide recommendation with justification
Output: ${AUDIT_DIR}/stage-5-lifecycle-analysis.jsonl
Extended schema:

```json
{
  ...standard fields...,
  "purpose": "Original intent of the document",
  "status_reason": "Why marked for archive",
  "consumed_by": "Where content lives now (if applicable)",
  "recommendation": "ARCHIVE|DELETE|KEEP|MERGE_INTO:<target>"
}
```
### Stage 5 Completion Audit
Before proceeding to Stage 6, verify:
- All 4 JSONL files exist (5A, 5B raw, 5C, 5D analysis)
- Schema validation passes
- Display summary: "Stage 5 Complete: X lifecycle issues, Y archive candidates"
## Stage 6: Synthesis & Prioritization (Sequential)
This stage runs sequentially after all parallel stages complete.
### Step 6.1: Merge All Findings

```bash
# Combine all stage outputs
cat ${AUDIT_DIR}/stage-2-*.jsonl \
  ${AUDIT_DIR}/stage-3-*.jsonl \
  ${AUDIT_DIR}/stage-4-*.jsonl \
  ${AUDIT_DIR}/stage-5-location.jsonl \
  ${AUDIT_DIR}/stage-5-archive-candidates-raw.jsonl \
  ${AUDIT_DIR}/stage-5-cleanup-candidates.jsonl \
  ${AUDIT_DIR}/stage-5-lifecycle-analysis.jsonl \
  > ${AUDIT_DIR}/all-findings-raw.jsonl
```
### Step 6.2: Deduplicate

Input: `${AUDIT_DIR}/all-findings-raw.jsonl`
Output: `${AUDIT_DIR}/all-findings-deduped.jsonl`
Remove duplicates where same file:line appears from multiple agents.
Keep the finding with:
1. Higher severity
2. Higher confidence
3. More evidence items
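The keep-rules above can be sketched as a comparator keyed on the `files` field (`better` and `dedupe` are hypothetical helpers; a real pass might also compare fingerprints):

```javascript
// Sketch: when two findings share the same file:line, keep the one with
// higher severity, then higher confidence, then more evidence items.
const sevRank = { S0: 3, S1: 2, S2: 1, S3: 0 };

function better(a, b) {
  if (sevRank[a.severity] !== sevRank[b.severity])
    return sevRank[a.severity] > sevRank[b.severity] ? a : b;
  if (a.confidence !== b.confidence)
    return a.confidence > b.confidence ? a : b;
  return (a.evidence?.length ?? 0) >= (b.evidence?.length ?? 0) ? a : b;
}

function dedupe(findings) {
  const byKey = new Map();
  for (const f of findings) {
    const key = f.files.join("|");
    byKey.set(key, byKey.has(key) ? better(byKey.get(key), f) : f);
  }
  return [...byKey.values()];
}
```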
### Step 6.3: Cross-Reference FALSE_POSITIVES.jsonl

Input: `${AUDIT_DIR}/all-findings-deduped.jsonl`
Output: `${AUDIT_DIR}/all-findings.jsonl` (final file for TDMS intake)
Filter out findings matching patterns in docs/audits/FALSE_POSITIVES.jsonl:
- Match by file pattern
- Match by title pattern
- Check expiration dates
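A sketch of the filter; the FALSE_POSITIVES.jsonl entry shape (`filePattern`, `titlePattern`, `expires`) is an assumption about that file's schema, not its documented format:

```javascript
// Sketch: a finding is suppressed if any non-expired entry matches it on
// both its file pattern and its title pattern (missing patterns match all).
function isFalsePositive(finding, entries, now = new Date()) {
  return entries.some((e) => {
    if (e.expires && new Date(e.expires) < now) return false; // expired entry
    const fileHit = !e.filePattern ||
      finding.files.some((f) => f.includes(e.filePattern));
    const titleHit = !e.titlePattern || finding.title.includes(e.titlePattern);
    return fileHit && titleHit;
  });
}
```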
### Step 6.4: Priority Scoring

For each finding, calculate priority:

`priority = (severityWeight × categoryMultiplier × confidenceWeight) / effortWeight`
Where:
- severityWeight: S0=100, S1=50, S2=20, S3=5
- categoryMultiplier: links=1.5, accuracy=1.3, freshness=1.0, format=0.8
- confidenceWeight: HIGH=1.0, MEDIUM=0.7, LOW=0.4
- effortWeight: E0=1, E1=2, E2=4, E3=8
Sort findings by priority descending.
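The scoring above can be sketched directly. Two assumptions: findings carry an `auditCategory` field naming the multiplier bucket (unlisted categories default to 1.0), and the numeric `confidence` field maps to bands at ≥80 HIGH / ≥50 MEDIUM / else LOW:

```javascript
// Sketch: priority scoring with the weight tables stated above.
const severityWeight = { S0: 100, S1: 50, S2: 20, S3: 5 };
const categoryMultiplier = { links: 1.5, accuracy: 1.3, freshness: 1.0, format: 0.8 };
const confidenceWeight = { HIGH: 1.0, MEDIUM: 0.7, LOW: 0.4 };
const effortWeight = { E0: 1, E1: 2, E2: 4, E3: 8 };

function band(confidence) {
  // Numeric-to-band mapping is an assumption, not part of the formula above.
  return confidence >= 80 ? "HIGH" : confidence >= 50 ? "MEDIUM" : "LOW";
}

function priority(f) {
  return (
    (severityWeight[f.severity] *
      (categoryMultiplier[f.auditCategory] ?? 1.0) *
      confidenceWeight[band(f.confidence)]) /
    effortWeight[f.effort]
  );
}
```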
### Step 6.5: Generate Action Plan
Create three queues:
1. IMMEDIATE FIXES (S0/S1, E0/E1):
- List with specific file:line and fix command
2. ARCHIVE QUEUE:
- node scripts/archive-doc.js commands for each candidate
3. DELETE/MERGE QUEUE:
- Justification for each deletion
- Merge target for consolidations
### Step 6.6: Generate Final Report

Output: `${AUDIT_DIR}/FINAL_REPORT.md`
# Documentation Audit Report - [DATE]
## Executive Summary
- **Total findings:** X
- **By severity:** S0: X, S1: X, S2: X, S3: X
- **By category:** Links: X, Content: X, Format: X, Lifecycle: X
- **False positives filtered:** X
## Baseline Comparison
| Metric | Before | After Fixes |
| -------------------- | ------ | ----------- |
| docs:check errors | X | - |
| docs:sync issues | X | - |
| Orphaned docs | X | - |
| Stale docs (>90 day) | X | - |
## Top 20 Priority Items
| # | Severity | File | Issue | Effort |
| --- | -------- | ---- | ----- | ------ |
| 1 | S1 | ... | ... | E0 |
## Stage-by-Stage Breakdown
### Stage 2: Link Validation
- Internal link errors: X
- External link errors: X
- Orphaned documents: X
### Stage 3: Content Quality
- Accuracy issues: X
- Completeness issues: X
- Coherence issues: X
- Freshness issues: X
### Stage 4: Format & Structure
- Markdownlint violations: X
- Prettier violations: X
- Structure issues: X
### Stage 5: Lifecycle
- Location issues: X
- Archive candidates: X
- Cleanup candidates: X
## Action Plan
### Immediate Fixes (Do Now)
1. `file.md:line` - Fix description
### Archive Queue
```bash
node scripts/archive-doc.js "path/to/doc.md"
```
### Cleanup Queue

- DELETE: `path/to/temp-file.md` (reason)
- MERGE: `fragmented.md` → `main-doc.md`

## Recommendations

- ...
- ...
## Post-Audit Actions

### 1. Save Outputs

Verify all files saved to `${AUDIT_DIR}/`:
- stage-1-*.md, stage-1-links.json
- stage-2-*.jsonl
- stage-3-*.jsonl
- stage-4-*.jsonl
- stage-5-*.jsonl
- all-findings.jsonl (merged, deduplicated)
- FINAL_REPORT.md
### 2. TDMS Integration

```bash
node scripts/debt/intake-audit.js ${AUDIT_DIR}/all-findings.jsonl --source "audit-documentation-$(date +%Y-%m-%d)"
```
### 3. Update AUDIT_TRACKER.md
Add entry to "Documentation Audits" table:
| Date | Session | Commits | Files | Findings | Confidence | Validation |
|---|---|---|---|---|---|---|
| [today] | [#] | [X] | [Y] | [summary] | HIGH | PASSED |
### 4. Reset Threshold
Single-session audits reset the documentation category threshold.
### 5. Offer Fixes
Ask user: "Would you like me to fix any immediate items now?"
## Category Mapping for TDMS
| Stage | Category ID Prefix | TDMS Category |
|---|---|---|
| 2 - Links | DOC-LINK-* | documentation |
| 3 - Content | DOC-CONTENT-* | documentation |
| 4 - Format | DOC-FORMAT-* | documentation |
| 5 - Lifecycle | DOC-LIFECYCLE-* | documentation |
## Recovery Procedures

### If Stage Fails
- Missing output file: Re-run specific agent with explicit file write
- Empty output file: Check agent for errors, re-run with verbose
- Schema validation fails: Parse errors line-by-line, fix malformed
- Context compaction: Verify AUDIT_DIR path, re-run from last checkpoint
### If Context Compacts Mid-Audit
Read the partial outputs already saved to ${AUDIT_DIR}/ and resume from the
last completed stage.
## Multi-AI Escalation
After 3 single-session documentation audits, a full multi-AI Documentation Audit is recommended. Track in AUDIT_TRACKER.md "Single audits completed" counter.
## Update Dependencies
When modifying this skill, also update:
| Document | Section |
|---|---|
| `docs/templates/MULTI_AI_DOCUMENTATION_AUDIT_TEMPLATE.md` | Sync category list |
| `docs/SLASH_COMMANDS_REFERENCE.md` | /audit-documentation reference |
## Version History
| Version | Date | Description |
|---|---|---|
| 2.0 | 2026-02-02 | Complete rewrite: 6-stage parallel audit with 18 agents |
| 1.0 | 2025-xx-xx | Original single-session sequential audit |
## Documentation References
Before running this audit, review:
### TDMS Integration (Required)
- PROCEDURE.md - Full TDMS workflow
- MASTER_DEBT.jsonl - Canonical debt store
- Intake command: `node scripts/debt/intake-audit.js <output.jsonl> --source "audit-documentation-<date>"`
### Documentation Standards (Critical for This Audit)
- JSONL_SCHEMA_STANDARD.md - Output format requirements and TDMS field mapping
- DOCUMENTATION_STANDARDS.md - The canonical guide this audit validates against (5-tier hierarchy, metadata requirements, quality protocols)