LLM Knowledge Bases
Use this guide when Codex should operate a Vault managed by the LLM Knowledge Bases MCP server.
Core model:
- raw/ stores captured source material
- wiki/ stores the persistent knowledge layer
- wiki/sources/ stores compiled source pages
- wiki/outputs/ stores archived answers
- wiki/concepts/, wiki/entities/, and wiki/syntheses/ store durable derived pages
- wiki/_indexes/, wiki/index.md, and wiki/log.md keep the vault navigable
- .llm-kb/representations/ stores runtime-managed OCR, vision, metadata, and profiling artifacts for non-text assets
Boundaries:
- treat the Vault as runtime-managed
- use MCP tools for all Vault reads and writes
- never modify raw/
- never write directly into wiki/ or .llm-kb/representations/ with generic file tools
- never invent IDs, hashes, representation paths, or source_refs
- use kb_read_raw only for text-readable raw files
- use the representation-first path for PDFs and images
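The write boundaries above can be expressed as a small guard that an agent wrapper might run before any generic file write. This is only a sketch: the function name `is_protected_path` and the idea of checking vault-relative paths are assumptions for illustration, not part of the MCP server.

```python
from pathlib import PurePosixPath

# Areas the guide says must never be modified with generic file tools.
PROTECTED_PREFIXES = ("raw", "wiki", ".llm-kb/representations")


def is_protected_path(relative_path: str) -> bool:
    """Return True when a vault-relative path falls under a protected area.

    Hypothetical helper: it compares whole path components, so "rawdata/x"
    is not mistaken for "raw/". It does not resolve ".." segments.
    """
    parts = PurePosixPath(relative_path).parts
    for prefix in PROTECTED_PREFIXES:
        prefix_parts = PurePosixPath(prefix).parts
        if parts[:len(prefix_parts)] == prefix_parts:
            return True
    return False
```

A wrapper could refuse the write and fall back to the matching MCP tool whenever this returns True.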
Required MCP tools:
- kb_status
- kb_list_raw
- kb_read_raw
- kb_get_raw_asset
- kb_prepare_source
- kb_prepare_source_bundle
- kb_prepare_representation
- kb_upsert_representation
- kb_read_representations
- kb_upsert_source_note
- kb_prepare_output
- kb_upsert_output
- kb_prepare_derived_note
- kb_upsert_derived_note
- kb_search
- kb_read_notes
- kb_map_gaps
- kb_promote_gap
- kb_repair_source_ids
- kb_rebuild_indexes
- kb_lint
Canonical actions:
- ingest-source
  - kb_status
  - kb_list_raw(changed_only=true)
  - for text/data: kb_prepare_source -> kb_read_raw
  - for PDFs/images: kb_prepare_source_bundle -> kb_get_raw_asset -> representation tools -> kb_read_representations
  - write one grounded source page
  - kb_upsert_source_note
  - kb_rebuild_indexes
- ask-and-file
  - kb_search
  - kb_read_notes
  - answer only from retrieved notes
  - use kb_upsert_output for query-specific archives
  - use kb_upsert_derived_note for durable concept/entity/synthesis pages
- maintain-wiki
  - kb_lint
  - inspect indexes and relevant pages
  - use kb_repair_source_ids first when source ids, manifest entries, source note paths, or raw hashes have drifted
  - repair narrowly through runtime tools
  - kb_rebuild_indexes
- map-gaps
  - kb_search
  - kb_read_notes
  - kb_map_gaps
  - optionally kb_promote_gap when the best current candidate should be landed immediately
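As an illustration of the ingest-source action, the sketch below builds the ordered tool sequence for one raw file. The `raw_kind` labels and the function name are assumptions made for this example, and the "representation tools" step is left as a placeholder because the guide does not spell out which representation calls apply.

```python
# Hypothetical kind labels; the real runtime may classify raw files differently.
TEXT_KINDS = {"text", "data"}
ASSET_KINDS = {"pdf", "image"}


def plan_ingest_source(raw_kind: str) -> list:
    """Return the ordered tool names for ingesting one raw file."""
    steps = ["kb_status", "kb_list_raw"]
    if raw_kind in TEXT_KINDS:
        # Text-readable files go through the direct read path.
        steps += ["kb_prepare_source", "kb_read_raw"]
    elif raw_kind in ASSET_KINDS:
        # PDFs and images use the representation-first path.
        steps += [
            "kb_prepare_source_bundle",
            "kb_get_raw_asset",
            "<representation tools>",  # placeholder, not a tool name
            "kb_read_representations",
        ]
    else:
        raise ValueError(f"unknown raw kind: {raw_kind!r}")
    # Both paths end by writing the source page and rebuilding indexes.
    steps += ["kb_upsert_source_note", "kb_rebuild_indexes"]
    return steps
```

Keeping the plan as data makes it easy to report which MCP tools were used at the end of a run.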
Natural-language triggers:
The explicit prefix $llm-knowledge-bases is optional.
Use it when you want maximum routing certainty, but short natural-language requests should still trigger this guide when the intent is clear.
- check my wiki, 总览检查 (overview check): run kb_status plus kb_lint, report counts, top issues, and the best next action; do not write unless asked
- fill missing source notes, 继续推进这份库 (keep advancing this vault): use kb_status plus kb_list_raw(changed_only=true) with an explicit limit, prioritize missing_source_note, compile in small batches, then rebuild indexes
- clean up these pages, fix placeholder titles, 做一次维护清理 (do a maintenance cleanup): start with kb_lint, then use kb_search plus kb_read_notes to target placeholder titles, open questions, related links, stale navigation, and other high-value health issues
- add concept/entity/synthesis pages, what pages are missing?: search first, read evidence, then run kb_map_gaps; use kb_promote_gap when the candidate can be landed directly
- repair source ids, repair manifest drift: run kb_repair_source_ids as a dry run first, explain the plan, then apply only if it looks correct
- answer this from the wiki and save it back: use ask-and-file; prefer concept/entity/synthesis over output when the result is reusable beyond the current query
Chinese routing hints:
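- 看一下 (take a look), 检查一下 (check), 盘一下 (take stock), 总览 (overview), 先看看 (look first): inspection-first, usually kb_status plus kb_lint
- 先别改 (don't change anything yet), 只看不改 (look only, no changes), 先给我报告 (give me a report first): read-only mode unless the user later asks for writes
- 补缺失 (fill what is missing), 补书评 (fill in book reviews), 编译缺的 source (compile the missing sources): ingest-source focused on missing_source_note
- 整理一下 (tidy up), 清理一下 (clean up), 修一下占位内容 (fix placeholder content): maintain-wiki with placeholder cleanup and topic filters
- 补概念页 (add concept pages), 补 entity (add entities), 补 synthesis (add syntheses), 沉淀成页面 (distill into pages): derived-page flow; always read evidence before writing
- 修漂移 (fix drift), 修 source id (fix source ids), 修 manifest (fix the manifest), 先 dry run (dry run first): kb_repair_source_ids dry run first
- 继续推进这份库 (keep advancing this vault), 接着跑一轮 (run another round): conservative continuation batch, not a giant rewrite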
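The routing hints can be pictured as a keyword table checked in priority order. The sketch below is a deliberately naive substring matcher, not how Codex actually routes requests; the labels `inspect`, `derived-page`, `repair-source-ids`, and `continue` are names invented here, and matching on the shortened hint 继续推进 is an assumption so that variants of the continuation phrase also match.

```python
# Phrases that switch the run to read-only, regardless of the action.
READ_ONLY_HINTS = ("先别改", "只看不改", "先给我报告")

# (label, hints) pairs checked in order; the first match wins.
ACTION_HINTS = [
    ("repair-source-ids", ("修漂移", "修 source id", "修 manifest",
                           "repair source ids", "repair manifest drift")),
    ("derived-page", ("补概念页", "补 entity", "补 synthesis", "沉淀成页面",
                      "what pages are missing")),
    ("ingest-source", ("补缺失", "补书评", "编译缺的 source",
                       "fill missing source notes")),
    ("maintain-wiki", ("整理一下", "清理一下", "修一下占位内容",
                       "clean up these pages", "fix placeholder titles")),
    ("continue", ("继续推进", "接着跑一轮")),
    ("inspect", ("看一下", "检查一下", "盘一下", "总览", "先看看",
                 "check my wiki")),
]


def route(request: str):
    """Return (label or None, read_only) for a natural-language request."""
    text = request.lower()
    read_only = any(hint in text for hint in READ_ONLY_HINTS)
    for label, hints in ACTION_HINTS:
        if any(hint in text for hint in hints):
            return label, read_only
    return None, read_only
```

Specific actions are listed before the generic inspection hints so that a request mentioning both resolves to the more specific flow.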
Continuation default when the request is underspecified:
- kb_status
- kb_list_raw(changed_only=true) with an explicit limit
- kb_lint
- optional topic-targeted kb_search plus kb_read_notes
- choose one primary batch: either a small missing_source_note compile batch or a small placeholder-repair batch
- optionally add one high-confidence derived page
- kb_rebuild_indexes
Scenario presets:
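- 整理一下 AI 相关内容 (tidy up the AI-related content), 只修 AI 相关 (fix only AI-related items), 补 AI 概念页 (add AI concept pages): run kb_lint, then topic-focused kb_search plus kb_read_notes; repair a small batch of placeholder-heavy notes first, then optionally land one high-confidence derived page
- 补书评 (fill in book reviews), 继续编译书评 (keep compiling book reviews), 书评批处理 (book-review batch): scope to raw/书评 1/ text raw files, prioritize missing_source_note, compile up to 10 notes, then rebuild indexes
- 继续推进我的这份库 (keep advancing my vault), 接着跑一轮 (run another round): choose one primary batch only: either up to 10 source-note compiles, up to 5 placeholder repairs, or 1 derived page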
When 继续推进这份库 (keep advancing this vault) is underspecified, prefer whichever of these batch types the kb_status and kb_lint results most clearly call for.
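One way to read "most clearly dominant" is to compare issue counts and cap the batch at the preset sizes (10 source-note compiles, 5 placeholder repairs, 1 derived page). The sketch below assumes the caller has already extracted two counts from kb_status and kb_lint; the parameter names and the "larger count wins" rule are assumptions, since the guide does not define the output fields of those tools.

```python
def choose_primary_batch(missing_source_notes: int, placeholder_issues: int):
    """Pick exactly one primary batch and its size for a continuation run.

    Assumed rule: the larger backlog wins, ties go to source-note compiles,
    and an empty backlog falls back to a single derived page.
    """
    if missing_source_notes == 0 and placeholder_issues == 0:
        return ("derived_page", 1)
    if missing_source_notes >= placeholder_issues:
        return ("source_note_compile", min(missing_source_notes, 10))
    return ("placeholder_repair", min(placeholder_issues, 5))
```

An explicit user scope such as a smaller limit should still override whatever size this returns.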
One-line shortcuts that should be treated as complete enough requests:
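- 用 $llm-knowledge-bases 检查一下我的 wiki,先别改 (use $llm-knowledge-bases to check my wiki, don't change anything yet)
- 用 $llm-knowledge-bases 补书评前 10 个 (use $llm-knowledge-bases to fill in the first 10 book reviews)
- 用 $llm-knowledge-bases 整理一下 AI 相关内容 (use $llm-knowledge-bases to tidy up the AI-related content)
- 用 $llm-knowledge-bases 补 3 个 concept pages (use $llm-knowledge-bases to add 3 concept pages)
- 用 $llm-knowledge-bases 修一下 source id 漂移,先 dry run (use $llm-knowledge-bases to fix source id drift, dry run first)
- 用 $llm-knowledge-bases 继续推进我的这份库 (use $llm-knowledge-bases to keep advancing my vault)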
The same one-line requests should also work without the $llm-knowledge-bases prefix when the surrounding context is clearly about this wiki.
Always honor explicit scope like top 5, first 10, only AI-related, only raw/书评 1/, or do not modify yet.
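Numeric scope such as top 5 or 前 10 个 can be pulled out of a request with a few regular expressions. This sketch covers only numeric limits, not topic or path scopes, and the patterns are an assumption based on the example phrases in this guide.

```python
import re

# Patterns for "top N" / "first N" and the Chinese forms "前 N 个" / "补 N 个".
_LIMIT_PATTERNS = [
    re.compile(r"\b(?:top|first)\s+(\d+)\b", re.IGNORECASE),
    re.compile(r"前\s*(\d+)\s*个"),
    re.compile(r"补\s*(\d+)\s*个"),
]


def explicit_limit(request: str):
    """Return the first explicit numeric limit in the request, or None."""
    for pattern in _LIMIT_PATTERNS:
        match = pattern.search(request)
        if match:
            return int(match.group(1))
    return None
```

When this returns a number, it could be passed as the explicit limit for kb_list_raw instead of a default batch size.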
Writing rules:
- source pages need Summary, Key Points, Evidence, Open Questions, Related Links
- multimodal source pages should keep raw_kind, mime_type, and asset_paths aligned with the reviewed asset trail
- PDF/image source pages should usually include Visual Notes when the review evidence is not already obvious from stored representations
- output pages need Answer, Sources Used, Follow-up Questions
- concept pages need Summary, Definition, Key Points, Evidence, Open Questions, Related Notes
- entity pages need Summary, Who or What, Key Facts, Evidence, Open Questions, Related Notes
- synthesis pages need Summary, Thesis, Supporting Evidence, Tensions, Open Questions, Related Notes
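The section requirements lend themselves to a pre-write check on a drafted page. The sketch below assumes each required section appears as a markdown heading line (for example "## Summary"); that layout and the helper name `missing_sections` are assumptions, since the guide lists the section names but not how the runtime expects them to be marked up.

```python
# Required section names per page type, taken from the writing rules.
REQUIRED_SECTIONS = {
    "source": ["Summary", "Key Points", "Evidence", "Open Questions",
               "Related Links"],
    "output": ["Answer", "Sources Used", "Follow-up Questions"],
    "concept": ["Summary", "Definition", "Key Points", "Evidence",
                "Open Questions", "Related Notes"],
    "entity": ["Summary", "Who or What", "Key Facts", "Evidence",
               "Open Questions", "Related Notes"],
    "synthesis": ["Summary", "Thesis", "Supporting Evidence", "Tensions",
                  "Open Questions", "Related Notes"],
}


def missing_sections(page_type: str, markdown_text: str) -> list:
    """List required sections that have no matching heading in the draft."""
    headings = set()
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Drop the leading '#' characters and surrounding whitespace.
            headings.add(stripped.lstrip("#").strip())
    return [name for name in REQUIRED_SECTIONS[page_type]
            if name not in headings]
```

An empty result means every required heading is present; it says nothing about whether the content under each heading is grounded in evidence.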
Finish by stating:
- what was ingested, answered, or maintained
- which MCP tools were used
- which pages were created or updated
- any unresolved ambiguity or weak evidence