name: review-writing
description: "Review Writing: 学术综述逐节写作方法论 (section-by-section methodology for writing academic reviews)"
Review Writing — 学术综述逐节写作方法论
Use this skill when the user asks to write a literature review, review article, or 综述 based on an outline. Trigger keywords: "写综述", "write review", "综述写作", "按大纲写", "逐节写", "review section", "写第N节". This skill orchestrates the ENTIRE review writing process from outline to finished manuscript.
This skill calls the academic-literature-search skill for all search and citation operations. Read that skill first if it is not already loaded.
Tool routing: PubMed operations → MCP tools (pubmed_search_articles, pubmed_fetch_contents, pubmed_article_connections). arXiv search, GB/T 7714 formatting, citation processing → Python code. See academic-literature-search for the complete routing table and code templates.
Architecture: Why Section-by-Section
A full review (12,000–15,000 words, 100–130 references) CANNOT be written in one pass due to context window limits. The correct approach:
Outline
→ [Phase 0: Validate & Revise outline]
→ [Phase 1..N: Per-section pipeline]
→ [Final: Assemble full review]
Each section is an independent unit of work:
Search → Filter → Group → Write → Cite → Save to file
↓
section_N.md (persisted immediately)
Final assembly reads all section files → cross-section dedup → unified numbering
Phase 0: Outline Validation & Revision (大纲验证与修订)
DO NOT skip this phase. No outline is perfect before reading the literature.
Step 0.1: Read the outline
Read the user's outline file. Parse each section's title, sub-topics, and any pre-identified references.
Step 0.2: Scout search (侦察检索)
For each section, run ONE quick search using the section title/topic as query:
- MCP pubmed_search_articles (maxResults=10, fetchBriefSummaries=5)
- For CS/AI-heavy sections: also Python search_arxiv() (max_results=5)
- For sections with known seed papers: MCP pubmed_article_connections (similar, maxRelatedResults=5)
Step 0.3: Evaluate and suggest revisions
Based on scout results, produce a validation report:
## 大纲验证报告
### §1 [section title]
- 检索命中: PubMed X篇, arXiv Y篇
- 代表性论文:
- "Paper Title A" (Journal, Year) — [关系:直接相关]
- "Paper Title B" (Journal, Year) — [关系:方法论参考]
- "Paper Title C" (Journal, Year) — [关系:最新进展]
- 评估: ✅ 文献充足 / ⚠️ 偏少建议扩展 / 🔴 极少建议合并或调整
- 建议: [specific suggestion if any]
### §2 ...
### 整体建议
- 建议新增: [topic] — 检索发现大量文献但大纲未覆盖
- 建议合并: §X 和 §Y 文献高度重叠
- 建议拆分: §Z 文献过于丰富,建议拆为两节
Step 0.4: User confirms revised outline
Wait for user to confirm or further adjust. Only proceed to Phase 1 after outline is finalized.
Pre-Writing: Thesis Reference Ingestion (正文引用复用)
If the user's thesis body already has references (like the 论文正文——第一二部分合并.md), BEFORE starting Phase 1:
- Extract all references from the thesis body (PMIDs, DOIs, author-year citations)
- Fetch their full metadata via MCP pubmed_fetch_contents
- Store as a seed reference pool: when the same paper appears in review search results, reuse this metadata exactly (ensures consistency between thesis body and review)
- When writing review sections, if a thesis-body reference is relevant, cite it directly from the seed pool without re-searching
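The ingestion steps above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the regular expressions, the helper names (`extract_identifiers`, `build_seed_pool`, `lookup_seed`) and the record fields `pmid`, `doi`, `title` are assumptions, and author-year citations are not handled here.

```python
import re

# Assumed patterns: PMIDs written as "PMID: 12345678", DOIs starting with "10.".
PMID_RE = re.compile(r"PMID[:：]?\s*(\d{1,8})", re.IGNORECASE)
DOI_RE = re.compile(r"\b(10\.\d{4,9}/[^\s\"<>\]）)。，；、]+)")


def extract_identifiers(thesis_text):
    """Return the sorted unique PMIDs and lowercased DOIs found in the text."""
    pmids = sorted(set(PMID_RE.findall(thesis_text)))
    dois = sorted({d.rstrip(".,;").lower() for d in DOI_RE.findall(thesis_text)})
    return pmids, dois


def build_seed_pool(records):
    """Index fetched metadata records under 'doi:<x>' and 'pmid:<x>' keys."""
    pool = {}
    for rec in records:
        if rec.get("doi"):
            pool["doi:" + rec["doi"].lower()] = rec
        if rec.get("pmid"):
            pool["pmid:" + str(rec["pmid"])] = rec
    return pool


def lookup_seed(pool, paper):
    """Return the seed record matching a search hit, or None if the paper is new."""
    if paper.get("doi"):
        hit = pool.get("doi:" + paper["doi"].lower())
        if hit:
            return hit
    if paper.get("pmid"):
        return pool.get("pmid:" + str(paper["pmid"]))
    return None
```

When `lookup_seed` returns a record, that record (not the fresh search hit) is the one to cite, which keeps the thesis body and the review consistent.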
Phase 1–N: Per-Section Writing (逐节写作)
For each section, execute the full pipeline below. One section per conversation turn.
Step 1: Deep Search (深度检索)
Generate 3-5 targeted search queries based on the section's sub-topics. Then:
For biomedical-heavy sections (§1, §2, §5, §7):
1. MCP pubmed_search_articles (maxResults=15, dateRange minDate="2020", fetchBriefSummaries=10)
2. MCP pubmed_search_articles (queryTerm="X AND biorxiv[journal]", maxResults=5) for preprints (预印本)
For CS/AI-heavy sections (§3, §4, §6):
1. MCP pubmed_search_articles (maxResults=10)
2. Python search_arxiv(query, max_results=10) for ML/AI conference papers and preprints
For all sections:
3. Deduplicate across all queries: Python deduplicate()
4. For foundational papers in the outline: MCP pubmed_fetch_contents (pmids=[...])
5. For expanding from seed papers: MCP pubmed_article_connections (similar / citedin / references)
6. Check seed reference pool for any already-known relevant papers
Target candidates per section:
- Biomedical sections: 15–25 papers
- CS/AI sections: 20–30 papers (wider net because PubMed coverage is sparser)
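As a rough sketch of the merge step, the snippet below deduplicates hits from several queries and checks the merged count against the targets above. The real `deduplicate()` lives in the academic-literature-search skill; `deduplicate_candidates`, `paper_key`, `within_target` and the record fields used here are hypothetical stand-ins.

```python
import re

# Candidate targets per section type, taken from the list above.
TARGETS = {"biomedical": (15, 25), "cs_ai": (20, 30)}


def paper_key(paper):
    """Identity key: DOI first, then PMID, then arXiv ID, then normalized title."""
    if paper.get("doi"):
        return "doi:" + paper["doi"].strip().lower()
    if paper.get("pmid"):
        return "pmid:" + str(paper["pmid"]).strip()
    if paper.get("arxiv_id"):
        return "arxiv:" + paper["arxiv_id"].strip().lower()
    title = re.sub(r"[\W_]+", " ", paper.get("title", "").lower()).strip()
    return "title:" + title


def deduplicate_candidates(result_lists):
    """Merge hits from several queries, keeping the first record seen per key."""
    seen = {}
    for results in result_lists:
        for paper in results:
            seen.setdefault(paper_key(paper), paper)
    return list(seen.values())


def within_target(count, section_type):
    """True if the candidate count falls inside the target range."""
    low, high = TARGETS[section_type]
    return low <= count <= high
```

If `within_target` is false on the low side, generating one or two more queries or expanding from seed papers is the natural next move.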
Step 2: Filter & Score (筛选评估)
Present search results to the LLM with this instruction:
From the following N search results, select the most relevant papers
for writing a review section about [section topic].
SELECTION TARGETS:
- Biomedical sections: select 10-15 papers
- CS/AI sections: select 15-20 papers
SELECTION CRITERIA (in priority order):
1. RELEVANCE to the specific section topic
2. IMPACT: prefer high-impact journals and highly-cited works
3. RECENCY: prefer 2022-2026, but include seminal older papers
4. DIVERSITY: cover different sub-aspects, not just the most popular finding
5. BALANCE: include both supporting evidence and contrasting viewpoints
For each selected paper, assign ONE role:
- FOUNDATIONAL: establishes the field/concept
- EVIDENCE: provides key experimental/computational evidence
- METHOD: introduces an important method/tool
- COMPARISON: enables comparison between approaches
- GAP: identifies limitations or open problems
- MILESTONE: landmark paper (e.g., AlphaFold, GPT-4)
Output format:
[search_index] [ROLE] — one-sentence reason for inclusion
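The output format above is simple enough to validate in code before moving on. The sketch below is one possible parser, under the assumptions that search indices are 1-based and that a plain hyphen may appear in place of the long dash; `parse_selection` is a hypothetical helper name.

```python
import re

ROLES = {"FOUNDATIONAL", "EVIDENCE", "METHOD", "COMPARISON", "GAP", "MILESTONE"}
# \u2014 is the long dash used in the output format; "-" is accepted as a fallback.
LINE_RE = re.compile(r"^\[(\d+)\]\s*\[([A-Z]+)\]\s*(?:\u2014|-)\s*(.+)$")


def parse_selection(llm_output, candidate_count):
    """Return (selected, rejected_lines) parsed from the filter step output."""
    selected, rejected = [], []
    for raw in llm_output.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = LINE_RE.match(line)
        if not m:
            rejected.append(line)  # does not follow the required format
            continue
        index, role, reason = int(m.group(1)), m.group(2), m.group(3).strip()
        if role not in ROLES or not 1 <= index <= candidate_count:
            rejected.append(line)  # unknown role or index outside the candidate list
            continue
        selected.append({"search_index": index, "role": role, "reason": reason})
    return selected, rejected
```

Rejected lines are worth surfacing rather than silently dropping, since an out-of-range index usually means the model referred to a paper that was never in the candidate list.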
Step 3: Evidence Grouping (证据分组)
Organize selected papers into narrative groups that will drive paragraph structure:
Group A: "Historical development / Milestones"
Group B: "Current mainstream approaches / Consensus"
Group C: "Emerging advances / Recent breakthroughs"
Group D: "Methodological comparisons"
Group E: "Limitations, controversies, and open problems"
Not every section needs all groups. Choose the groups that fit the section's content.
Step 4: Write Section (撰写本节)
Write a review section about [section topic] using the provided literature.
WRITING RULES:
1. ACADEMIC CHINESE PROSE (学术中文). Scientific terms keep English abbreviations
on first mention: e.g., 检索增强生成(Retrieval-Augmented Generation, RAG).
Subsequent uses can use abbreviation directly.
2. NARRATIVE, NOT LIST. Write flowing paragraphs with logical transitions.
❌ "A研究了X[1]。B研究了Y[2]。C研究了Z[3]。"
✅ "多项研究从不同角度探讨了这一问题。A等[1]首先通过...揭示了...;
在此基础上,B等[2]进一步...;然而,C等[3]的研究指出..."
3. CITE BY INDEX. Reference papers using [N] where N is the paper's index in
the provided source list. Every factual claim MUST have at least one citation.
4. CRITICAL ANALYSIS. Don't just summarize — compare, contrast, evaluate.
Point out methodological differences, conflicting findings, remaining gaps.
❌ "取得了重要进展"、"具有广阔前景"
✅ "将检索精度从 70.1% 提升至 80.7%"、"覆盖了 338 个数据库"
5. SECTION STRUCTURE:
a. 开门点题(1-2句):本节综述什么主题,为什么重要
b. 发展脉络(2-3段):按时间或逻辑组织
c. 现状分析(1-2段):主流方法/共识/争议
d. 批判性评价(1段):现有工作的局限和不足
e. 收束引出(1-2句):指向下一节或研究空白
6. SECTION TRANSITION: The FIRST sentence of this section must logically connect
to the LAST sentence of the previous section. The LAST sentence must set up
the next section's topic. [Agent: verify this after writing.]
7. LENGTH: 1,500-2,500 Chinese characters per section.
8. NO FABRICATION. Only cite papers from the provided source list.
If a fact lacks source support, write "据报道" without citation rather than
fabricating one. NEVER invent PMIDs, DOIs, or author names.
9. JOURNAL NAMES: Use FULL journal names (Nature Medicine, not Nat Med).
Keep this consistent across all sections.
SOURCE LIST:
[paste filtered papers with index, title, authors, year, abstract]
Step 5: Post-Write Checks (写后检查)
After the LLM writes the section, perform these checks:
5a. Citation integration (code):
- process_citations() from academic-literature-search skill
- Expand multi-citations, remove phantoms, record actually-cited papers
5b. Section transition check (LLM):
- Read the last 2 sentences of the PREVIOUS section file
- Read the first 2 sentences of the current section
- Verify logical connection. If disconnected, suggest revision.
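The inputs for this check can be prepared in code before handing them to the LLM. The sketch below assumes the section file layout described in Step 6 and splits only on Chinese and ASCII sentence-ending marks (not on ".", so decimals such as 70.1% stay intact); the helper names are hypothetical.

```python
import re


def split_sentences(text):
    """Split prose into sentences, keeping the closing punctuation."""
    parts = re.findall(r"[^。！？!?]+[。！？!?]*", text)
    return [s.strip() for s in parts if s.strip()]


def section_body(section_file_text):
    """Drop heading lines, the local reference list, and the metadata comment."""
    body = section_file_text.split("---\n## 本节参考文献")[0]
    lines = [ln for ln in body.splitlines() if ln.strip() and not ln.startswith("#")]
    return "\n".join(lines)


def transition_context(previous_section_text, current_section_text, n=2):
    """Return the last n sentences of the previous section and the first n of the current one."""
    prev = split_sentences(section_body(previous_section_text))[-n:]
    curr = split_sentences(section_body(current_section_text))[:n]
    return {"previous_tail": prev, "current_head": curr}
```

The returned dictionary is what gets shown to the LLM, so the judgment about logical connection stays with the model while the extraction stays deterministic.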
5c. Citation density check (code):
import re

# Count paragraphs and citations
paragraphs = [p for p in section_text.split("\n\n") if p.strip()]
for i, p in enumerate(paragraphs):
    cite_count = len(re.findall(r"\[\d+\]", p))
    # Flag substantive paragraphs (more than 100 characters) with no citation
    if cite_count == 0 and len(p) > 100:
        print(f"WARNING: Paragraph {i+1} has no citations ({len(p)} chars)")
5d. Format reference list (code):
- format_gbt7714() for each cited paper
- Verify journal names are full names, not abbreviations
Step 6: Save to File (保存)
output_dir/
section_1_[short_name].md
section_2_[short_name].md
...
section_N_[short_name].md
_metadata.json
Each section file:
# [Section Number] [Section Title]
[Section text with [N] citations]
---
## 本节参考文献(临时编号)
[1] Author, et al. Title[J]. Journal, Year, Vol(Issue): Pages. DOI: xxx.
[2] ...
---
<!-- metadata
section_index: 1
cited_papers: [
{"local_index": 1, "pmid": "12345678", "doi": "10.1234/xxx", "title": "...", "source": "pubmed"},
{"local_index": 2, "pmid": "", "doi": "", "arxiv_id": "2210.03629", "title": "...", "source": "arxiv"},
...
]
search_queries: ["query1", "query2", ...]
candidate_count: 25
cited_count: 14
-->
_metadata.json tracks cross-section state:
{
  "outline_file": "/path/to/综述大纲.md",
  "output_dir": "/path/to/综述输出/",
  "sections_completed": [1, 2, 3],
  "sections_total": 8,
  "all_cited_papers": [
    {"pmid": "12345678", "doi": "...", "title": "...", "first_cited_in_section": 1},
    ...
  ],
  "total_unique_references": 45,
  "seed_reference_pool": [...],
  "last_updated": "2026-02-27T20:30:00"
}
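A minimal sketch of how `_metadata.json` might be updated after each section is saved. The function name `update_metadata` and the write-to-temp-then-rename step are assumptions added for illustration; the field names follow the example above.

```python
import json
import os
from datetime import datetime


def _key(paper):
    # Same priority as the assembly step: DOI, then PMID, then lowercased title.
    return paper.get("doi") or paper.get("pmid") or paper.get("title", "").lower()


def update_metadata(meta_path, section_index, cited_papers):
    """Record a finished section and merge its citations into _metadata.json."""
    if os.path.exists(meta_path):
        with open(meta_path, encoding="utf-8") as f:
            meta = json.load(f)
    else:
        meta = {"sections_completed": [], "all_cited_papers": []}

    done = set(meta.get("sections_completed", []))
    done.add(section_index)
    meta["sections_completed"] = sorted(done)

    papers = meta.setdefault("all_cited_papers", [])
    known = {_key(p) for p in papers}
    for p in cited_papers:
        if _key(p) in known:
            continue  # already recorded by an earlier section
        known.add(_key(p))
        papers.append({"pmid": p.get("pmid", ""), "doi": p.get("doi", ""),
                       "title": p.get("title", ""),
                       "first_cited_in_section": section_index})

    meta["total_unique_references"] = len(papers)
    meta["last_updated"] = datetime.now().isoformat(timespec="seconds")

    # Write to a temporary file first, then swap it in, so an interruption
    # cannot leave a half-written metadata file behind.
    tmp_path = meta_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(meta, f, ensure_ascii=False, indent=2)
    os.replace(tmp_path, meta_path)
    return meta
```

Note that `first_cited_in_section` here records the section that was written first, which may differ from document order if sections are written out of sequence.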
Final Phase: Assembly (全文组装)
Step F1: Read all section files
Step F2: Cross-section deduplication
Same paper cited in §2 and §5 → ONE reference number. Match by DOI > PMID > normalized title.
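One way to read the DOI > PMID > normalized title rule is as a lookup that tries each identifier in priority order and registers all of them once a paper is known. The sketch below illustrates that idea; `ReferenceRegistry` and `normalize_title` are hypothetical names, and the title normalization (lowercase, punctuation collapsed) is an assumption.

```python
import re


def normalize_title(title):
    """Lowercase and collapse punctuation and whitespace so near-identical titles match."""
    return re.sub(r"[\W_]+", " ", title.lower()).strip()


class ReferenceRegistry:
    """Assign one global number per paper, matching by DOI, then PMID, then title."""

    def __init__(self):
        self.refs = []     # papers in order of first citation
        self._index = {}   # identifier -> global reference number

    def _identifiers(self, paper):
        ids = []
        if paper.get("doi"):
            ids.append("doi:" + paper["doi"].strip().lower())
        if paper.get("pmid"):
            ids.append("pmid:" + str(paper["pmid"]).strip())
        title = normalize_title(paper.get("title", ""))
        if title:
            ids.append("title:" + title)
        return ids

    def number_for(self, paper):
        """Return the global number for a paper, registering it if unseen."""
        ids = self._identifiers(paper)
        for ident in ids:  # priority order: DOI > PMID > title
            if ident in self._index:
                number = self._index[ident]
                break
        else:
            self.refs.append(paper)
            number = len(self.refs)
        for ident in ids:
            self._index.setdefault(ident, number)
        return number
```

Indexing every identifier means a record that carries only a PMID in §5 still resolves to the same number as a record that carried both DOI and PMID in §2.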
Step F3: Unified sequential numbering
MUST be done by code, scanning sections in order:
import json
import re


def assemble_review(section_files, output_path):
    """Assemble all sections into final review with unified GB/T 7714 numbering."""
    global_refs = []       # papers in order of first citation
    paper_to_global = {}   # dedup key -> global reference number
    full_text_parts = []

    for sf in section_files:
        with open(sf, encoding="utf-8") as f:
            content = f.read()

        # Split text from the local reference list and metadata
        text_part = content.split("---\n## 本节参考文献")[0]

        # Load cited papers from metadata comment
        local_papers = []
        meta_match = re.search(r"<!-- metadata\n(.*?)\n-->", content, re.DOTALL)
        if meta_match:
            meta_text = meta_match.group(1)
            cp_match = re.search(r"cited_papers:\s*(?=\[)", meta_text)
            if cp_match:
                # raw_decode reads one complete JSON value, so a "]" inside a
                # title (e.g. "Title[J]") cannot truncate the list
                local_papers, _ = json.JSONDecoder().raw_decode(meta_text, cp_match.end())

        if not local_papers:
            full_text_parts.append(text_part)
            continue

        paper_by_local = {p["local_index"]: p for p in local_papers}

        def remap(m):
            paper = paper_by_local.get(int(m.group(1)))
            if not paper:
                return ""  # phantom citation with no metadata: drop it
            # Match priority: DOI > PMID > lowercased title
            key = (paper.get("doi") or paper.get("pmid") or
                   paper.get("title", "").lower())
            if key not in paper_to_global:
                global_refs.append(paper)
                paper_to_global[key] = len(global_refs)
            return f"[{paper_to_global[key]}]"

        remapped = re.sub(r"\[(\d+)\]", remap, text_part)
        full_text_parts.append(remapped)

    # Build final GB/T 7714 reference list
    from academic_literature_search import format_gbt7714  # conceptual import
    ref_lines = [format_gbt7714(p, i) for i, p in enumerate(global_refs, 1)]

    full_review = "\n\n".join(full_text_parts)
    full_review += "\n\n---\n\n# 参考文献\n\n" + "\n".join(ref_lines)

    with open(output_path, "w", encoding="utf-8") as f:
        f.write(full_review)

    return len(global_refs)
Step F4: Quality Check (质量自检)
Run automated checks and produce a report:
质量自检报告
─────────────────────────────────
总节数: N
总唯一参考文献: M
每节平均引用: M/N = X.X (目标: 12-18)
─────────────────────────────────
引用覆盖率: Y% 段落有至少1个引用
最长无引用段: Z 字符 (目标: <500)
─────────────────────────────────
来源分布:
PubMed 期刊论文 [J]: X篇 (XX%)
预印本 [Z/OL]: Y篇 (YY%)
会议论文 [C]: Z篇 (ZZ%)
─────────────────────────────────
年份分布:
2024-2026: X篇 (XX%)
2021-2023: Y篇 (YY%)
2020及以前: Z篇 (ZZ%)
─────────────────────────────────
跨节引用复用: X篇被多节引用
无引用的节: [列表, 应为空]
─────────────────────────────────
节间衔接:
§1→§2: ✅ / ⚠️ [具体问题]
§2→§3: ✅ / ⚠️
...
─────────────────────────────────
与论文正文引用一致性:
正文引用在综述中也出现: X/Y篇
建议补引的正文参考文献: [列表]
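Two of the report's numbers, citation coverage and the year distribution, are easy to compute directly. The sketch below is illustrative only: the function names are hypothetical, and it assumes each paper record carries a `year` field, which the metadata example in Step 6 does not show.

```python
import re
from collections import Counter

CITE_RE = re.compile(r"\[\d+\]")


def citation_coverage(section_texts):
    """Return (percent of paragraphs with at least one citation, longest uncited paragraph length)."""
    total = cited = longest_uncited = 0
    for text in section_texts:
        for para in (p.strip() for p in text.split("\n\n")):
            if not para or para.startswith("#"):
                continue  # skip blank chunks and heading lines
            total += 1
            if CITE_RE.search(para):
                cited += 1
            else:
                longest_uncited = max(longest_uncited, len(para))
    percent = round(100 * cited / total, 1) if total else 0.0
    return percent, longest_uncited


def year_distribution(papers):
    """Bucket papers into the three year ranges used in the report."""
    buckets = Counter()
    for p in papers:
        year = int(p["year"])  # "year" is an assumed field name
        if year >= 2024:
            buckets["2024-2026"] += 1
        elif year >= 2021:
            buckets["2021-2023"] += 1
        else:
            buckets["2020及以前"] += 1
    return dict(buckets)
```

The coverage function expects section body text with the local reference list already stripped; otherwise the reference entries themselves would count as cited paragraphs.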
Step F5: Write final files
output_dir/
综述_final.md ← 统一编号的完整综述
参考文献_final.md ← 独立的 GB/T 7714 参考文献列表
quality_report.md ← 质量自检报告
Interaction Protocol
Starting
User: "按大纲写综述" / "写综述"
- Read outline file
- Say: "我先做一轮侦察检索来验证大纲,然后给你修订建议。确认开始?"
- Execute Phase 0
- Present validation report (with sample paper titles)
- Wait for user to confirm
Per-section
- "大纲已确认。现在逐节写作,先从第1节开始?"
- Execute full pipeline for §1
- Show written section + local references + check results
- "第1节写完了。需要修改还是继续第2节?"
User commands (anytime)
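| User says | Agent does |
|---|---|
| "这节重写" | Re-run Steps 1-6 for the current section |
| "多找几篇关于X的文献" | Run additional searches and merge the results into the candidate pool |
| "这篇一定要引:PMID/DOI" | MCP fetch, then force-include the paper in the current section |
| "大纲要改" | Return to Phase 0 and re-validate only the affected sections |
| "跳过这节" | Mark the section as skipped and continue with the next one |
| "组装全文" | Jump to the Final Phase |
| "检查质量" | Run the Step F4 quality check on the completed sections |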
Writing Quality Standards (协和博士论文级别)
Language
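- Academic Chinese with rigorous wording; avoid colloquial phrasing
- On first mention, give the Chinese term followed by the English full name and abbreviation: 检索增强生成(Retrieval-Augmented Generation, RAG)
- Use the abbreviation directly afterwards
- Full journal names, consistent across the whole review: Nature Medicine, not Nat Med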
Narrative Structure (per section)
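- Opening (开门点题, 1-2 sentences): what the section reviews and why it matters
- Development (发展脉络, 2-3 paragraphs): organized chronologically or logically
- Current state (现状分析, 1-2 paragraphs): mainstream methods, consensus, controversies
- Critical evaluation (批判性评价, 1 paragraph): limitations and shortcomings
- Closing (收束引出, 1-2 sentences): lead into the next section or the research gap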
Citation Density
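- At least 2-3 citations in every substantive paragraph
- Key conclusions, data, and numbers must carry a citation
- More than 3 consecutive sentences without a citation: check whether a citation is missing
- Target: 12-18 papers per section (biomedical sections) / 15-20 papers (CS/AI sections)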
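The "more than 3 consecutive sentences without a citation" rule can be checked mechanically. The sketch below assumes citations are placed before the sentence-ending punctuation (for example "...揭示了X[1]。"); a citation written after the full stop would be attributed to the following sentence. `uncited_runs` is a hypothetical helper name.

```python
import re


def uncited_runs(paragraph, max_run=3):
    """Return runs of more than `max_run` consecutive sentences with no [N] citation."""
    parts = re.findall(r"[^。！？!?]+[。！？!?]*", paragraph)
    sentences = [s.strip() for s in parts if s.strip()]
    runs, current = [], []
    for sentence in sentences:
        if re.search(r"\[\d+\]", sentence):
            if len(current) > max_run:
                runs.append(current)
            current = []  # a cited sentence ends the current run
        else:
            current.append(sentence)
    if len(current) > max_run:
        runs.append(current)
    return runs
```

Each returned run is a list of the offending sentences, which makes it straightforward to show the user exactly where a citation may be missing.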
Forbidden Patterns
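- ❌ List-style (罗列式): "A研究了X[1]。B研究了Y[2]。C研究了Z[3]。"
- ✅ Narrative (叙事式): "多项研究从不同角度探讨了这一问题。A等[1]首先通过...揭示了..."
- ❌ Vague evaluation (空泛评价): "取得了重要进展"、"具有广阔前景"
- ✅ Specific evaluation (具体评价): "将检索精度从 70.1% 提升至 80.7%"、"覆盖了 338 个数据库"
- ❌ Abbreviated journal names (anywhere)
- ❌ Labeling an arXiv paper as [J] without confirming its publication status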
Conference Paper Handling (§3, §4 重要)
Many key papers in AI/ML sections (ReAct, CoT, Reflexion, etc.) are published at conferences, not journals.
- If the paper is published at a conference (ICLR, NeurIPS, ICML, ACL, etc.): use [C] format
- If still only on arXiv without conference acceptance: use [Z/OL] format
- Agent MUST check: does this arXiv paper have a published venue? If yes, use [C].
Common venues to check:
- ICLR, NeurIPS, ICML (machine learning)
- ACL, EMNLP, NAACL (NLP)
- SIGIR, CIKM (information retrieval)
- KDD, WWW (data mining / web)
- AAAI, IJCAI (general AI)
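The rules above reduce to a small decision function once the venue has been looked up. This is a sketch under assumptions: the record fields `venue` and `journal` and the function name `reference_type` are hypothetical, and simple token matching can misfire (a journal whose name contains "ACL" would be labeled [C]), so the result still needs a manual check.

```python
import re

# Lowercased acronyms of the venues listed above.
CONFERENCES = {"iclr", "neurips", "icml", "acl", "emnlp", "naacl",
               "sigir", "cikm", "kdd", "www", "aaai", "ijcai"}


def reference_type(paper):
    """Pick the document type marker for a paper record."""
    venue = paper.get("venue") or ""
    tokens = {t.lower() for t in re.findall(r"[A-Za-z]+", venue)}
    if tokens & CONFERENCES:
        return "[C]"      # confirmed conference publication
    if paper.get("source") == "arxiv" and not paper.get("journal"):
        return "[Z/OL]"   # arXiv only, no confirmed venue
    return "[J]"          # journal article
```

An empty or missing `venue` on an arXiv record should be treated as "not yet checked" rather than "unpublished", so the venue lookup has to happen before this function is trusted.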
Recovery Protocol
If conversation is interrupted:
- Check for _metadata.json in output directory
- If exists: read it, report which sections are done, offer to continue from next section
- If not: start fresh from Phase 0
Every section is saved to file immediately, so an interruption costs at most the section in progress.
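The recovery check can be sketched as below. `resume_state` is a hypothetical helper; it reads only the `sections_completed` and `sections_total` fields shown in the `_metadata.json` example and does not account for sections the user asked to skip.

```python
import json
import os


def resume_state(output_dir):
    """Return None when there is nothing to resume, else a summary of progress."""
    meta_path = os.path.join(output_dir, "_metadata.json")
    if not os.path.exists(meta_path):
        return None  # no saved state: start fresh from Phase 0
    with open(meta_path, encoding="utf-8") as f:
        meta = json.load(f)
    done = sorted(meta.get("sections_completed", []))
    total = meta.get("sections_total", 0)
    remaining = [i for i in range(1, total + 1) if i not in done]
    return {
        "completed": done,
        "next_section": remaining[0] if remaining else None,
        "ready_to_assemble": bool(total) and not remaining,
    }
```

A `None` result maps to "start from Phase 0", a non-empty `next_section` maps to the offer to continue, and `ready_to_assemble` signals that the Final Phase can begin.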