name: token-efficiency description: Reduce token waste by 40-60% through anti-sycophancy rules, tool-call budgets, one-pass coding, task profiles, and read-before-write enforcement. Inspired by drona23/claude-token-efficient.
Token Efficiency
Reduce output token waste and prevent iteration cycles that consume context.
Trigger
Use when:
- Sessions feel expensive or slow
- Output is verbose with filler text
- Claude is re-reading files or iterating unnecessarily
- Setting up a new project for token-efficient work
Anti-Sycophancy Rules
These patterns waste 30-60% of output tokens:
| Pattern | Example | Fix |
|---|---|---|
| Sycophantic opener | "Sure! Great question!" | Delete. Lead with answer. |
| Prompt restatement | "You're asking about X..." | Delete. Answer directly. |
| Closing fluff | "Let me know if you need anything!" | Delete. Stop after the answer. |
| Unsolicited suggestions | "You might also want to..." | Delete unless asked. |
| AI disclaimers | "As an AI model..." | Delete entirely. |
| Verbose preambles | "I'll help you with that..." | Delete. Start with the action. |
Tool-Call Budgets
Set explicit budgets by task complexity:
| Task Type | Tool-Call Budget | Wrap-Up At |
|---|---|---|
| Quick fix / lookup | 20 calls | 15 |
| Bug fix | 30 calls | 25 |
| Feature (small) | 50 calls | 40 |
| Feature (large) | 80 calls | 65 |
| Refactor | 50 calls | 40 |
| Exploration / research | 30 calls | 25 |
At the wrap-up threshold: commit progress, assess remaining work, decide whether to continue or start fresh.
One-Pass Coding Discipline
For simple-to-medium tasks:
- Read all relevant files including tests first
- Understand what tests assert before coding
- Write complete solution in one pass — not incrementally
- Run tests once — if pass, STOP immediately
- If fail: read the error, fix once, retest
- Never iterate more than twice on the same failure — rethink approach
- Never refactor, improve, or polish passing code
Task Profiles
Switch profiles based on what you're doing:
Coding Profile
- Return code first, explanation after (only if non-obvious)
- Simplest working solution, no over-engineering
- Read file before modifying — always
- No docstrings on unchanged code
- No error handling for impossible scenarios
- State bug, show fix, stop
Agent/Pipeline Profile
- Structured output only: JSON, bullets, tables
- No prose unless targeting a human reader
- Every output must be parseable without post-processing
- Execute task, do not narrate actions
- Never invent file paths, API endpoints, or function names
- If unknown: return null or "UNKNOWN", never guess
Analysis Profile
- Lead with finding, context and methodology after
- Tables and bullets over prose
- Numbers must include units
- Never fabricate data points
- Summary first (3 bullets max), caveats last
Read-Before-Write Enforcement
Hard rules:
- Never write a file you haven't read in this session
- Never re-read a file already read unless it was modified
- Read tests before coding — understand what passes before writing
- Read error output carefully before attempting a fix
ASCII-Only Output
Use ASCII characters only in all output:
--not—(em dash)"not""(smart quotes)'not''(curly apostrophes)- No emoji unless explicitly requested
- No Unicode decorators or special characters
This ensures clean copy-paste for code and compatibility with downstream systems.
Measuring Impact
Track these metrics to measure token savings:
- Output length: average words per response (target: 30-50% reduction)
- Tool calls per task: should stay within budget tier
- Re-read count: should be near zero
- Write-without-read count: should be zero
- Iteration cycles: tests should pass in 1-2 attempts, not 5+
Attribution
Token efficiency patterns adapted from drona23/claude-token-efficient (MIT).