name: API Pagination Debugging
description: "Systematic methodology for debugging pagination issues in API integrations, especially when switching between API versions or endpoints. Auto-activates when pagination stops early, returns duplicate results, or fails to iterate through complete datasets. Covers cursor-based vs page-based pagination, API response structure verification, and efficiency optimization. Trigger keywords: pagination bug, API not paginating, stuck at one page, cursor pagination, nextPageCursor, page-based pagination. (project)"
API Pagination Debugging
Purpose: Systematically diagnose and fix pagination failures that prevent complete data import from APIs
Core Principles
1. Verify API Response Structure Before Assuming
Never assume pagination fields based on documentation or other endpoints. Always test actual responses:
curl -s API_ENDPOINT | jq 'keys'
Different API versions or endpoints may use different pagination patterns even within the same service.
2. Match Pagination Logic to API Design
APIs use distinct pagination patterns that require different implementations:
- Cursor-based: {nextPageCursor, results} - use cursor param
- Page-based: {page, total_pages, results} - use page number param
- Offset-based: {offset, limit, total} - use offset/limit params
- Link-based: {next, previous, results} - follow next URL
Using the wrong pattern typically causes pagination to stop after the first page or to return the first page repeatedly.
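The four patterns can usually be told apart from the top-level keys of a single response. The sketch below is a heuristic under that assumption; the function name `detect_pagination_style` is hypothetical, and the key names are the ones listed in this section, so a real API may need different checks.

```python
def detect_pagination_style(response: dict) -> str:
    """Guess the pagination pattern from one response's top-level keys.

    Hypothetical helper: the key names mirror the patterns listed in this
    document and are assumptions about the API being debugged.
    """
    keys = set(response)
    if "nextPageCursor" in keys:
        return "cursor"
    if "next" in keys:
        return "link"
    if {"page", "total_pages"} <= keys:
        return "page"
    if {"offset", "limit"} <= keys:
        return "offset"
    return "unknown"


sample = {"count": 3, "nextPageCursor": None, "results": []}
print(detect_pagination_style(sample))
```

An "unknown" result is a cue to inspect nested objects (for example a `pagination` key) by hand rather than to guess.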
3. Optimize Page Size for Efficiency
Most APIs support configurable page sizes (e.g., 50-1000 items per page). Using maximum page_size:
- Reduces total API calls (20x fewer calls with 1000 vs 50)
- Decreases network overhead
- Minimizes rate limit exposure
- Speeds up bulk imports
4. Test Pagination Flow Before Implementation
Before implementing pagination logic:
- Fetch page 1 and inspect response structure
- Manually fetch page 2 to confirm field values
- Verify cursor/page advancement works correctly
- Check termination condition (null cursor, empty results, etc.)
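The checklist above can be scripted as a two-page probe. This is a minimal sketch that assumes cursor-based responses with `results` and `nextPageCursor` keys; `probe_two_pages` and the injected `fetch_page` callable are hypothetical names, not part of any library.

```python
from typing import Callable, Optional


def probe_two_pages(fetch_page: Callable[[Optional[str]], dict]) -> dict:
    """Fetch page 1 and page 2 and report whether pagination advances.

    fetch_page is a hypothetical callable that takes a cursor (None for
    the first page) and returns the decoded JSON response as a dict.
    """
    first = fetch_page(None)
    cursor = first.get("nextPageCursor")
    if not cursor:
        return {"advances": False, "reason": "page 1 returned no cursor"}
    second = fetch_page(cursor)
    same_results = first.get("results") == second.get("results")
    return {
        "advances": not same_results,
        "reason": "page 2 repeats page 1" if same_results else "page 2 differs",
        "next_cursor": second.get("nextPageCursor"),
    }


# Usage with canned responses standing in for a real API.
pages = {
    None: {"results": [1, 2], "nextPageCursor": "abc"},
    "abc": {"results": [3], "nextPageCursor": None},
}
print(probe_two_pages(lambda cursor: pages[cursor]))
```

Passing the fetch function in as an argument keeps the probe independent of any particular HTTP client or authentication scheme.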
Systematic Debugging Workflow
Step 1: Reproduce the Issue
Symptoms of pagination failure:
- Import stops after exactly 1 page
- Returns same results repeatedly
- Status shows "completed_all_pages" but dataset incomplete
- Missing data compared to known totals
Example:
Expected: 74,386 highlights
Actual: 463 highlights (< 1% of total)
Status: "completed_all_pages" after 1 page
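A quick coverage calculation makes this kind of gap visible regardless of the reported status. The sketch below uses the two numbers from the example; `import_coverage` is a hypothetical helper name.

```python
def import_coverage(imported: int, expected: int) -> float:
    """Return the fraction of the expected dataset that was imported."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return imported / expected


coverage = import_coverage(imported=463, expected=74_386)
print(f"{coverage:.2%}")  # 0.62%
```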
Step 2: Inspect Actual API Response
Don't trust assumptions - verify response structure:
# Fetch first page and check structure
curl -s -H "Authorization: Token $TOKEN" \
"https://api.example.com/endpoint?page_size=50" | jq 'keys'
# Expected output reveals actual fields:
# ["count", "nextPageCursor", "results"]
# NOT ["count", "next", "previous", "results"]
Critical checks:
- What pagination fields exist?
- What are field names exactly? (case-sensitive)
- Are there any cursor/token fields?
- How does the API signal "no more pages"?
Step 3: Compare Expected vs Actual Fields
Common mismatches:
| Expected (Wrong) | Actual (Correct) | Impact |
|---|---|---|
| next | nextPageCursor | Stops after page 1 |
| page parameter | pageCursor parameter | Repeats page 1 |
| Page number increment | Cursor advancement | Never progresses |
| has_more boolean | null cursor | Wrong termination check |
Step 4: Test Second Page Manually
Verify pagination actually works:
# Get page 1
PAGE1=$(curl -s -H "Authorization: Token $TOKEN" \
"https://api.example.com/endpoint?page_size=50")
# Extract cursor
CURSOR=$(echo "$PAGE1" | jq -r '.nextPageCursor')
# Get page 2 using cursor
curl -s -H "Authorization: Token $TOKEN" \
"https://api.example.com/endpoint?page_size=50&pageCursor=$CURSOR" \
| jq '{count, nextPageCursor, results_count: (.results | length)}'
Expected results:
- Different results array contents
- New nextPageCursor value (or null if last page)
- Progress toward completion
Step 5: Fix Pagination Logic
Update implementation to match API design:
For Cursor-Based Pagination
# Initialize
cursor = None
page_num = 0
while True:
    page_num += 1
    # Build params
    params = {"page_size": 1000}  # Use maximum
    if cursor:
        params["pageCursor"] = cursor  # Use correct param name
    # Fetch page
    response = fetch_api(endpoint, params)
    results = response.get("results", [])
    if not results:
        break  # Empty results = done
    # Process results
    for item in results:
        process(item)
    # Get next cursor
    next_cursor = response.get("nextPageCursor")  # Use correct field name
    if not next_cursor:
        break  # No more pages
    cursor = next_cursor  # Advance cursor
For Page-Based Pagination
# Initialize
page_num = 1
while True:
    # Build params
    params = {"page": page_num, "page_size": 1000}
    # Fetch page
    response = fetch_api(endpoint, params)
    results = response.get("results", [])
    if not results:
        break
    # Process results
    for item in results:
        process(item)
    # Check if more pages exist
    if not response.get("next"):  # Or check page_num < total_pages
        break
    page_num += 1  # Increment page number
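For page-based APIs that report `total_pages` rather than a `next` field, the termination check changes slightly. The following is a sketch under that assumption; `iter_numbered_pages` and the injected `fetch_page` callable are hypothetical names. When `total_pages` is absent it falls back to stopping on the first empty page instead of stopping early.

```python
from typing import Callable, Iterator


def iter_numbered_pages(fetch_page: Callable[[int], dict], start: int = 1) -> Iterator[list]:
    """Yield each page's results for a page-number API.

    fetch_page is a hypothetical callable that takes a page number and
    returns the decoded JSON response as a dict.
    """
    page_num = start
    while True:
        response = fetch_page(page_num)
        results = response.get("results", [])
        if not results:
            return  # Empty page: nothing left to import
        yield results
        total_pages = response.get("total_pages")
        if total_pages is not None and page_num >= total_pages:
            return  # Reached the last page the API reported
        page_num += 1


# Usage with canned responses standing in for a real API.
canned = {
    1: {"results": ["a", "b"], "total_pages": 2},
    2: {"results": ["c"], "total_pages": 2},
}
print(list(iter_numbered_pages(lambda n: canned[n])))
```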
Step 6: Verify Fix with Logging
Add debug logging to confirm pagination works:
logger.info(f"Page {page_num}: {len(results)} items, cursor={cursor}, next={next_cursor}")
Expected log output:
Page 1: 1000 items, cursor=None, next=55771679
Page 2: 1000 items, cursor=55771679, next=55114962
Page 3: 1000 items, cursor=55114962, next=54503291
...
Page 75: 386 items, cursor=12847563, next=None
Step 7: Optimize Page Size
Before optimization:
params = {"page_size": 50} # Small pages
# Result: 1,488 pages needed for 74,386 items
After optimization:
params = {"page_size": 1000} # Maximum supported
# Result: 75 pages needed for 74,386 items
# Improvement: 20x fewer API calls
Check API documentation for:
- Maximum page_size allowed
- Rate limits (larger pages = fewer calls)
- Response time vs page size tradeoffs
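The page counts quoted in this step come from ceiling division of the item total by the page size. A short check of that arithmetic (the helper name `pages_needed` is hypothetical):

```python
import math


def pages_needed(total_items: int, page_size: int) -> int:
    """Number of requests needed to fetch total_items at a given page size."""
    return math.ceil(total_items / page_size)


for size in (50, 1000):
    print(f"page_size={size}: {pages_needed(74_386, size)} requests")
# page_size=50: 1488 requests
# page_size=1000: 75 requests
```

The ratio is 1488 / 75, roughly 19.8, which is where the "20x fewer API calls" figure comes from.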
✅ REQUIRED Patterns
DO: Test actual API responses before implementing
Never rely on documentation alone. Always curl the endpoint and inspect response structure:
curl -s API_ENDPOINT | jq '.'
DO: Use maximum page_size supported by API
Default page sizes are often inefficient (50-100 items). Check API limits and use maximum:
# Efficient
params = {"page_size": 1000}
# Inefficient
params = {"page_size": 50} # 20x more API calls
DO: Match parameter names exactly
API field names are case-sensitive and specific:
# CORRECT
params["pageCursor"] = cursor
# WRONG (will not work)
params["page_cursor"] = cursor # Snake case instead of camelCase
params["cursor"] = cursor # Missing "page" prefix
DO: Add pagination logging for diagnosis
Always log pagination progress:
logger.info(f"Page {page_num}: {len(results)} items, next={next_cursor}")
DO: Verify termination conditions
Check both conditions to prevent infinite loops:
# Check empty results
if not results:
    break
# AND check next cursor/page
if not next_cursor:  # or not has_more, or page >= total_pages
    break
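Both checks can be combined with two extra guards, a repeated-cursor check and a page cap, so that a misbehaving API fails loudly instead of looping forever. This is a sketch assuming `results` and `nextPageCursor` response keys; `iter_cursor_pages`, `fetch_page`, and the `max_pages` default are hypothetical choices.

```python
from typing import Callable, Iterator, Optional


def iter_cursor_pages(
    fetch_page: Callable[[Optional[str]], dict],
    max_pages: int = 10_000,
) -> Iterator[list]:
    """Yield each page's results, stopping on any termination signal.

    fetch_page is a hypothetical callable that takes a cursor (None for
    the first page) and returns the decoded JSON response as a dict.
    """
    cursor: Optional[str] = None
    seen = set()
    for _ in range(max_pages):
        response = fetch_page(cursor)
        results = response.get("results", [])
        if not results:
            return  # Empty page: done
        yield results
        cursor = response.get("nextPageCursor")
        if not cursor:
            return  # No cursor: this was the last page
        if cursor in seen:
            raise RuntimeError(f"cursor {cursor!r} repeated; pagination is not advancing")
        seen.add(cursor)
    raise RuntimeError(f"gave up after {max_pages} pages")


# Usage with canned responses standing in for a real API.
fake = {
    None: {"results": [1, 2], "nextPageCursor": "abc"},
    "abc": {"results": [3], "nextPageCursor": None},
}
print(list(iter_cursor_pages(lambda cursor: fake[cursor])))
```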
❌ FORBIDDEN Patterns
DON'T: Assume pagination pattern from other endpoints
Different endpoints in the same API may use different pagination:
# WRONG: Assume v2 uses same pagination as v3
# v3 endpoint uses page numbers
# v2 endpoint uses cursors
DON'T: Check wrong field for continuation
# WRONG
if not data.get("next"):  # Field doesn't exist
    break
# RIGHT
if not data.get("nextPageCursor"):  # Actual field name
    break
DON'T: Use inefficient page sizes
# WRONG: Causes 20x more API calls
params = {"page_size": 50}
# RIGHT: Minimizes API calls
params = {"page_size": 1000}
DON'T: Increment page numbers for cursor-based APIs
# WRONG: Page number ignored for cursor-based pagination
page_num = 1
while True:
    params = {"page": page_num}  # Repeats page 1 forever
    response = fetch_api(endpoint, params)
    page_num += 1
# RIGHT: Use cursor advancement
cursor = None
while True:
    params = {"pageCursor": cursor} if cursor else {}
    response = fetch_api(endpoint, params)
    cursor = response.get("nextPageCursor")
    if not cursor:
        break  # No cursor returned: last page reached
DON'T: Skip manual testing before implementation
# WRONG: Implement without verifying
# Assume API uses page numbers, implement pagination
# Deploy and discover it uses cursors
# RIGHT: Test first
# curl endpoint | jq 'keys'
# Verify field names
# Test page 2 manually
# Then implement
Quick Decision Tree
Is pagination working?
NO - stops after 1 page:
- Check actual API response structure (curl + jq)
- Compare field names (case-sensitive)
- Verify parameter names match API expectations
- Test page 2 manually
NO - returns duplicates:
- Check if using page number instead of cursor
- Verify cursor is advancing
- Check if parameter name is correct
YES - but slow:
- Check page_size value
- Increase to maximum supported
- Balance with rate limits
Which pagination pattern to use?
API returns nextPageCursor field:
→ Use cursor-based pagination with pageCursor parameter
API returns next URL:
→ Follow link-based pagination (use next URL directly)
API returns page and total_pages:
→ Use page-based pagination with page parameter
API returns offset and total:
→ Use offset-based pagination with offset and limit parameters
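Link-based pagination is the one pattern in this decision tree without a loop elsewhere in this document. A minimal sketch, assuming each response carries `results` and a `next` URL that is null on the last page; `iter_linked_pages` and the injected `fetch_url` callable are hypothetical names.

```python
from typing import Callable, Iterator, Optional


def iter_linked_pages(first_url: str, fetch_url: Callable[[str], dict]) -> Iterator[list]:
    """Yield each page's results by following the response's next URL.

    fetch_url is a hypothetical callable that takes a full URL and
    returns the decoded JSON response as a dict.
    """
    url: Optional[str] = first_url
    while url:
        response = fetch_url(url)
        yield response.get("results", [])
        url = response.get("next")  # None or missing on the last page


# Usage with canned responses standing in for a real API.
site = {
    "https://api.example.com/items": {
        "results": [1, 2],
        "next": "https://api.example.com/items?page=2",
    },
    "https://api.example.com/items?page=2": {"results": [3], "next": None},
}
print(list(iter_linked_pages("https://api.example.com/items", lambda u: site[u])))
```

Following the URL as given, rather than rebuilding it from parameters, avoids the parameter-name mismatches described elsewhere in this document.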
Common Mistakes
Mistake 1: Checking Non-Existent Field
Problem:
if not data.get("next"):  # Field doesn't exist in response
    break
Solution:
# First, check actual response
curl API | jq 'keys'
# Output: ["count", "nextPageCursor", "results"]
# Then use correct field
if not data.get("nextPageCursor"):
    break
Mistake 2: Using Wrong Parameter Name
Problem:
params["page"] = page_num # API doesn't use page numbers
Solution:
# Cursor-based APIs require cursor parameter
params["pageCursor"] = cursor # Not "page"
Mistake 3: Small Page Size
Problem:
params = {"page_size": 50}
# 74,386 items ÷ 50 = 1,488 API calls
Solution:
params = {"page_size": 1000} # Use maximum
# 74,386 items ÷ 1000 = 75 API calls
# 20x improvement
Examples
Example 1: Readwise API Pagination Bug (January 2026)
Context:
- Readwise MCP server stuck importing 463 highlights instead of 74,386
- Status: "completed_all_pages" after 1 page
- Using v2 export API endpoint
❌ WRONG - Assumed page-based pagination
# Incorrect implementation
page_num = 1
while page_num < 1000:
params = {"page": page_num, "page_size": 50}
data = fetch_api("/export/", params, api_version="v2")
# Wrong field check
if not data.get("next"): # This field doesn't exist
break
page_num += 1 # Never executed because break on page 1
Problem: API uses cursor-based pagination, not page numbers. Field is nextPageCursor not next.
✅ RIGHT - Cursor-based pagination with correct fields
# Correct implementation
cursor = None
page_num = 0
while page_num < 1000:
    page_num += 1
    # Use cursor parameter
    params = {"page_size": 1000}  # Increased from 50
    if cursor:
        params["pageCursor"] = cursor  # Correct parameter name
    data = fetch_api("/export/", params, api_version="v2")
    results = data.get("results", [])
    if not results:
        break
    # Process results...
    # Use correct field name
    next_cursor = data.get("nextPageCursor")  # Not "next"
    if not next_cursor:
        break
    cursor = next_cursor  # Advance cursor
Result:
- Before: 1 page, 463 highlights (< 1%)
- After: 75 pages, 74,386 highlights (100%)
- Efficiency: 20x fewer API calls (1000 vs 50 page_size)
Example 2: Debugging Unknown API Pagination
Context:
- New API integration
- Documentation unclear about pagination
- Need to import complete dataset
Step-by-step debugging:
# Step 1: Test API response structure
curl -s -H "Authorization: Token $TOKEN" \
"https://api.example.com/data?limit=10" | jq 'keys'
# Output: ["data", "pagination"]
# Step 2: Inspect pagination object
curl -s -H "Authorization: Token $TOKEN" \
"https://api.example.com/data?limit=10" | jq '.pagination'
# Output:
# {
# "total": 5000,
# "offset": 0,
# "limit": 10,
# "has_more": true
# }
# Step 3: Test offset advancement
curl -s -H "Authorization: Token $TOKEN" \
"https://api.example.com/data?limit=10&offset=10" | jq '.pagination'
# Output:
# {
# "total": 5000,
# "offset": 10,
# "limit": 10,
# "has_more": true
# }
Implementation:
# Offset-based pagination identified
offset = 0
limit = 100 # Use larger limit
while True:
params = {"limit": limit, "offset": offset}
response = fetch_api("/data", params)
items = response.get("data", [])
if not items:
break
# Process items...
pagination = response.get("pagination", {})
if not pagination.get("has_more"):
break
offset += limit # Advance offset
When to Use This Skill
This skill auto-activates when:
- Pagination stops after exactly 1 page despite more data existing
- Import status shows "completed_all_pages" but dataset incomplete
- API integration returns duplicate results repeatedly
- Implementing pagination for new API endpoint
- User mentions "pagination bug", "stuck at one page", or "not paginating"
- Debugging issues with cursor-based, page-based, or offset-based pagination
- Converting between pagination patterns (e.g., page numbers to cursors)
- Optimizing API call efficiency with page_size tuning
Don't use when:
- Pagination works correctly (complete dataset imported)
- API returns proper error messages (different debugging needed)
- Rate limiting is the issue (needs rate limit handling, not pagination fixes)
- Authentication problems (verify auth before debugging pagination)
Integration
Related Skills:
- Python Filename Sanitization Fallback - Related Readwise MCP pattern from same project
- API Endpoint Metadata Verification - Systematic debugging for missing API metadata
Related Commands:
- /readwise-import - Primary user of this debugging methodology
Related Vault Documents:
- [[0 Projects/2026 Draft Articles/Readwise Highlights Import Draft]] - Documented implementation of highlights import with pagination
- [[Readwise MCP Server Implementation]] (if exists) - Technical documentation
Technical Context:
- MCP server: /Users/ngpestelos/src/readwise-mcp-server/server.py
- State file: .claude/state/readwise-import.json
- Readwise API docs: https://readwise.io/api_deets
Key Takeaway
API pagination failures usually stem from field name mismatches or wrong pagination pattern assumptions. Always verify actual API response structure with curl/jq before implementing pagination logic, use maximum page_size for efficiency, and test page 2 manually to confirm advancement works. The pattern is: inspect response → identify pagination type → match implementation → optimize page size → verify with logging.
Discovered January 30, 2026 during Readwise highlights backfill debugging.
The bug fix reduced the 74,386-highlight import from a theoretical 1,488 pages to an actual 75 pages.
The pattern applies to any cursor-based, page-based, or offset-based pagination implementation.