AI Agent Systems Unknown Unknowns Checklist

AI agent systems are the densest source of unknown unknowns: they are new enough that most developers have not yet encountered these failure modes.

Prompt Injection & Security

Direct prompt injection — users craft inputs that override system prompts
Indirect prompt injection — malicious content in documents/websites the agent reads
Tool abuse — agent tricked into calling tools with malicious parameters
Privilege escalation — agent accessing resources beyond intended scope
Data exfiltration via prompt injection — agent tricked into sending data to attacker
System prompt extraction — users can often get the agent to reveal its instructions
Jailbreaking — circumventing safety guidelines through creative prompting
Multi-step attacks — benign individual steps that combine into harmful actions
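
One common mitigation for tool abuse and data exfiltration is to check every tool call against a deny-by-default policy before executing it. The following is a minimal sketch, not a complete defense. The tool names, the allowed host, and the `ToolCall` shape are all hypothetical and stand in for whatever your agent framework provides.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical policy: these tool names and hosts are illustrative assumptions.
ALLOWED_TOOLS = {"search_docs", "fetch_url"}
ALLOWED_HOSTS = {"docs.example.com"}

@dataclass
class ToolCall:
    name: str
    args: dict

def check_tool_call(call: ToolCall) -> tuple[bool, str]:
    """Return (allowed, reason). Anything not explicitly allowed is denied."""
    if call.name not in ALLOWED_TOOLS:
        return False, f"tool {call.name!r} is not on the allowlist"
    if call.name == "fetch_url":
        # Restrict outbound requests so injected text cannot pick the destination.
        host = urlparse(str(call.args.get("url", ""))).hostname
        if host not in ALLOWED_HOSTS:
            return False, f"host {host!r} is not on the allowlist"
    return True, "ok"
```

Because the check runs in code outside the model, an injected instruction cannot talk its way past it; it can only request calls the policy already permits.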

Cost & Resource Control

Token costs can explode with agent loops (agent retrying or thinking in circles)
No per-user or per-request cost caps → one user can drain your budget
Long conversations accumulate context → each message costs more than the last
Tool calls multiply costs (each tool result typically triggers another model call that resends the conversation so far)
Retry loops without backoff or max attempts
Parallel agent execution without concurrency limits
Large document processing (PDFs, codebases) consuming massive token counts
No monitoring of per-user or per-session cost
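
Two of the items above, missing cost caps and unbounded retries, can be sketched in a few lines. This is an illustrative in-memory version under stated assumptions: `CostTracker`, `BudgetExceeded`, and `call_with_retry` are hypothetical names, the cost unit is arbitrary, and a real deployment would persist spend somewhere shared.

```python
import random
import time

class BudgetExceeded(Exception):
    pass

class CostTracker:
    """Tracks spend per user against a hard cap (unit is arbitrary, e.g. USD)."""
    def __init__(self, cap_per_user: float):
        self.cap = cap_per_user
        self.spent: dict[str, float] = {}

    def charge(self, user_id: str, cost: float) -> None:
        total = self.spent.get(user_id, 0.0) + cost
        if total > self.cap:
            raise BudgetExceeded(f"{user_id} would exceed cap {self.cap}")
        self.spent[user_id] = total

def call_with_retry(fn, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on failure retry with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # give up instead of looping forever
            sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))
```

Calling `charge` before each model request turns a runaway loop into a `BudgetExceeded` error for one user rather than a drained budget for everyone.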

Reliability & Failure Modes

LLMs are non-deterministic — same input can produce different outputs
Hallucinated tool calls — agent invents tools or parameters that don't exist
Hallucinated data — agent confidently presents fabricated information
Cascading hallucinations — one agent's hallucination feeds into another's input
Infinite loops — agent gets stuck in a retry or reasoning cycle
Partial failures — agent completes some steps but fails silently on others
Context window overflow — conversation exceeds model limits, drops information
Model degradation — API provider changes model behavior without notice
Rate limiting from API providers during high usage
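
Hallucinated tool calls and infinite loops can both be contained in the loop that drives the agent. The sketch below assumes a hypothetical `next_action` callable that stands in for the model and returns either `{"tool": ..., "args": ...}` or `{"final": ...}`; real frameworks use different shapes.

```python
from typing import Callable

def run_agent(next_action: Callable[[list], dict], tools: dict, max_steps: int = 5):
    """Drive a hypothetical agent loop with a hard cap on steps."""
    history = []
    for _ in range(max_steps):
        action = next_action(history)
        if "final" in action:
            return action["final"], history
        name = action.get("tool")
        if name not in tools:
            # Hallucinated tool: record the error so the model can correct itself.
            history.append({"error": f"unknown tool {name!r}"})
            continue
        try:
            result = tools[name](**action.get("args", {}))
            history.append({"tool": name, "result": result})
        except TypeError as exc:
            # Typically raised when the model invents or omits parameters.
            history.append({"error": f"bad arguments for {name!r}: {exc}"})
    raise RuntimeError(f"agent did not finish within {max_steps} steps")
```

Raising once the step cap is reached makes a stuck agent a visible failure instead of a silent one.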

Multi-Agent Specific

No coordination protocol — agents working at cross purposes
Deadlocks — agents waiting on each other
Race conditions — multiple agents modifying the same resource
Message ordering not guaranteed in async agent communication
No shared state management — agents have inconsistent views of the world
Error propagation — one agent's failure cascades through the system
No supervisor/orchestrator — no way to detect when the system is stuck
Trust boundaries — should Agent A trust Agent B's output?
Observability — hard to debug what happened across multiple agents
Testing — traditional unit tests don't cover non-deterministic behavior
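
For the race-condition and shared-state items, one option among several is optimistic concurrency: each write must name the version it was based on, and stale writes are rejected. The `SharedState` class below is a hypothetical in-memory sketch; a production system would more likely rely on a database or coordination service that offers compare-and-set.

```python
import threading

class VersionConflict(Exception):
    pass

class SharedState:
    """In-memory key-value store with per-key versions (compare-and-set)."""
    def __init__(self):
        self._lock = threading.Lock()
        self._data: dict[str, tuple[int, object]] = {}

    def read(self, key: str) -> tuple[int, object]:
        """Return (version, value); unseen keys are version 0 with value None."""
        with self._lock:
            return self._data.get(key, (0, None))

    def write(self, key: str, expected_version: int, value: object) -> int:
        """Write only if nobody else wrote since expected_version was read."""
        with self._lock:
            current, _ = self._data.get(key, (0, None))
            if current != expected_version:
                raise VersionConflict(
                    f"{key}: expected v{expected_version}, found v{current}"
                )
            self._data[key] = (current + 1, value)
            return current + 1
```

An agent that receives `VersionConflict` has to re-read and decide again, so two agents cannot silently overwrite each other's work.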

Data & Privacy

User data sent to LLM APIs — check provider's data retention policies
PII in prompts — sensitive data exposed to third-party AI providers
Conversation logs containing sensitive information
Agent memory/context containing data from other users (multi-tenant leaks)
Training data inclusion — some providers may train on your data
GDPR right to erasure (Article 17) covers personal data in AI conversation logs and agent memory
Vector database containing embeddings of sensitive documents
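
One way to reduce PII exposure is to redact obvious identifiers before a prompt leaves your system. The sketch below is deliberately narrow and is an assumption about what you might want to mask: it handles only email addresses and US-style Social Security numbers, and real PII detection needs much broader coverage than two regular expressions.

```python
import re

# Deliberately simple patterns chosen for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def redact(text: str) -> str:
    """Replace matches with placeholders before sending text to a provider."""
    text = EMAIL.sub("[EMAIL]", text)
    return US_SSN.sub("[SSN]", text)

print(redact("Contact jane.doe@example.com, SSN 123-45-6789."))
# prints: Contact [EMAIL], SSN [SSN].
```

The same function can be applied to conversation logs and to documents before they are embedded into a vector database.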

UX & Trust

Users don't understand what the AI can and can't do
No confidence indicators — users can't tell when the AI is guessing
Users over-trust AI output (especially in domains like medical, legal, financial)
No human-in-the-loop for high-stakes decisions
Missing audit trail — can't explain why the AI did something
Inconsistent behavior confuses users (different answers to same question)
Latency expectations — AI operations are slower than traditional CRUD
Error messages from AI failures are often cryptic to users
No feedback mechanism — users can't report bad AI behavior
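
The human-in-the-loop and audit-trail items can be combined in a single gate placed in front of action execution. In this sketch the action names in `HIGH_STAKES` and the `run` and `approve` callables are hypothetical placeholders for your own executor and review interface.

```python
import json
import time

HIGH_STAKES = {"send_payment", "delete_account"}  # hypothetical action names

def execute(action: str, args: dict, run, approve, audit_log: list) -> str:
    """Run an action, asking a human first when it is high-stakes.

    run(action, args) performs the action; approve(action, args) returns a bool.
    Every decision is appended to audit_log as one JSON line.
    """
    needs_approval = action in HIGH_STAKES
    approved = approve(action, args) if needs_approval else True
    status = "rejected"
    if approved:
        try:
            run(action, args)
            status = "executed"
        except Exception as exc:
            status = f"failed: {exc}"
    audit_log.append(json.dumps({
        "ts": time.time(), "action": action, "args": args,
        "needs_approval": needs_approval, "status": status,
    }))
    return status
```

Since rejected and failed actions are logged as well as executed ones, the log can later answer why the system did or did not do something.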

Evaluation & Quality

No evaluation framework — no way to measure if the AI is getting better or worse
No regression testing for prompt changes
A/B testing is complex with non-deterministic outputs
Ground truth is expensive to establish for open-ended tasks
Prompt drift — system prompt accumulates changes that interact unpredictably
Model version changes break carefully tuned prompts
No monitoring of output quality over time
Bias in AI outputs not measured or mitigated
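
A small regression harness is often enough to catch prompt changes that make things worse. The sketch below assumes a hypothetical `generate(prompt)` function wrapping your model call and a handful of golden cases with programmatic checks; the cases and the tolerance value are illustrative, not recommendations.

```python
from typing import Callable

# Hypothetical golden cases: (input, check), where check inspects the output.
CASES = [
    ("What is 2 + 2?", lambda out: "4" in out),
    ("Reply with the word OK.", lambda out: out.strip().upper() == "OK"),
]

def pass_rate(generate: Callable[[str], str], cases=CASES, samples: int = 3) -> float:
    """Fraction of (case, sample) pairs whose output passes its check.

    Each case is sampled several times because outputs are non-deterministic.
    """
    results = [check(generate(prompt)) for prompt, check in cases for _ in range(samples)]
    return sum(results) / len(results)

def assert_no_regression(candidate: float, baseline: float, tolerance: float = 0.05) -> None:
    """Fail when the candidate pass rate drops more than tolerance below baseline."""
    if candidate < baseline - tolerance:
        raise AssertionError(f"pass rate fell from {baseline:.2f} to {candidate:.2f}")
```

Comparing pass rates rather than exact strings lets the test tolerate harmless variation while still flagging a real drop in quality.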

Architecture Patterns

Putting too much logic in prompts instead of code (prompts are fragile; code is deterministic and testable)
Not separating AI decision-making from action execution
Missing fallback for when AI is unavailable or degraded
No caching of common AI responses (caching repeated queries can cut both cost and latency substantially)
Streaming responses not implemented (users staring at blank screen)
No graceful degradation — if AI fails, entire feature fails
Tight coupling to specific AI provider — no abstraction layer
Missing structured output validation — AI returns unexpected formats
No content filtering on AI outputs before showing to users
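
Structured output validation and graceful degradation fit together naturally: validate what the model returns, and fall back to a safe default when it does not match. The schema and fallback values below are hypothetical, and the sketch uses only the standard library; a schema library would serve the same purpose.

```python
import json

REQUIRED = {"intent": str, "confidence": float}  # hypothetical schema
FALLBACK = {"intent": "unknown", "confidence": 0.0}

def parse_model_output(raw: str) -> dict:
    """Parse and validate model JSON; return a copy of FALLBACK on any mismatch."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return dict(FALLBACK)
    if not isinstance(data, dict):
        return dict(FALLBACK)
    for field, kind in REQUIRED.items():
        value = data.get(field)
        if kind is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)  # accept JSON integers such as 1 for 1.0
        if not isinstance(value, kind):
            return dict(FALLBACK)
        data[field] = value
    # Drop any extra keys the model added.
    return {k: data[k] for k in REQUIRED}
```

Downstream code then always receives the same shape, whether the model produced valid JSON, prose wrapped around JSON, or nothing usable at all.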
