name: thread-management
description: Maintain email conversation context across messages using threading headers. Use when building thread reconstruction, linking replies to conversations, detecting thread hijacking, stripping quoted content, or providing thread context to AI agents.
license: MIT
Thread Management
Keep email conversations connected - link replies to their threads, reconstruct conversation history, and detect when threads go wrong.
When to use this skill
- Building or debugging email thread reconstruction from headers
- Replies are not grouping into conversations correctly
- Implementing In-Reply-To and References headers on outbound replies
- Stripping quoted reply content to extract new message text
- Providing conversation context to AI agents or LLMs
- Detecting thread hijacking or forged thread injection attacks
- Threads display differently in Gmail vs Outlook and you need cross-client consistency
- Managing long-lived threads that span days or weeks
- Splitting or merging conversations programmatically
Related skills
- inbound-processing - receiving and parsing incoming email (prerequisite for threading)
- reply-classification - classifying reply intent once you have thread context
- email-security - injection and phishing prevention, including thread hijacking
- transactional-email - sending emails that need proper threading headers
How email threading works
Email threading is built on three RFC 5322 headers. Every email client uses some combination of these to decide which messages belong to the same conversation.
The three threading headers
Message-ID - a globally unique identifier for each email message. Generated by the sending mail server or client.
Message-ID: <abc123.456@mail.example.com>
Format: <local-part@domain>. The local part is typically a UUID, timestamp, or random string. The domain should be the sending server's hostname. Always enclosed in angle brackets.
In-Reply-To - contains the Message-ID of the message being replied to. Only present on replies, not on original messages.
In-Reply-To: <abc123.456@mail.example.com>
References - contains the Message-IDs of all ancestors in the thread, oldest first. Grows with each reply.
References: <abc123.456@mail.example.com>
    <def789.012@reply.example.com>
    <ghi345.678@reply2.example.com>
How threading builds up
Original message:
Message-ID: <msg-001@sender.com>
Subject: Q3 proposal
First reply:
Message-ID: <msg-002@recipient.com>
In-Reply-To: <msg-001@sender.com>
References: <msg-001@sender.com>
Subject: Re: Q3 proposal
Reply to the reply:
Message-ID: <msg-003@sender.com>
In-Reply-To: <msg-002@recipient.com>
References: <msg-001@sender.com> <msg-002@recipient.com>
Subject: Re: Q3 proposal
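The References header records each message's chain of ancestors. By combining these chains across messages, any client can reconstruct the full thread tree, even when an intermediate message is missing.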
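As a rough illustration of how the header values above turn into a tree, the sketch below links each message to its parent. The `HeaderInfo` and `ThreadNode` types and the `buildThreadTree` name are assumptions made for this example, and it does not guard against duplicate Message-IDs or reference cycles.

```typescript
// Hypothetical shape: only the header fields needed for tree building.
type HeaderInfo = {
  messageId: string;      // e.g. "<msg-002@recipient.com>"
  inReplyTo?: string;     // parent Message-ID, if present
  references?: string[];  // ancestor Message-IDs, oldest first
};

type ThreadNode = { messageId: string; children: ThreadNode[] };

function buildThreadTree(messages: HeaderInfo[]): ThreadNode[] {
  const nodes = new Map<string, ThreadNode>();
  for (const m of messages) {
    nodes.set(m.messageId, { messageId: m.messageId, children: [] });
  }

  const roots: ThreadNode[] = [];
  for (const m of messages) {
    const node = nodes.get(m.messageId)!;
    // Prefer In-Reply-To; otherwise fall back to the nearest known ancestor in References.
    const candidates = [m.inReplyTo, ...(m.references ?? []).slice().reverse()];
    const parentId = candidates.find(
      (id: string | undefined): id is string =>
        id !== undefined && id !== m.messageId && nodes.has(id),
    );
    const parent = parentId !== undefined ? nodes.get(parentId) : undefined;
    if (parent) {
      parent.children.push(node);
    } else {
      roots.push(node); // no known ancestor: treat as a thread root
    }
  }
  return roots;
}
```

Falling back to References means a reply still attaches to the thread when its direct parent was never received or stored.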
Generating correct Message-IDs
A Message-ID must be globally unique. Bad Message-IDs break threading because clients cannot distinguish messages.
Good patterns:
<{uuid}@{your-sending-domain}>
<{timestamp}.{random}@{your-sending-domain}>
Bad patterns:
<1@localhost> # Not unique
<msg@example.com> # Reused across messages
<no-angle-brackets> # Missing required angle brackets
Use your actual sending domain as the right-hand side. Some providers generate Message-IDs for you - if you need to reference them later (for threading inbound replies), store the provider-assigned Message-ID at send time.
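A minimal sketch of the `<{timestamp}.{random}@{your-sending-domain}>` pattern follows. The function names are hypothetical, and `Math.random` is used only to keep the example dependency-free; a UUID from your platform's crypto API is a reasonable substitute for the random part.

```typescript
// Per-process counter so two IDs generated in the same millisecond still differ.
let sequence = 0;

function generateMessageId(sendingDomain: string): string {
  const timestamp = Date.now().toString(36);
  const random = Math.random().toString(36).slice(2, 12);
  sequence = (sequence + 1) % 1000000;
  return `<${timestamp}.${sequence}.${random}@${sendingDomain}>`;
}

// Loose shape check: angle brackets, exactly one "@", no whitespace.
// This is not the full RFC 5322 grammar and says nothing about uniqueness.
function isWellFormedMessageId(value: string): boolean {
  return /^<[^<>@\s]+@[^<>@\s]+>$/.test(value);
}
```

Note that `<1@localhost>` passes the shape check: uniqueness has to come from how the ID is generated, not from validation.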
Provider threading behavior
Gmail, Outlook, and other clients use different algorithms to group messages into conversations. If your emails thread correctly in one client but not another, this section explains why.
Gmail
Gmail uses a multi-factor algorithm combining headers and subject:
- References/In-Reply-To headers - primary signal. Gmail traverses the chain of Message-IDs.
- Subject line matching - secondary signal. Subjects must match after stripping Re:, Fwd:, and similar prefixes. If the subject changes, Gmail breaks the thread.
- Participants - minor signal. Influences grouping in ambiguous cases.
- Time proximity - prevents old unrelated emails with identical subjects from threading together.
Gmail-specific behaviors:
- Conversations max out at 100 messages. After that, a new thread starts automatically.
- Gmail assigns a threadId internally. You can query it via the Gmail API (threads.list, threads.get), but it is user-specific - the same conversation has different thread IDs for different participants.
- To add a message to an existing thread via the Gmail API, you must set the threadId on the message resource AND include correct References and In-Reply-To headers AND keep the subject matching.
Outlook / Microsoft 365
Outlook's conversation view groups primarily by subject line:
- Subject matching - the dominant factor. Outlook normalizes subjects aggressively (strips RE:, FW:, AW:, SV:, etc. across languages).
- References/In-Reply-To - used but weighted less heavily than in Gmail. Outlook may group messages with matching subjects even if References headers do not connect them.
- Time window - Outlook is more liberal with time gaps than Gmail.
Outlook-specific behaviors:
- Outlook may split a single thread into multiple conversations if replies arrive from different paths (e.g., forwarded chains).
- The "Conversation ID" (ConversationId in Microsoft Graph API) is assigned based on the subject hash and is shared across participants, unlike Gmail's per-user thread IDs.
- ConversationIndex is a binary header Outlook uses internally for tree ordering. Do not try to generate this yourself unless you are building Outlook integrations specifically.
Cross-client threading checklist
To thread correctly across both Gmail and Outlook:
- Always set In-Reply-To to the parent message's Message-ID
- Always set References to the parent's References + parent's Message-ID
- Keep the subject line consistent (prefix with Re: only, do not modify the base subject)
- Generate globally unique Message-IDs with your sending domain
- Store outbound Message-IDs so you can reference them when processing inbound replies
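The header rules in this checklist can be sketched as a small helper. `ParentMessage`, `ReplyHeaders`, and `buildReplyHeaders` are names invented for this example, and the helper assumes Message-IDs are stored with their angle brackets.

```typescript
type ParentMessage = {
  messageId: string;    // parent's Message-ID, including angle brackets
  references?: string;  // parent's raw References header, if any
  subject: string;
};

type ReplyHeaders = {
  "In-Reply-To": string;
  References: string;
  Subject: string;
};

function buildReplyHeaders(parent: ParentMessage): ReplyHeaders {
  // Parent's References, split on whitespace (handles folded headers too).
  const ancestors = (parent.references ?? "").split(/\s+/).filter(Boolean);
  // Drop the parent's own ID if it already appears, then append it last.
  const chain = ancestors.filter((id) => id !== parent.messageId);
  chain.push(parent.messageId);

  // Add "Re:" only once and leave the base subject untouched.
  const base = parent.subject.trim();
  const subject = /^re:/i.test(base) ? base : `Re: ${base}`;

  return {
    "In-Reply-To": parent.messageId,
    References: chain.join(" "),
    Subject: subject,
  };
}
```

Applied to the second message of the "Q3 proposal" example, this yields the same In-Reply-To and References values shown for the third message.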
Thread reconstruction
When you receive inbound email and need to rebuild the conversation, you have two linking strategies.
Strategy 1: Header-based linking (high confidence)
Match the inbound message's In-Reply-To header against stored outbound Message-IDs.
Inbound arrives with:
In-Reply-To: <send-abc123@mail.yourapp.com>
Look up in your database:
SELECT request_id FROM send_attempts
WHERE provider_message_id = 'send-abc123@mail.yourapp.com'
This gives you a direct, high-confidence link (confidence = 1.0) back to the specific outbound message being replied to. From there, you can reconstruct the full thread.
If In-Reply-To does not match, fall back to parsing the References header. It contains all ancestor Message-IDs, so any match connects you to the thread.
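The lookup order described above might look like the following sketch. The in-memory `Map` stands in for the database query, all type and function names are hypothetical, and keys are assumed to be stored with angle brackets; whichever form you store, normalize both sides the same way.

```typescript
type InboundHeaders = { inReplyTo?: string; references?: string };

type HeaderLink = {
  requestId: string;
  method: "in_reply_to" | "references";
  confidence: number;
};

// Extract every <...> token; tolerates folded headers and extra whitespace.
function parseMessageIds(header: string | undefined): string[] {
  return header?.match(/<[^<>\s]+>/g) ?? [];
}

function linkByHeaders(
  inbound: InboundHeaders,
  sentByMessageId: Map<string, string>, // Message-ID -> request_id
): HeaderLink | null {
  // First choice: the direct parent named in In-Reply-To.
  for (const id of parseMessageIds(inbound.inReplyTo)) {
    const requestId = sentByMessageId.get(id);
    if (requestId !== undefined) {
      return { requestId, method: "in_reply_to", confidence: 1.0 };
    }
  }
  // Fallback: walk References newest-first so the closest ancestor wins.
  for (const id of parseMessageIds(inbound.references).reverse()) {
    const requestId = sentByMessageId.get(id);
    if (requestId !== undefined) {
      return { requestId, method: "references", confidence: 1.0 };
    }
  }
  return null; // no header match: hand off to heuristic linking
}
```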
Strategy 2: Heuristic linking (lower confidence)
When headers do not match (common when emails are forwarded, or the recipient's client strips headers), fall back to heuristics:
- Sender + recipient + time window: match the inbound sender against recent outbound recipients within the last 7 days
- Subject similarity: strip Re:, Fwd: prefixes and compare normalized subjects
- Domain matching: ensure the inbound sender's domain matches the outbound recipient's domain
Heuristic linking should be flagged as lower confidence (e.g., 0.5) so downstream systems can treat it accordingly.
// Example: result of linking an inbound message to an outbound send
type LinkResult = {
  requestId: string;
  method: 'in_reply_to' | 'references' | 'recipient_recent' | 'subject_match';
  confidence: number; // 1.0 for header match, 0.3-0.7 for heuristics
};
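One way the heuristic fallback could be sketched is shown below. The record shapes, function names, and the specific confidence values (0.7 and 0.5) are illustrative assumptions within the range given above, not calibrated numbers. Matching on the full address implies the domain match as well.

```typescript
type OutboundRecord = { requestId: string; toEmail: string; subject: string; sentAt: Date };
type InboundMessage = { fromEmail: string; subject: string; receivedAt: Date };
type HeuristicLink = {
  requestId: string;
  method: "recipient_recent" | "subject_match";
  confidence: number;
};

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Strip leading reply/forward prefixes (Re:, Fwd:, FW:, AW:, SV:), then lowercase.
function normalizeSubject(subject: string): string {
  return subject.replace(/^(\s*(re|fwd?|aw|sv)\s*:\s*)+/i, "").trim().toLowerCase();
}

function linkByHeuristics(inbound: InboundMessage, sent: OutboundRecord[]): HeuristicLink | null {
  const sender = inbound.fromEmail.toLowerCase();
  // Sends to this address within the 7 days before the reply arrived, newest first.
  const recent = sent
    .filter((s) => s.toEmail.toLowerCase() === sender)
    .filter((s) => {
      const ageMs = inbound.receivedAt.getTime() - s.sentAt.getTime();
      return ageMs >= 0 && ageMs <= SEVEN_DAYS_MS;
    })
    .sort((a, b) => b.sentAt.getTime() - a.sentAt.getTime());

  const latest = recent[0];
  if (latest === undefined) return null;

  // A matching normalized subject is a stronger signal than recency alone.
  const subject = normalizeSubject(inbound.subject);
  const sameSubject = recent.find((s) => normalizeSubject(s.subject) === subject);
  if (sameSubject) {
    return { requestId: sameSubject.requestId, method: "subject_match", confidence: 0.7 };
  }
  return { requestId: latest.requestId, method: "recipient_recent", confidence: 0.5 };
}
```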
What to store per message
For reliable thread reconstruction, persist these fields for every inbound and outbound message:
| Field | Why |
|---|---|
| message_id | Your internal ID |
| provider_message_id | The Message-ID assigned by the provider/MTA |
| in_reply_to | The In-Reply-To header value |
| references_header | The full References header (space-separated Message-IDs) |
| thread_id | Your internal thread/conversation ID |
| from_email | Sender address |
| to_email | Recipient address |
| subject | For fallback matching |
| created_at | For time-window heuristics |
Quoted reply detection and stripping
When a reply arrives, the message body contains both the new content and quoted previous messages. Extracting just the new content is essential for AI agents, search indexing, and clean display.
Why this is hard
There is no standard for how email clients format quoted replies. Every client does it differently:
| Client | Plain text format | HTML format |
|---|---|---|
| Gmail | > prefix per line | <div class="gmail_quote"> wrapper |
| Outlook | > or full block | <div id="appendonsend"> or <!--[if gte mso 9]> |
| Apple Mail | > prefix | <blockquote type="cite"> |
| Thunderbird | > prefix | <blockquote> with cite attribute |
Plain text stripping
For plain text emails, detect quoted lines by these patterns:
Lines starting with ">"
Lines starting with "On <date>, <name> wrote:"
Lines starting with "From: " followed by header-like content
Lines matching "-----Original Message-----"
Lines matching "________________________________" (Outlook separator)
A basic approach: scan for the first line matching a reply header pattern, then treat everything from that point forward as quoted content.
const QUOTE_PATTERNS = [
  /^>/m, // ">" prefix, with or without a trailing space
  /^On .+ wrote:\s*$/m, // Gmail/Apple "On ... wrote:" (tolerates trailing whitespace)
  /^-{2,}\s*Original Message\s*-{2,}/im, // Outlook separator
  /^_{10,}/m, // Outlook underscore line
  /^From:\s.+/m, // Forwarded header block
  /^Sent from my /m, // Mobile signatures (not quotes, but noise)
];

// Cut the body at the earliest match of any pattern; everything after it is treated as quoted.
function extractNewContent(bodyText: string): string {
  let cutPoint = bodyText.length;
  for (const pattern of QUOTE_PATTERNS) {
    const match = pattern.exec(bodyText);
    if (match && match.index < cutPoint) {
      cutPoint = match.index;
    }
  }
  return bodyText.slice(0, cutPoint).trim();
}
HTML stripping
For HTML emails, target the wrapper elements:
const HTML_QUOTE_SELECTORS = [
  'div.gmail_quote', // Gmail
  'div.yahoo_quoted', // Yahoo
  'blockquote[type="cite"]', // Apple Mail
  'div#appendonsend', // Outlook web
  'div.moz-cite-prefix', // Thunderbird
  'div[id^="divRplyFwdMsg"]', // Outlook desktop
];
Remove elements matching these selectors from the parsed DOM. What remains is the new content.
Use a library
Do not build your own quoted content parser from scratch. The edge cases are numerous and provider-specific. Proven open-source options:
- email_reply_parser - GitHub's Ruby library, ported to multiple languages
- planer - Lever's JavaScript library (port of Mailgun's Talon)
- Talon - Mailgun's Python library for reply and signature detection
If you are building for production, start with one of these and customize for edge cases you encounter with your specific user base.
Thread context for AI agents
When feeding email threads to an AI agent or LLM, the way you structure thread context significantly affects response quality.
Build a thread timeline
Rather than dumping raw email content, build a structured timeline:
type ThreadEntry = {
  direction: 'inbound' | 'outbound';
  fromEmail: string;
  timestamp: string;
  newContent: string; // Quoted content stripped
  intent?: string; // Classification result if available
  metadata?: {
    hasAttachments: boolean;
    isForwarded: boolean;
    confidence: number; // Link confidence
  };
};
type ThreadContext = {
  threadId: string;
  subject: string;
  participants: string[];
  timeline: ThreadEntry[]; // Chronological order
  totalMessages: number;
  firstMessageAt: string;
  lastMessageAt: string;
};
Context management for long threads
Long email threads create context window pressure. Strategies for managing this:
- Strip quoted content aggressively. Only include the new content from each message. Quoted replies are redundant when you have the full thread.
- Summarize older messages. For threads over 10 messages, summarize the first N messages into a brief paragraph and include only the most recent 5-7 messages in full.
- Include metadata, skip noise. Signatures, disclaimers, and legal footers add tokens without value. Strip them.
- Prioritize recent context. The last 2-3 messages are usually the most relevant for deciding what to do next.
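The summarize-older, keep-recent strategy above can be sketched as a split over the timeline. `TimelineEntry`, `CompactedTimeline`, and `compactTimeline` are hypothetical names, timestamps are assumed to be ISO 8601 strings, and the summarization step itself is left to whatever summarizer you use.

```typescript
type TimelineEntry = { fromEmail: string; timestamp: string; newContent: string };

type CompactedTimeline = {
  toSummarize: TimelineEntry[]; // older messages, to be condensed into a brief paragraph
  recent: TimelineEntry[];      // most recent messages, kept in full
};

function compactTimeline(
  timeline: TimelineEntry[],
  summarizeOver = 10, // only compact threads longer than this
  keepRecent = 5,     // lower bound of the "5-7 most recent" range
): CompactedTimeline {
  // Sort a copy chronologically so the caller's array is left untouched.
  const ordered = [...timeline].sort(
    (a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp),
  );
  if (ordered.length <= summarizeOver) {
    return { toSummarize: [], recent: ordered };
  }
  const cut = Math.max(0, ordered.length - keepRecent);
  return {
    toSummarize: ordered.slice(0, cut),
    recent: ordered.slice(cut),
  };
}
```

Returning the older entries rather than discarding them keeps the choice of summarizer, and of whether to summarize at all, with the caller.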
Thread context enrichment
Beyond raw message content, enrich the thread context with:
- Contact history: previous threads with this person and total interactions
- Intent trajectory: how the conversation intent has evolved (interested -> question -> interested)
- Active incidents: any open support tickets, complaints, or deliverability issues for this contact
- Suppression status: whether the contact is suppressed and why
type EnrichedThreadContext = {
  timeline: ThreadEntry[];
  suppressionStatus: { suppressed: boolean; reasonCode?: string };
  activeIncidents: Array<{ id: string; type: string; status: string }>;
  lastClassification: { intent: string; confidence: number } | null;
};
This enriched context lets AI agents make informed decisions - they know not to send a follow-up to a suppressed contact, or to escalate when there are open incidents.
Thread integrity and security
Email threads are a trust signal. When someone replies in an existing thread, users (and AI agents) assume continuity. Attackers exploit this.
Thread hijacking
Thread hijacking (also called reply chain attacks) is when an attacker inserts themselves into an existing conversation thread. This is more dangerous than regular phishing because:
- The message appears inside a trusted conversation
- Recipients have existing context and are primed to trust replies in the thread
- Traditional phishing indicators (unknown sender, suspicious subject) are absent
Attack methods:
- Compromised account: attacker gains access to one participant's mailbox and replies directly in the thread
- Header spoofing: attacker crafts an email with forged In-Reply-To and References headers pointing to a real thread, making their message appear in the conversation
- Lookalike sender: attacker uses a domain similar to a participant's (e.g., examp1e.com vs example.com) and replies with correct threading headers
Detecting thread anomalies
Monitor thread history for these suspicious patterns:
Forged thread injection - a new sender appears in a thread they have never participated in. Flag any message where the from_email has not been seen in the thread history.
Intent flip from new sender - a new participant arrives with an intent that conflicts with the conversation's established direction. For example, a thread where the contact has been "interested" suddenly gets a reply from a different sender with intent "objection" or "legal".
Rapid intent flip - the same thread switches between conflicting intents (e.g., "interested" to "objection") within a short time window (30 minutes or less). This can indicate a compromised account being used to derail a conversation.
type ThreadAnomaly = {
  type: 'forged_thread_injection'
    | 'intent_flip_different_sender'
    | 'rapid_intent_flip';
  detail: string;
};
type ThreadAnomalyResult = {
  isAnomalous: boolean;
  anomalies: ThreadAnomaly[];
  severity: 'none' | 'warning' | 'critical';
};
Severity escalation:
- Warning: new sender in thread (could be a legitimate CC or forward)
- Critical: new sender with conflicting intent, or multiple anomalies detected simultaneously
Conflicting intent pairs
Not all intent changes are suspicious. "Interested" to "question" is natural. These pairs should be flagged as conflicts:
| Intent A | Intent B |
|---|---|
| interested | objection |
| interested | legal |
| interested | security |
| support | objection |
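The detection rules, severity levels, and conflict pairs above could be combined roughly as follows. This is a sketch under several assumptions: `Observed`, `isConflict`, and `detectAnomalies` are hypothetical names, `history` holds only previously classified inbound messages in chronological order, and the rapid-flip check applies only to senders already seen in the thread, which is one reading of the rule above.

```typescript
type Observed = { fromEmail: string; intent: string; at: Date };
type Anomaly = {
  type: "forged_thread_injection" | "intent_flip_different_sender" | "rapid_intent_flip";
  detail: string;
};
type AnomalyResult = {
  isAnomalous: boolean;
  anomalies: Anomaly[];
  severity: "none" | "warning" | "critical";
};

const CONFLICTS: Array<[string, string]> = [
  ["interested", "objection"],
  ["interested", "legal"],
  ["interested", "security"],
  ["support", "objection"],
];
const RAPID_WINDOW_MS = 30 * 60 * 1000;

// Conflict pairs are treated as symmetric.
function isConflict(a: string, b: string): boolean {
  return CONFLICTS.some(([x, y]) => (x === a && y === b) || (x === b && y === a));
}

function detectAnomalies(history: Observed[], incoming: Observed): AnomalyResult {
  const anomalies: Anomaly[] = [];
  const known = new Set(history.map((m) => m.fromEmail.toLowerCase()));
  const isNewSender = history.length > 0 && !known.has(incoming.fromEmail.toLowerCase());
  if (isNewSender) {
    anomalies.push({ type: "forged_thread_injection", detail: `unseen sender ${incoming.fromEmail}` });
  }

  // Compare against the most recent classified message in the thread.
  const last = history[history.length - 1];
  if (last !== undefined && isConflict(last.intent, incoming.intent)) {
    const detail = `${last.intent} -> ${incoming.intent}`;
    const gapMs = incoming.at.getTime() - last.at.getTime();
    if (isNewSender) {
      anomalies.push({ type: "intent_flip_different_sender", detail });
    } else if (gapMs >= 0 && gapMs <= RAPID_WINDOW_MS) {
      anomalies.push({ type: "rapid_intent_flip", detail });
    }
  }

  // Critical: new sender with conflicting intent, or more than one anomaly at once.
  const critical =
    anomalies.length > 1 || anomalies.some((a) => a.type === "intent_flip_different_sender");
  const severity = anomalies.length === 0 ? "none" : critical ? "critical" : "warning";
  return { isAnomalous: anomalies.length > 0, anomalies, severity };
}
```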
What to do when anomalies are detected
- Do not auto-respond. If the thread shows anomalies, hold the message for human review instead of letting an AI agent reply automatically.
- Check authentication. Verify SPF, DKIM, and DMARC results for the anomalous message. Forged threading with failed authentication is a strong indicator of an attack.
- Quarantine, do not reject. False positives happen (legitimate new participants get added to threads). Quarantine the message and flag it for review rather than silently dropping it.
- Log the anomaly. Record the anomaly type, severity, and details for security auditing and to improve detection over time.
Thread splitting and merging
Sometimes conversations need to be split into separate threads or merged together.
When to split
- A thread has diverged into two unrelated topics
- A support request contains multiple independent issues
- A reply introduces a new participant who should not see the full history
How to split
Create a new thread by sending a message with:
- A fresh Message-ID
- No In-Reply-To or References headers (or pointing only to the specific message you are branching from)
- A modified subject line (e.g., changing "Re: Q3 proposal" to "Re: Q3 proposal - budget questions")
Gmail will break this into a new conversation because the subject changed. Outlook may still group it if the subject is close enough - changing the subject more significantly forces a split in both clients.
When to merge
- Duplicate threads about the same issue were created by different participants
- A forwarded message started a parallel thread that should be part of the original
How to merge
You cannot force email clients to merge threads client-side. But in your application:
- Assign the same internal thread_id to messages from both conversations
- When presenting thread context, combine the timelines and sort chronologically
- For AI agents, present the merged context as a single conversation
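An application-side merge along these lines might be sketched as below. `MergeEntry` and `mergeThreads` are hypothetical names, and timestamps are assumed to be ISO 8601 strings.

```typescript
type MergeEntry = {
  messageId: string;
  threadId: string;
  timestamp: string;
  newContent: string;
};

function mergeThreads(targetThreadId: string, ...threads: MergeEntry[][]): MergeEntry[] {
  const seen = new Set<string>();
  const merged: MergeEntry[] = [];
  for (const thread of threads) {
    for (const entry of thread) {
      // The same message can appear in both conversations (e.g. after a forward).
      if (seen.has(entry.messageId)) continue;
      seen.add(entry.messageId);
      // Copy the entry under the shared internal thread ID; inputs are not mutated.
      merged.push({ ...entry, threadId: targetThreadId });
    }
  }
  // Present the result as one chronological conversation.
  return merged.sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
}
```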
Common mistakes
- Not storing outbound Message-IDs. If you do not persist the Message-ID from your sends, you cannot link inbound replies back to their thread. Store it at send time, not later.
- Setting In-Reply-To but not References. Some clients (especially Outlook) rely on References for thread ordering. Always set both headers on replies.
- Changing the subject line on replies. Adding "[TICKET-123]" or "[ACTION REQUIRED]" to the subject breaks threading in Gmail. Append tracking tokens to the body or use custom headers like X-Thread-ID instead.
- Treating all threads as linear. Email threads are trees, not lists. A single message can have multiple replies, creating branches. Your data model should support parent-child relationships, not just sequential ordering.
- Trusting thread membership implicitly. A message appearing in a thread does not mean it is from a trusted sender. Validate the sender against the thread's participant history, especially before AI agents act on the content.
- Building your own quoted reply parser. The edge cases across email clients, languages, and forwarding patterns are enormous. Use an established library (email_reply_parser, planer, talon) and customize from there.
- Feeding full thread content to LLMs. Quoted replies duplicate content across messages. Strip quotes before building thread context, or you waste tokens on redundant content and risk confusing the model with conflicting versions.
- Ignoring thread length limits. Gmail caps at 100 messages per thread. Your internal thread model should handle this gracefully - do not assume a thread ID maps to one conversation forever.
- Using provider-specific thread IDs as your primary key. Gmail's threadId is per-user. Outlook's ConversationId is subject-based. Neither is stable enough to be your canonical thread identifier. Generate your own internal thread IDs and map provider IDs to them.
- Not checking authentication on anomalous thread messages. A new sender in a thread who fails SPF/DKIM/DMARC is far more suspicious than one who passes. Always cross-reference thread anomalies with authentication results.
References
- RFC 5322 - Internet Message Format - defines Message-ID, In-Reply-To, and References headers
- RFC 5256 - IMAP SORT and THREAD Extensions - server-side threading algorithms (ORDEREDSUBJECT, REFERENCES)
- Gmail API: Manage threads - Gmail threading requirements and API
- Gmail: Group emails into conversations - Gmail conversation view behavior
- Microsoft Graph: ConversationId - Outlook threading via Microsoft Graph
- email_reply_parser (GitHub) - GitHub's reply content extraction library
- planer (Lever) - JavaScript library for stripping quoted replies
- SentinelOne: Email Reply Chain Attacks - thread hijacking attack patterns
- Palo Alto Unit 42: Emotet Thread Hijacking - real-world thread hijacking case study
- EmailEngine: Email Threading - practical guide to sending threaded emails via API