Quote Stripping
How AgentPost extracts clean reply text from email threads and when to use it
When someone replies to an email, most email clients append the full previous conversation below the new reply. This "quoted text" makes individual messages hard to parse programmatically -- especially for AI agents that need to understand only the new content.
AgentPost automatically strips quoted text and provides the clean reply in the extracted_text field on every message.
How It Works
When AgentPost processes an inbound email, it analyzes the message body to separate new content from quoted previous messages. The result is available in two fields:
| Field | Contains | Use Case |
|---|---|---|
| text_body | Full message body including all quoted text | Archival, display, compliance |
| html_body | Full HTML body including all quoted text | Rendering the complete email |
| extracted_text | Only the new reply text, quotes stripped | AI agent processing, summarization |
For AI agent workflows, extracted_text is almost always what you want. It contains only the new information the sender added, without the noise of the entire conversation history.
Quote Detection Patterns
AgentPost recognizes the quote markers used by major email clients, including Gmail, Outlook, Apple Mail, and Thunderbird:
Gmail
Gmail uses a <div class="gmail_quote"> wrapper in HTML and a line starting with On ... wrote: in plain text:
Thanks for the quick reply! The integration is working now.
On Wed, Mar 8, 2026 at 2:30 PM Support <[email protected]> wrote:
> Hi there! Try updating your webhook URL to use HTTPS.
> Let me know if that fixes it.
extracted_text result: Thanks for the quick reply! The integration is working now.
Outlook
Outlook uses a separator line with dashes in plain text and a styled <div> in HTML:
Got it, I'll update the DNS records today.
-----Original Message-----
From: Support <[email protected]>
Sent: Wednesday, March 8, 2026 2:30 PM
To: [email protected]
Subject: Re: Domain Verification
Please add the following CNAME records to your DNS...
extracted_text result: Got it, I'll update the DNS records today.
Apple Mail
Apple Mail uses > prefix characters for quoted text and an On ... wrote: header:
Perfect, that worked!
> On Mar 8, 2026, at 2:30 PM, Support <[email protected]> wrote:
>
> Try clearing your DNS cache and checking again in a few minutes.
extracted_text result: Perfect, that worked!
Thunderbird
Thunderbird uses > prefixes similar to Apple Mail, with an On ... wrote: header:
I've updated the API key. Works now.
On 3/8/2026 2:30 PM, Support wrote:
> Your API key may have expired. Generate a new one from the
> dashboard and update your configuration.
extracted_text result: I've updated the API key. Works now.
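To make the plain-text patterns above concrete, here is a minimal sketch of a header-and-prefix heuristic. It is not AgentPost's actual implementation, which this page does not describe; the function name `stripQuotedText`, the pattern list, and the sample address are assumptions made for illustration.

```typescript
// Illustrative heuristic only; not AgentPost's real detection logic.
const QUOTE_HEADER_PATTERNS: RegExp[] = [
  /^On .+ wrote:$/,                        // Gmail / Apple Mail / Thunderbird attribution line
  /^-{2,}\s*Original Message\s*-{2,}$/i,   // Outlook separator
  /^-{2,}\s*Forwarded message\s*-{2,}$/i,  // forwarding header
];

function stripQuotedText(textBody: string): string {
  const kept: string[] = [];
  for (const line of textBody.split(/\r?\n/)) {
    const trimmed = line.trim();
    const isHeader = QUOTE_HEADER_PATTERNS.some((p) => p.test(trimmed));
    const isQuotedLine = trimmed.startsWith('>');
    // Treat everything from the first quote marker onward as quoted text.
    if (isHeader || isQuotedLine) break;
    kept.push(line);
  }
  return kept.join('\n').trim();
}

// Usage with a Gmail-style reply (hypothetical address):
const gmailReply = [
  'Thanks for the quick reply! The integration is working now.',
  'On Wed, Mar 8, 2026 at 2:30 PM Support <support@example.com> wrote:',
  '> Hi there! Try updating your webhook URL to use HTTPS.',
].join('\n');
console.log(stripQuotedText(gmailReply));
// prints "Thanks for the quick reply! The integration is working now."
```

Because the sketch stops at the first marker, it truncates inline replies and misses attribution lines that wrap across two lines, which is part of why heuristic detection cannot be fully accurate.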
Edge Cases
Inline Replies
Some users reply inline within the quoted text rather than at the top. Inline replies are difficult to detect reliably because the new content is interspersed with quoted content:
> Can you send me the error log?
Here it is:
Error: Connection refused at port 5432
> Also, what version are you running?
Version 2.1.3
In this case, extracted_text may include partial quoted text or miss some inline replies. For conversations where inline replies are common, consider using text_body and parsing it yourself.
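If you do parse text_body yourself, one possible approach is to pair each run of quoted lines with the unquoted text that follows it. The sketch below assumes quoted lines start with `>`; the `InlineExchange` type and `parseInlineReplies` function are hypothetical names, not part of the AgentPost SDK.

```typescript
interface InlineExchange {
  quoted: string; // the quoted line(s) being answered
  reply: string;  // the sender's text that follows them
}

function parseInlineReplies(textBody: string): InlineExchange[] {
  const exchanges: InlineExchange[] = [];
  let quoted: string[] = [];
  let reply: string[] = [];

  const flush = (): void => {
    if (quoted.length > 0 || reply.length > 0) {
      exchanges.push({
        quoted: quoted.join('\n').trim(),
        reply: reply.join('\n').trim(),
      });
    }
    quoted = [];
    reply = [];
  };

  for (const line of textBody.split(/\r?\n/)) {
    if (line.startsWith('>')) {
      // A quoted line after real reply text starts the next exchange;
      // blank lines alone do not count as a reply.
      if (reply.some((l) => l.trim() !== '')) flush();
      else reply = [];
      quoted.push(line.replace(/^>\s?/, ''));
    } else {
      reply.push(line);
    }
  }
  flush();
  return exchanges;
}
```

For the example above, this yields two exchanges: the error-log question paired with the two lines that answer it, and the version question paired with "Version 2.1.3". Text written before the first quoted line ends up in an exchange with an empty `quoted` field.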
Forwarded Messages
Forwarded messages include a forwarding header that AgentPost recognizes:
Check out this issue from a customer.
---------- Forwarded message ---------
From: [email protected]
Date: Wed, Mar 8, 2026
Subject: API Error
I'm getting a 500 error when creating an inbox...
extracted_text result: Check out this issue from a customer.
Signatures
Email signatures are not stripped by quote detection. If the sender has a signature above their quoted text, it appears in extracted_text:
Thanks for the help!
--
Jane Smith
Engineering Lead, AcmeCo
extracted_text result: Thanks for the help!\n\n--\nJane Smith\nEngineering Lead, AcmeCo
If you need to strip signatures, implement additional processing on the extracted_text value. Common patterns include detecting -- on its own line or using a signature detection library.
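As a rough illustration of the delimiter approach, the sketch below cuts the text at the last line that consists only of `--` (optionally followed by a space). The `stripSignature` helper is a hypothetical name, and the heuristic will not catch signatures that lack a delimiter line.

```typescript
function stripSignature(extractedText: string): string {
  const lines = extractedText.split(/\r?\n/);

  // Scan from the end so that only the final delimiter is treated as the
  // start of the signature block.
  let delimiterIndex = -1;
  for (let i = lines.length - 1; i >= 0; i--) {
    if (/^--\s?$/.test(lines[i])) {
      delimiterIndex = i;
      break;
    }
  }

  // No delimiter found: return the text unchanged.
  if (delimiterIndex === -1) return extractedText;

  return lines.slice(0, delimiterIndex).join('\n').trimEnd();
}

console.log(stripSignature('Thanks for the help!\n\n--\nJane Smith\nEngineering Lead, AcmeCo'));
// prints "Thanks for the help!"
```

Scanning from the end is a deliberate choice here: it removes the least text if an earlier line in the message happens to be `--`.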
When to Use extracted_text vs Raw Body
| Scenario | Recommended Field |
|---|---|
| AI agent processing a support ticket | extracted_text |
| Generating a summary of a conversation | extracted_text |
| Displaying the full email in a web UI | html_body |
| Archiving emails for compliance | text_body |
| Searching email content | text_body |
| Routing emails based on reply content | extracted_text |
| Detecting if a reply contains an attachment | text_body + has_attachments |
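The routing row can be sketched as follows. The `InboundMessage` shape only mirrors the field names used on this page, and the queue names and keyword lists are invented for illustration; treat all of them as assumptions rather than part of the AgentPost API.

```typescript
// Assumed shape, based on the field names documented on this page.
interface InboundMessage {
  extracted_text: string | null;
  text_body: string;
}

// Hypothetical queue names.
type Queue = 'billing' | 'technical' | 'general';

function routeReply(message: InboundMessage): Queue {
  // Prefer the clean reply so that keywords in quoted history
  // do not influence routing; fall back to the full body if it is null.
  const reply = (message.extracted_text ?? message.text_body).toLowerCase();

  if (/\b(invoice|refund|billing)\b/.test(reply)) return 'billing';
  if (/\b(error|api|webhook|dns)\b/.test(reply)) return 'technical';
  return 'general';
}
```

Routing on text_body instead would match keywords anywhere in the thread, so a billing reply that quotes an earlier error report could land in the wrong queue.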
Working with extracted_text
import AgentPost from '@agentpost/sdk';
// Assumed framework: the handler's c.req / c.json calls match Hono's context API
import { Hono } from 'hono';

const client = new AgentPost({ apiKey: 'ap_sk_live_your_key_here' });
const app = new Hono();

// Webhook handler for incoming messages
app.post('/webhooks/agentpost', async (c) => {
  const event = JSON.parse(await c.req.text());

  if (event.type === 'message.received') {
    const { extracted_text, text_body, message_id, inbox_id } = event.data;

    // Use extracted_text for AI processing (clean reply only);
    // fall back to the full body when it is missing
    const reply = extracted_text || text_body;

    // yourAIAgent is a placeholder for your own agent implementation
    const aiResponse = await yourAIAgent.process(reply);

    await client.messages.reply(inbox_id, message_id, {
      text_body: aiResponse,
    });
  }

  return c.json({ received: true });
});

Note: Always fall back to text_body when extracted_text is null. This happens when AgentPost cannot detect any quoted text (e.g., the first message in a thread, or an unusual email format).
Limitations
- Not 100% accurate: Quote detection uses heuristic pattern matching, not semantic understanding. Unusual email clients or heavily customized templates may not be detected correctly.
- No inline reply extraction: Inline replies mixed within quoted text are not separated out reliably.
- No signature stripping: Email signatures are included in extracted_text. Implement your own signature detection if needed.
- HTML-aware: Quote detection works on both HTML and plain text bodies. HTML detection is generally more reliable because email clients use consistent CSS classes for quoted content.