Designing Better AI Chat: A Deep Dive (Part 2 of 2)
Citations, memory, thread navigation, export, and recovery patterns that turn AI chat from a demo into dependable infrastructure.

If Part I asked, “Do I understand what is happening and can I steer it?”, Part II asks the next question users carry after they close the tab:
Will this still make sense when I come back tomorrow?
And when something breaks, do I get a path forward?
This is the reliability layer of chat UX: evidence, memory controls, navigation for long threads, and recovery paths that keep work moving.
Part I covered the first layer. This article is the continuation.
1) Trust is not a vibe. It is evidence.
Confidence without inspectable proof feels like branding. Users do not need more text. They need a fast path from claim to source.
Pattern: Citations
Citations turn chat from monologue into something users can audit. Inline markers reduce blind trust and make review behavior more deliberate.
Design note: if you cannot cite a claim, say so explicitly.
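One way to make that design note concrete is a renderer that attaches an inline marker to every cited claim and labels every uncited one. The sketch below is illustrative only: the `Claim` type, the `render_claims` function, and the "[no source]" label are hypothetical names chosen for this example, not part of any pattern library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    text: str
    source_url: Optional[str] = None  # None means the claim could not be cited


def render_claims(claims: list[Claim]) -> tuple[str, list[str]]:
    """Return the answer text with inline markers plus an ordered source list."""
    sources: list[str] = []
    parts: list[str] = []
    for claim in claims:
        if claim.source_url is None:
            # Say so explicitly instead of implying support that does not exist.
            parts.append(f"{claim.text} [no source]")
            continue
        if claim.source_url not in sources:
            sources.append(claim.source_url)
        # A repeated source reuses its existing marker number.
        marker = sources.index(claim.source_url) + 1
        parts.append(f"{claim.text} [{marker}]")
    return " ".join(parts), sources
```

The numbered source list gives the UI something to link each marker to, so the path from claim to source is one click rather than a search.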
→ Explore the Citations pattern
Pattern: Confidence Score
Confidence cues help users decide where to verify closely and where to move fast. Confidence UI fails when the score never changes or never changes what the user does next.
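A confidence cue only earns its place if it changes what happens next. As a rough sketch, assuming the system exposes a confidence value between 0 and 1, the score can be mapped to a review behavior. The function name, the action names, and the 0.85 and 0.5 thresholds are placeholders for illustration and would need tuning against real outcomes.

```python
def review_action(confidence: float) -> str:
    """Map a confidence value in [0, 1] to a review behavior for the UI."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= 0.85:  # placeholder threshold
        return "show_answer"
    if confidence >= 0.5:  # placeholder threshold
        return "show_answer_with_verify_prompt"
    # Low confidence: do not let the answer stand without a source check.
    return "require_source_check"
```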
→ Explore the Confidence Score pattern
For trust-sensitive products, pair these with the Trust, sources & truthfulness hub and the Trust Stack article.
2) Memory is the difference between a session and a relationship
Stateful chat creates a new failure mode: the system remembers what users did not mean to teach it, or forgets what they assumed was stable.
Pattern: Memory Management
Treat memory as an explicit product surface: visible entries, provenance, edit/delete, and clear boundaries for scope and retention.
Design note: the win is negotiated recall, not perfect recall.
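Treating memory as a product surface implies a data model that carries provenance, scope, and retention alongside the remembered text. The following is a minimal in-memory sketch under those assumptions. `MemoryEntry`, `MemoryStore`, and every field name are hypothetical, and a real system would add persistence and access control.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class MemoryEntry:
    entry_id: str
    text: str
    source_message_id: str          # provenance: where the system learned this
    scope: str                      # e.g. "project:alpha" or "global"
    expires_at: Optional[datetime]  # None means keep until the user deletes it


class MemoryStore:
    def __init__(self) -> None:
        self._entries: dict[str, MemoryEntry] = {}

    def add(self, entry: MemoryEntry) -> None:
        self._entries[entry.entry_id] = entry

    def visible(self) -> list[MemoryEntry]:
        """Everything the system remembers, for display in a memory panel."""
        return list(self._entries.values())

    def edit(self, entry_id: str, new_text: str) -> None:
        self._entries[entry_id].text = new_text

    def delete(self, entry_id: str) -> None:
        del self._entries[entry_id]

    def recall(self, scope: str, now: datetime) -> list[MemoryEntry]:
        """Return only entries that are in scope and not past retention."""
        return [
            entry for entry in self._entries.values()
            if entry.scope in (scope, "global")
            and (entry.expires_at is None or entry.expires_at > now)
        ]
```

Keeping `recall` separate from `visible` reflects the negotiated-recall idea: users can see everything that is stored, while the model only receives what the current scope and retention rules allow.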
→ Explore the Memory Management pattern
See related patterns in Memory, personalization & data.
3) Long threads need navigation, not just scroll
Chat transcripts work until the task outlasts working memory. Then users need retrieval surfaces, not longer timelines.
Pattern: Conversation Summary
Summaries are recovery tools: return after a weekend, onboard a teammate, or re-anchor after a long branch. The best summaries invite correction.
→ Explore the Conversation Summary pattern
Pattern: Message Pinning
Pinning stabilizes long-running threads by preserving key instructions and decisions in a visible place. It lowers rediscovery cost for both users and teams.
→ Explore the Message Pinning pattern
4) Export and handoff are part of the product
High-stakes work rarely ends inside chat. If handoff is clumsy, users manually rebuild context in docs, tickets, and email.
Pattern: Error Recovery Strategies
Failure is inevitable. Good recovery gives users believable next steps: retry with context, narrow scope, or switch approach without losing the thread.
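The "retry, narrow, switch" sequence can be sketched as an ordered list of strategies that are tried in turn while a visible trail is recorded. This is an illustration under stated assumptions rather than a reference implementation: `run_with_recovery`, the strategy names, and the returned status values are all hypothetical.

```python
from typing import Callable


def run_with_recovery(
    request: str,
    strategies: list[tuple[str, Callable[[str], str]]],
) -> dict:
    """Try each recovery strategy in order and keep a trail the UI can show."""
    trail: list[str] = []
    for name, strategy in strategies:
        try:
            answer = strategy(request)
        except Exception as error:  # broad on purpose: this is a sketch
            trail.append(f"{name} failed: {error}")
            continue
        trail.append(f"{name} succeeded")
        return {"status": "ok", "answer": answer, "trail": trail}
    # Nothing worked: surface the trail and suggest escalation, not a dead end.
    return {"status": "needs_handoff", "answer": None, "trail": trail}
```

A caller might pass strategies such as `("retry_with_context", ...)`, `("narrow_scope", ...)`, and `("switch_approach", ...)`. Because the original request is never mutated, the thread survives every failed attempt.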
→ Explore the Error Recovery Strategies pattern
Pattern: Human Handoff
Sometimes the best UX is not another model response. It is a clean escalation path with the right context and clear resolution ownership.
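"The right context and clear resolution ownership" can be expressed as a small handoff record that refuses to go out incomplete. The sketch below assumes a hypothetical `HandoffPacket` type. The field names are illustrative, and the required set would differ by product.

```python
from dataclasses import dataclass, field


@dataclass
class HandoffPacket:
    thread_id: str
    summary: str         # what the user was trying to do
    open_question: str   # what still needs a human decision
    owner: str           # who is responsible for resolution
    attempted_steps: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Names of required fields that are empty, so the UI can block a bare handoff."""
        required = {
            "summary": self.summary,
            "open_question": self.open_question,
            "owner": self.owner,
        }
        return [name for name, value in required.items() if not value.strip()]
```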
→ Explore the Human Handoff pattern
More in Errors & recovery.
5) The system layer: safety, limits, and honesty
Trust is also defined by what users see when things are constrained.
Pattern: Rate Limit Warnings
Name the constraint in plain language and offer a next step. A specific warning is more trustworthy than a generic failure state.
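As a small illustration, a warning can be assembled from the name of the limit and the time until it resets. The function name, the wording, and the "lighter model" fallback are assumptions made for this example. The point is only that the message names the constraint and offers a next step.

```python
def rate_limit_message(limit_name: str, retry_after_seconds: int) -> str:
    """Build a plain-language warning that names the limit and a next step."""
    minutes, seconds = divmod(max(retry_after_seconds, 0), 60)
    if minutes and seconds:
        wait = f"{minutes} min {seconds} s"
    elif minutes:
        wait = f"{minutes} min"
    else:
        wait = f"{seconds} s"
    return (
        f"You have reached the {limit_name} limit. "
        f"You can send again in {wait}, or switch to a lighter model now."
    )
```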
→ Explore the Rate Limit Warnings pattern
Pattern: Model Selection UI
When defaults fail, users need a deliberate tradeoff control: faster vs deeper, cheaper vs stronger, lighter vs more capable.
→ Explore the Model Selection UI pattern
Related hub: Cost, models & limits.
A quick audit you can run this week
For your current chat experience, ask:
- Can users verify claims quickly with inspectable sources?
- Can users see and control what the system remembers over time?
- Can users recover key moments in long threads without endless scrolling?
- When things fail, can users recover, hand off, or continue without starting over?
If those answers are mostly yes, chat stops feeling like a clever demo and starts feeling like dependable infrastructure.
In Part I, we covered legibility, explicit control, output shaping, and branching. Together these two layers separate chat that performs from chat people can trust.