Architecture

Full-stack diagram

Your React app
  • Custom UI — Tailwind · shadcn/ui · any
  • <Chat /> — MessageList · ChatInput · MarkdownRenderer
  • useAIChat() — messages · loading · stop · sendMessage
    (built on useSyncExternalStore · AbortController · lazy client init)
  • Zustand store — messages[] · loading · error
  • SSE parser + normalizer — ReadableStream → StreamChunk

        ↕ HTTP POST · text/event-stream · AbortSignal ↕

Your server — any language, any framework
  • /api/chat — Next.js · Express · FastAPI · Go · Rails · Cloudflare Workers

        ↕ provider API calls ↕

LLM Providers
  • Anthropic — Claude 3.5 · 3 · Haiku
  • OpenAI — GPT-4o · o1 · mini
  • Groq — Llama · Mixtral · Gemma
  • Custom / Local — Ollama · vLLM · any endpoint

The key insight: your React components never know which LLM produced the stream. The hook speaks a three-event protocol (text, done, error) over SSE. Any server that produces those events works — regardless of language, cloud, or LLM provider.
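A minimal sketch of that contract from the server side. The text and done payloads match the shapes shown in the pipeline section below; the field carrying the error message is an assumption, and the token loop is a placeholder for a real provider call:

// Three-event SSE handler sketch — works as a Next.js App Router route
// or any Web-standard handler
export async function POST(_req: Request): Promise<Response> {
  const encoder = new TextEncoder()
  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      const send = (event: object) =>
        controller.enqueue(encoder.encode(`data: ${JSON.stringify(event)}\n\n`))
      try {
        // placeholder: forward real tokens from your LLM as they arrive
        for (const token of ['Hi', '!']) send({ type: 'text', text: token })
        send({ type: 'done' })
      } catch (err) {
        send({ type: 'error', error: String(err) }) // error field name assumed
      } finally {
        controller.close()
      }
    },
  })
  return new Response(stream, {
    headers: { 'Content-Type': 'text/event-stream' },
  })
}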


Provider abstraction

Browser
  • useAIChat() — React hook
  • MessageStore — messages · loading · error
  • AIClient — fetch + SSE reader, backed by one of three providers:
      CustomProvider — /api/chat endpoint
      OpenAIProvider — direct (dev only)
      AnthropicProvider — direct (dev only)

        ↕ HTTP POST · SSE stream · AbortSignal ↕

Your server (any stack)
  • /api/chat — Next.js · Express · FastAPI · Go · Rails
  • Any LLM — Anthropic · OpenAI · Groq · Mistral · local

CustomProvider (your /api/chat endpoint) is the recommended production path — your API key stays on the server, provider routing is your business logic, and the React layer is untouched when you switch models.

OpenAIProvider and AnthropicProvider call the vendor API directly from the browser. Safe for local prototypes, not for production.
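In code, the difference is one options object. The option names (endpoint, provider, apiKey) are the ones the hook watches for resets (see Packages below); the exact shapes here are illustrative, not a documented API:

import { useAIChat } from '@react-ai-stream/react'

// Production: CustomProvider behind your own endpoint, key stays server-side
function ProductionChat() {
  const { messages, sendMessage } = useAIChat({ endpoint: '/api/chat' })
  return null /* render messages, call sendMessage */
}

// Prototype only: the browser holds the vendor key
function PrototypeChat() {
  const { messages, sendMessage } = useAIChat({
    provider: 'openai',     // provider selector shape assumed
    apiKey: 'DEV_ONLY_KEY', // visible to anyone with devtools open
  })
  return null
}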


Streaming pipeline

  1. sendMessage("Hello") — user message added · loading → true · POST opens
  2. data: {"type":"text","text":"Hi"} — onToken fires · content grows in real time
  3. data: {"type":"text","text":"!"} — onToken fires · React re-renders once per chunk
  4. data: {"type":"done"} — onComplete(finalMessage) fires · loading → false
  5. stop() [optional] — AbortController.abort() · partial content preserved in messages
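Roughly how those steps map onto hook usage. The callback names come from the steps above; passing them as hook options is an assumption:

import { useAIChat } from '@react-ai-stream/react'

function Example() {
  const { loading, sendMessage, stop } = useAIChat({
    endpoint: '/api/chat',
    onToken: (token: string) => console.debug('steps 2–3:', token), // per chunk
    onComplete: (message: { content: string }) =>
      console.log('step 4:', message.content), // final assembled message
  })

  // step 1: sendMessage('Hello')  ·  step 5: stop() at any point mid-stream
  return null
}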

See Streaming lifecycle for full details on callbacks and abort behavior.


Packages

@react-ai-stream/core

No React dependency. Runs in Node.js, Deno, Bun, or any JS environment.

  • SSE parser — converts ReadableStream<Uint8Array> to AsyncIterable<string>, handles buffering on \n\n boundaries
  • Chunk normalizer — maps provider-specific event shapes to StreamChunk ({type, text}); see the sketch after this list
  • Providers — OpenAIProvider, AnthropicProvider, CustomProvider
  • Message store — Zustand store factory; createMessageStore() returns a fully isolated store per call
  • Abort utilities — thin wrapper around AbortController with isAbortError guard
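A hedged sketch of the normalizer idea: the OpenAI field paths are that vendor's real streaming format, but the function itself is illustrative, not the library's export, and the local StreamChunk type is reconstructed from the description above:

// StreamChunk as described above (error variant omitted for brevity)
type StreamChunk = { type: 'text'; text: string } | { type: 'done' }

// Map one raw OpenAI chat-completions stream event onto the shared shape
function normalizeOpenAI(event: any): StreamChunk | null {
  const choice = event.choices?.[0]
  if (typeof choice?.delta?.content === 'string')
    return { type: 'text', text: choice.delta.content }
  if (choice?.finish_reason) return { type: 'done' }
  return null // events carrying no text (role preludes, usage) are dropped
}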

@react-ai-stream/react

Depends on @react-ai-stream/core and React.

  • useAIChat — subscribes to the message store via useSyncExternalStore; orchestrates the streaming lifecycle; resets the client when endpoint, provider, apiKey, or the context client changes
  • AIChatProvider — React context for sharing a pre-built AIClient across a subtree
  • useStableCallback — stable function reference that always calls the latest closure, used internally to prevent stale closures in async stream loops
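useStableCallback is the standard latest-ref pattern; a sketch of that pattern, not necessarily the library's exact source:

import { useCallback, useRef } from 'react'

// Identity never changes across renders, so it can be captured by a
// long-running async stream loop, yet every call reaches the newest closure.
function useStableCallback<A extends unknown[], R>(
  fn: (...args: A) => R,
): (...args: A) => R {
  const ref = useRef(fn)
  ref.current = fn
  return useCallback((...args: A) => ref.current(...args), [])
}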

@react-ai-stream/ui

Depends on @react-ai-stream/react. Completely optional.

  • Chat — all-in-one: MessageList + ChatInput
  • MessageList — renders messages with typing indicator; auto-scrolls on messages or loading change
  • MessageBubble — individual message with role-based styling
  • ChatInput — auto-resizing textarea with send/stop
  • MarkdownRenderer — GFM via react-markdown + rehype-highlight; copy button reads textContent to avoid [object Object] from syntax-highlighted children
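A hypothetical composition of the three packages. The component names are the ones listed above, but the client prop and overall wiring are illustrative, not a documented API:

import { AIChatProvider } from '@react-ai-stream/react'
import { Chat } from '@react-ai-stream/ui'

// client: a pre-built AIClient shared via context (prop name assumed)
function Support({ client }: { client: unknown }) {
  return (
    <AIChatProvider client={client}>
      <Chat />
    </AIChatProvider>
  )
}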

State management

Each useAIChat instance owns a Zustand store created by createMessageStore():

{
  messages: Message[]
  loading: boolean
  error: string | null
  abortController: AbortController | null
}

useSyncExternalStore subscribes to this store. Only the component that called useAIChat re-renders when its store updates — no context propagation, no unnecessary renders in siblings. Three hook instances → three completely isolated stores.

Why Zustand instead of useReducer

Zustand's createStore (vanilla, no React) lets the store live outside the React tree. This means:

  • Store can be created without a component render
  • Multiple components can subscribe without a shared context
  • Store lifecycle is tied to the hook ref, not to a provider tree

This is what enables truly isolated chat instances without any wrapping <Provider>.
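A sketch of what that factory looks like with Zustand's vanilla API. The Message shape is assumed, and the real store also carries its update actions:

import { createStore } from 'zustand/vanilla'

type Message = { role: 'user' | 'assistant'; content: string } // shape assumed

function createMessageStore() {
  return createStore(() => ({
    messages: [] as Message[],
    loading: false,
    error: null as string | null,
    abortController: null as AbortController | null,
  }))
}

// Two calls, two fully isolated stores — no React, no Provider:
const a = createMessageStore()
const b = createMessageStore()
// a.subscribe(listener) and a.getState() are exactly the pair that
// useSyncExternalStore needs inside the hook.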


Abort semantics

Every sendMessage call creates a new AbortController. The signal is passed to fetch. When stop() is called or the component unmounts, abort() fires:

user calls stop()
  → abortController.abort()
    → fetch rejects with AbortError
      → stream loop catches isAbortError() → true
        → loading → false (no error surfaced)
          → partial response preserved in messages
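The shape of that catch path in code. This loop is illustrative; isAbortError is the guard listed in the core package, but the import specifier and surrounding function are stand-ins:

import { isAbortError } from '@react-ai-stream/core' // export named above

async function consume(
  chunks: AsyncIterable<{ type: string; text?: string }>,
  onToken: (text: string) => void,
  setLoading: (loading: boolean) => void,
) {
  try {
    for await (const chunk of chunks) {
      if (chunk.type === 'text' && chunk.text) onToken(chunk.text)
    }
  } catch (err) {
    if (!isAbortError(err)) throw err // real failures still surface
    // AbortError: swallowed — tokens already appended stay in messages
  } finally {
    setLoading(false) // loading → false on done, error, or abort alike
  }
}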

On the server side, the request's signal becomes aborted too (req.signal in Next.js edge/Node.js handlers). Passing it to the upstream fetch cancels the LLM call, avoiding wasted token generation:

const upstream = await fetch(LLM_URL, {
  signal: req.signal,  // forward the abort
  ...
})
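Filled out slightly, a hedged Next.js App Router sketch that forwards the signal and pipes the upstream SSE straight back. The URL and auth header are placeholders:

const LLM_URL = 'https://api.example.com/v1/chat' // placeholder

export async function POST(req: Request): Promise<Response> {
  const upstream = await fetch(LLM_URL, {
    method: 'POST',
    headers: { Authorization: `Bearer ${process.env.LLM_API_KEY}` },
    body: JSON.stringify(await req.json()),
    signal: req.signal, // client abort → upstream generation cancelled
  })
  // pipe through as-is, or re-normalize to the three-event protocol here
  return new Response(upstream.body, {
    headers: { 'Content-Type': 'text/event-stream' },
  })
}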

SSE buffering

Network chunks don't align with SSE event boundaries. A single reader.read() call may return half an event or three events concatenated. The parser buffers and splits correctly:

const reader = response.body!.getReader()
const decoder = new TextDecoder() // handles multi-byte chars split across chunks
let buf = ''
while (true) {
  const { done, value } = await reader.read()
  if (done) break
  buf += decoder.decode(value, { stream: true })
  const parts = buf.split('\n\n')
  buf = parts.pop() ?? ''   // keep the incomplete tail
  for (const part of parts) {
    // process complete events
  }
}

The critical invariant: buf = parts.pop() always preserves the incomplete trailing event. Setting buf = '' inside the loop (a common mistake) silently drops buffered content mid-chunk.
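Once a part is complete, turning it into an event is simple. A sketch, assuming the single-line data: framing shown in the pipeline section:

function parseEvent(part: string): unknown | null {
  // each complete part holds one event; pull its data: line and parse it
  const line = part.split('\n').find((l) => l.startsWith('data: '))
  return line ? JSON.parse(line.slice('data: '.length)) : null
}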