Architecture
Full-stack diagram
The key insight: your React components never know which LLM produced the stream. The hook speaks a three-event protocol (text, done, error) over SSE. Any server that produces those events works — regardless of language, cloud, or LLM provider.
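To make the protocol concrete, here is a minimal sketch of what those events might look like on the wire. The `{type, text}` chunk shape mirrors the `StreamChunk` normalizer described under Packages; the helper name and exact JSON framing are illustrative assumptions, not the library's actual API:

```ts
// Hypothetical helper: serialize one protocol event as an SSE frame.
// The "data:" line plus blank-line terminator is standard SSE framing.
type StreamChunk =
  | { type: 'text'; text: string }
  | { type: 'done' }
  | { type: 'error'; text: string }

function toSSEFrame(chunk: StreamChunk): string {
  return `data: ${JSON.stringify(chunk)}\n\n`
}

// A complete response is just a sequence of frames:
const frames = [
  toSSEFrame({ type: 'text', text: 'Hel' }),
  toSSEFrame({ type: 'text', text: 'lo' }),
  toSSEFrame({ type: 'done' }),
].join('')
```

Any server that emits frames in this spirit satisfies the hook, whatever produced the tokens upstream.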
Provider abstraction
CustomProvider (your /api/chat endpoint) is the recommended production path — your API key stays on the server, provider routing is your business logic, and the React layer is untouched when you switch models.
OpenAIProvider and AnthropicProvider call the vendor API directly from the browser. Safe for local prototypes, not for production.
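What makes the three providers interchangeable is a shared streaming contract. The interface below is a sketch of what that contract plausibly looks like — the name `ChatProvider`, the method signature, and the `EchoProvider` toy implementation are assumptions for illustration, not the library's real API:

```ts
// Assumed provider contract (sketch): every provider yields
// normalized chunks, so the React layer never sees vendor formats.
interface ChatProvider {
  stream(
    messages: { role: 'user' | 'assistant'; content: string }[],
    signal?: AbortSignal,
  ): AsyncIterable<{ type: 'text' | 'done' | 'error'; text?: string }>
}

// A trivial in-memory implementation shows the shape:
class EchoProvider implements ChatProvider {
  async *stream(messages: { role: 'user' | 'assistant'; content: string }[]) {
    // Echo the last message back as a single text chunk, then finish.
    yield { type: 'text' as const, text: messages[messages.length - 1].content }
    yield { type: 'done' as const }
  }
}
```

Swapping `OpenAIProvider` for `AnthropicProvider` (or your own `CustomProvider`) changes only which class fulfills this contract.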
Streaming pipeline
See Streaming lifecycle for full details on callbacks and abort behavior.
Packages
@react-ai-stream/core
No React dependency. Runs in Node.js, Deno, Bun, or any JS environment.
- SSE parser — converts `ReadableStream<Uint8Array>` to `AsyncIterable<string>`, handles buffering on `\n\n` boundaries
- Chunk normalizer — maps provider-specific event shapes to `StreamChunk` (`{type, text}`)
- Providers — `OpenAIProvider`, `AnthropicProvider`, `CustomProvider`
- Message store — Zustand store factory; `createMessageStore()` returns a fully isolated store per call
- Abort utilities — thin wrapper around `AbortController` with `isAbortError` guard
@react-ai-stream/react
Depends on @react-ai-stream/core and React.
- `useAIChat` — subscribes to the message store via `useSyncExternalStore`; orchestrates the streaming lifecycle; resets the client when `endpoint`, `provider`, `apiKey`, or the context client changes
- `AIChatProvider` — React context for sharing a pre-built `AIClient` across a subtree
- `useStableCallback` — stable function reference that always calls the latest closure; used internally to prevent stale closures in async stream loops
@react-ai-stream/ui
Depends on @react-ai-stream/react. Completely optional.
- `Chat` — all-in-one: `MessageList` + `ChatInput`
- `MessageList` — renders messages with typing indicator; auto-scrolls on `messages` or `loading` change
- `MessageBubble` — individual message with role-based styling
- `ChatInput` — auto-resizing textarea with send/stop
- `MarkdownRenderer` — GFM via `react-markdown` + `rehype-highlight`; copy button reads `textContent` to avoid `[object Object]` from syntax-highlighted children
State management
Each useAIChat instance owns a Zustand store created by createMessageStore():
```ts
{
  messages: Message[]
  loading: boolean
  error: string | null
  abortController: AbortController | null
}
```

`useSyncExternalStore` subscribes to this store. Only the component that called `useAIChat` re-renders when its store updates — no context propagation, no unnecessary renders in siblings. Three hook instances → three completely isolated stores.
Why Zustand instead of useReducer
Zustand's createStore (vanilla, no React) lets the store live outside the React tree. This means:
- Store can be created without a component render
- Multiple components can subscribe without a shared context
- Store lifecycle is tied to the hook ref, not to a provider tree
This is what enables truly isolated chat instances without any wrapping <Provider>.
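The vanilla-store pattern is easy to see without Zustand itself. Below is a dependency-free sketch of what `createMessageStore()` plausibly does — a subscribable store living entirely outside React. The state shape matches the one listed above; the implementation details are assumptions:

```ts
// Minimal external-store sketch exposing the same surface as
// Zustand's vanilla createStore: getState / setState / subscribe.
// Each call returns a fully isolated store -- no context, no provider.
type State = {
  messages: { role: string; content: string }[]
  loading: boolean
  error: string | null
}

function createMessageStore() {
  let state: State = { messages: [], loading: false, error: null }
  const listeners = new Set<() => void>()
  return {
    getState: () => state,
    setState: (partial: Partial<State>) => {
      state = { ...state, ...partial } // immutable update, then notify
      listeners.forEach((l) => l())
    },
    subscribe: (l: () => void) => {
      listeners.add(l)
      return () => listeners.delete(l) // unsubscribe
    },
  }
}

// Two calls -> two isolated stores; updating one never touches the other:
const a = createMessageStore()
const b = createMessageStore()
a.setState({ loading: true })
```

On the React side, `useSyncExternalStore(store.subscribe, store.getState)` is all that is needed to bridge such a store into a component.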
Abort semantics
Every sendMessage call creates a new AbortController. The signal is passed to fetch. When stop() is called or the component unmounts, abort() fires:
```
user calls stop()
→ abortController.abort()
→ fetch rejects with AbortError
→ stream loop catches isAbortError() → true
→ loading → false (no error surfaced)
→ partial response preserved in messages
```

On the server side, the request's signal becomes aborted too (`req.signal` in Next.js edge/Node.js handlers). Passing it to the upstream fetch cancels the LLM call, avoiding wasted token generation:
```ts
const upstream = await fetch(LLM_URL, {
  signal: req.signal, // forward the abort
  ...
})
```

SSE buffering
Network chunks don't align with SSE event boundaries. A single reader.read() call may return half an event or three events concatenated. The parser buffers and splits correctly:
```ts
let buf = ''
while (true) {
  const { done, value } = await reader.read()
  if (done) break
  buf += decoder.decode(value, { stream: true })
  const parts = buf.split('\n\n')
  buf = parts.pop() ?? '' // keep the incomplete tail
  for (const part of parts) {
    // process complete events
  }
}
```

The critical invariant: `buf = parts.pop()` always preserves the incomplete trailing event. Setting `buf = ''` inside the loop (a common mistake) silently drops buffered content mid-chunk.
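The invariant is easy to demonstrate in isolation. Here is a standalone sketch of just the buffering step (the function name is illustrative, not the library's API), fed fragments that deliberately split an event mid-JSON:

```ts
// Stateful splitter: feed it arbitrary network fragments, get back
// only complete SSE events; the incomplete tail stays buffered.
function makeSSESplitter() {
  let buf = ''
  return (chunk: string): string[] => {
    buf += chunk
    const parts = buf.split('\n\n')
    buf = parts.pop() ?? '' // keep the incomplete tail
    return parts
  }
}

const feed = makeSSESplitter()
const first = feed('data: {"type":"text","te')  // mid-event: nothing complete yet
const second = feed('xt":"Hi"}\n\ndata: {"ty')  // completes event 1, buffers event 2
```

After the second call, exactly one intact event comes out; the partial `data: {"ty` remains buffered for the next chunk.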