# Python (FastAPI / Starlette)
The `rais` Python package provides a `stream_response()` async generator that yields RAIS-compliant SSE strings. Use it with FastAPI's `StreamingResponse`, Starlette, or any async Python framework.
## Install
```bash
pip install rais[openai]
```

## FastAPI example
```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from rais import stream_response

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)

class Message(BaseModel):
    role: str
    content: str

class ChatRequest(BaseModel):
    messages: list[Message]

@app.post("/api/chat")
async def chat(request: ChatRequest):
    return StreamingResponse(
        stream_response(
            [m.model_dump() for m in request.messages],
            provider="openai",  # or "anthropic"
        ),
        media_type="text/event-stream",
        headers={"Cache-Control": "no-cache", "Connection": "keep-alive"},
    )
```

Run with: `uvicorn main:app --reload`
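Once the server is up, you can smoke-test the stream with a short client. A minimal sketch, assuming the `httpx` package is installed (the prompt is illustrative):

```python
import httpx

# LLM responses can outlast httpx's default 5-second timeout, so disable it.
with httpx.stream(
    "POST",
    "http://localhost:8000/api/chat",
    json={"messages": [{"role": "user", "content": "Hello"}]},
    timeout=None,
) as response:
    # Print each RAIS SSE event line as it arrives.
    for line in response.iter_lines():
        if line.startswith("data: "):
            print(line)
```

You should see `data: {...}` lines arrive incrementally as the model generates.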
## API
### `stream_response(messages, *, provider, ...)`
Async generator that yields `data: {...}\n\n` strings.
| Parameter | Type | Description |
|---|---|---|
| `messages` | `list[dict]` | Conversation history: `[{"role": ..., "content": ...}]` |
| `provider` | `"openai"` \| `"anthropic"` | Which LLM provider to use |
| `api_key` | `str \| None` | API key; falls back to the `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` env var |
| `model` | `str \| None` | Model name; defaults to `gpt-4o-mini` / `claude-sonnet-4-6` |
| `max_tokens` | `int` | Max tokens to generate (Anthropic only; default 1024) |
| `system` | `str \| None` | System prompt injected before the conversation |
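Combining the optional parameters, a call might look like this. A quick sketch; the prompt and values are illustrative, and `model` and `system` can be omitted entirely:

```python
from rais import stream_response

stream = stream_response(
    [{"role": "user", "content": "Summarize RAIS in one sentence."}],
    provider="anthropic",
    model="claude-sonnet-4-6",  # already the Anthropic default, shown explicitly
    max_tokens=2048,
    system="You are a concise assistant.",
)
# Consume it from any async context:
# async for line in stream: ...
```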
### `rais_event(data)`
Low-level helper — encode any dict as a RAIS SSE line:
```python
from rais import rais_event

rais_event({"type": "text", "text": "Hello"})
# → 'data: {"type": "text", "text": "Hello"}\n\n'
```

Use this when building your own streaming logic without `stream_response`.
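A full RAIS stream is just a sequence of these lines. A minimal sketch, using only the `text` and `done` event types that appear on this page:

```python
from rais import rais_event

events = [
    {"type": "text", "text": "Hel"},
    {"type": "text", "text": "lo"},
    {"type": "done"},
]
# Each call returns one complete 'data: ...\n\n' SSE line.
for event in events:
    print(rais_event(event), end="")
```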
## Custom streaming logic
For full control (local models, LangChain, LlamaIndex), implement the generator yourself using `rais_event`:
```python
import asyncio

from rais import rais_event

async def my_stream(messages):
    # Your logic here: local model, LangChain, etc.
    # generate_tokens is a placeholder for whatever produces text chunks.
    for token in generate_tokens(messages):
        yield rais_event({"type": "text", "text": token})
        await asyncio.sleep(0)  # yield control to the event loop between tokens
    yield rais_event({"type": "done"})

@app.post("/api/chat")
async def chat(request: ChatRequest):
    return StreamingResponse(
        my_stream([m.model_dump() for m in request.messages]),
        media_type="text/event-stream",
    )
```
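As a concrete variant, the same pattern over LangChain's streaming interface might look like this. A sketch assuming the `langchain-openai` package is installed; the model name is illustrative:

```python
from langchain_openai import ChatOpenAI

from rais import rais_event

llm = ChatOpenAI(model="gpt-4o-mini", streaming=True)

async def langchain_stream(messages):
    # ChatOpenAI accepts OpenAI-style {"role": ..., "content": ...} dicts.
    async for chunk in llm.astream(messages):
        if chunk.content:
            yield rais_event({"type": "text", "text": chunk.content})
    yield rais_event({"type": "done"})
```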
## Client

The React and Vue hooks work unchanged:
```js
const chat = useAIChat({ endpoint: 'http://localhost:8000/api/chat' })
```

Run FastAPI on port 8000 and your React app on port 3000, and add the CORS middleware as shown above.
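In development you may want to narrow the wide-open CORS policy from the example above. A sketch allowing only the React dev server; the origin and methods are illustrative:

```python
from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000"],  # React dev server only
    allow_methods=["POST"],
    allow_headers=["*"],
)
```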