Most AI chat implementations suffer from long wait times. Users stare at loading spinners while the entire response generates. What if you could stream responses word-by-word, like ChatGPT?
In this tutorial, you'll learn how to use Graphlit's streamAgent to build a real-time streaming chat interface.
What You'll Build
By the end of this tutorial, you'll have:
- ✅ AI specification - Model configuration (GPT-4, Claude, Gemini, etc.)
- ✅ Conversation management - Multi-turn chat with context
- ✅ Real-time streaming - Word-by-word token display
- ✅ Next.js API route - Server-side streaming endpoint
- ✅ React component - Complete chat UI with streaming
Why streamAgent?
The streamAgent method provides:
- Real-time token streaming - Display text as it generates
- Built-in conversation management - Maintains chat history automatically
- Automatic context - Includes data from your indexed sources
- Tool calling support - AI can execute functions
- Token metrics - Track usage and throughput
Let's build it.
Prerequisites
Install the required packages:
npm install graphlit-client
npm install @microsoft/fetch-event-source # Enhanced SSE client
Why @microsoft/fetch-event-source?
The native browser EventSource API doesn't support POST requests or custom headers. This library provides:
- POST request support (needed to send chat data)
- Custom headers (for authentication)
- Better error handling and retry logic
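Here's the shape of a call with this library - a minimal sketch using the /api/chat endpoint we'll build in Step 4:

import { fetchEventSource } from '@microsoft/fetch-event-source';

// POST with a JSON body and custom headers - neither works with native EventSource
await fetchEventSource('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ prompt: 'Hello' }),
  onmessage(ev) {
    console.log('Received:', ev.data);
  },
});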
Set up your environment variables:
GRAPHLIT_ORGANIZATION_ID=your_org_id
GRAPHLIT_ENVIRONMENT_ID=your_env_id
GRAPHLIT_JWT_SECRET=your_jwt_secret
Authentication Model:
Graphlit uses JWT-based authentication with three credentials:
- Organization ID: Your top-level account
- Environment ID: Your workspace (dev, staging, prod)
- JWT Secret: Used to sign tokens for API access
Get these from portal.graphlit.dev → Settings → Credentials.
Note: The SDK handles token generation automatically - you just provide the secret.
Step 1: Create a Specification
A specification defines your AI model configuration - which model to use, system prompt, temperature, etc.
import { Graphlit, Types } from 'graphlit-client';
const client = new Graphlit(
process.env.GRAPHLIT_ORGANIZATION_ID!,
process.env.GRAPHLIT_ENVIRONMENT_ID!,
process.env.GRAPHLIT_JWT_SECRET!
);
// Create a specification for GPT-4.1 Mini
const specificationInput: Types.SpecificationInput = {
type: Types.SpecificationTypes.Completion,
serviceType: Types.ModelServiceTypes.OpenAi,
name: 'GPT-4.1 Mini Chat',
systemPrompt: 'You are a helpful AI assistant with access to the user\'s knowledge base.',
openAI: {
model: Types.OpenAiModels.Gpt41Mini_1024K,
temperature: 0.5, // 0.0-1.0: Lower = focused, Higher = creative
},
};
const result = await client.createSpecification(specificationInput);
const specificationId = result.createSpecification?.id;
console.log('Created specification:', specificationId);
Other Model Examples
Anthropic Claude:
{
type: Types.SpecificationTypes.Completion,
serviceType: Types.ModelServiceTypes.Anthropic,
name: 'Claude 4.5 Sonnet Chat',
systemPrompt: 'You are a helpful AI assistant.',
anthropic: {
model: Types.AnthropicModels.Claude_4_5Sonnet,
temperature: 0.7,
},
}
Google Gemini:
{
type: Types.SpecificationTypes.Completion,
serviceType: Types.ModelServiceTypes.Google,
name: 'Gemini 2.5 Flash Chat',
systemPrompt: 'You are a helpful AI assistant.',
google: {
model: Types.GoogleModels.Gemini_2_5FlashThinking,
temperature: 0.5,
},
}
xAI Grok:
{
type: Types.SpecificationTypes.Completion,
serviceType: Types.ModelServiceTypes.Xai,
name: 'Grok 4 Chat',
systemPrompt: 'You are a helpful AI assistant.',
xai: {
model: Types.XaiModels.Grok_4,
temperature: 0.5,
},
}
Temperature Parameter:
The temperature setting (0.0-1.0) controls response randomness:
- Lower (0.0-0.3): More focused, deterministic responses
- Medium (0.4-0.7): Balanced creativity
- Higher (0.8-1.0): More creative, varied responses
Step 2: Create a Conversation
A conversation maintains the chat history and context. You can reuse it across multiple messages.
const conversationInput: Types.ConversationInput = {
name: 'My Chat Session',
specification: { id: specificationId },
type: Types.ConversationTypes.Content,
};
const conversationResult = await client.createConversation(conversationInput);
const conversationId = conversationResult.createConversation?.id;
console.log('Created conversation:', conversationId);
Optional: If you don't provide a conversationId to streamAgent, it will create one automatically.
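In that case, capture the auto-created ID from the conversation_started event so follow-up messages land in the same conversation - a minimal sketch:

let conversationId: string | undefined;

const handler = async (event: any) => {
  if (event.type === 'conversation_started') {
    // Save the auto-created conversation for follow-up turns
    conversationId = event.conversationId;
  }
};

await client.streamAgent('Hello', handler);

// Continues the same conversation
await client.streamAgent('Tell me more', handler, conversationId);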
Understanding Context
Graphlit can ingest and index content from multiple sources:
- Documents (PDFs, Word, etc.)
- Web pages
- Email and Slack messages
- GitHub repositories
- And 30+ other sources
When you use streamAgent, the AI automatically searches this indexed content to find relevant information for your query. You don't need to manually pass context - Graphlit handles semantic search and retrieval automatically.
Example: If you've indexed your company docs and ask "What's our refund policy?", the AI will find and reference the relevant documentation automatically.
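Sketched in code (the handbook URL is a placeholder; ingestion is covered in more detail in the Advanced section at the end of this post):

// One-time setup: index a company document
await client.ingestUri('https://example.com/employee-handbook.pdf');

// Later, in chat - Graphlit retrieves the relevant passages automatically
await client.streamAgent(
  "What's our refund policy?",
  eventHandler,
  conversationId,
  { id: specificationId }
);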
Step 3: Understanding streamAgent
The streamAgent method signature:
await client.streamAgent(
prompt: string, // User's message
onEvent: (event: AgentStreamEvent) => void, // Event handler
conversationId?: string, // Optional conversation ID
specification?: { id: string }, // Optional spec override
tools?: Types.ToolDefinition[], // Optional tool definitions
toolHandler?: (toolCall: ToolCall) => any, // Optional tool executor
options?: StreamOptions // Optional streaming options
);
Event Types
streamAgent emits these events:
- conversation_started - The conversation has begun
- message_update - New tokens have streamed in
- message_completed - The response is finished
- tool_call_started - The AI is invoking a tool
- tool_call_completed - A tool call has returned
- error - Something went wrong
Event Flow
Here's what happens when you send a message:
1. conversation_started - Fires once when conversation begins
2. message_update - Fires repeatedly (10-100+ times) as each token/word generates
3. message_completed - Fires once when response is done
Optional (if tools are used):
- tool_call_started - AI decided to use a tool
- tool_call_completed - Tool execution finished, AI continues with result
The message_update event fires frequently - potentially every few milliseconds as tokens stream in. This is what creates the "typing" effect.
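To make the handlers below easier to follow, here's a rough sketch of the event shapes as this tutorial uses them - inferred from the examples in this post, not copied from the SDK, so check the package's exported types for the authoritative definitions:

// Approximate shapes, inferred from the handlers in this tutorial
type AgentStreamEvent =
  | { type: 'conversation_started'; conversationId: string }
  | {
      type: 'message_update' | 'message_completed';
      message: {
        message: string;         // accumulated response text so far
        tokens?: number;         // tokens generated
        throughput?: number;     // tokens/sec
        completionTime?: number; // seconds elapsed
      };
    }
  | {
      type: 'tool_call_started' | 'tool_call_completed';
      toolCall: { name: string; arguments: any; result?: any };
    }
  | { type: 'error'; error: string };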
Why Server-Sent Events (SSE)?
Streaming responses require a way to push data from server to client in real-time. We use Server-Sent Events (SSE) because:
- One-directional: Server pushes to client (perfect for AI responses)
- Built-in reconnection: Automatically reconnects if connection drops
- Text-based: Simple to implement, works over HTTP
- Browser native: No WebSocket complexity needed
SSE works over a regular HTTP connection but keeps it open, allowing the server to push multiple events over time.
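Concretely, each event our API route (Step 4) emits is just a block of text on the wire: an event name line, a data line with the JSON payload, and a blank line. Illustrative frames (payloads abbreviated):

event: message_update
data: {"type":"message_update","message":{"message":"Quantum computers use qubits...","tokens":12}}

event: message_completed
data: {"type":"message_completed","message":{"message":"...","tokens":148}}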
Basic Example
const prompt = "Explain quantum computing in simple terms";
await client.streamAgent(
prompt,
async (event) => {
switch (event.type) {
case 'conversation_started':
console.log('📝 Started:', event.conversationId);
break;
case 'message_update':
// This fires multiple times as tokens stream in
console.log('💬 Content:', event.message.message);
console.log('📊 Tokens:', event.message.tokens);
console.log('⚡ Speed:', event.message.throughput, 'tokens/sec');
break;
case 'message_completed':
console.log('✅ Completed');
console.log('📝 Final:', event.message.message);
console.log('💰 Tokens:', event.message.tokens);
console.log('⏱️ Time:', event.message.completionTime, 'sec');
break;
case 'error':
console.error('❌ Error:', event.error);
break;
}
},
conversationId, // Optional - pass existing conversation
{ id: specificationId } // Optional - override specification
);
Step 4: Next.js API Route (Server-Side)
Create a server-side endpoint that streams responses using Server-Sent Events (SSE).
This route uses the web-standard ReadableStream API (available in Node.js) to create a stream of Server-Sent Events. The stream stays open, pushing events as they arrive from Graphlit's streamAgent.
Key parts:
- ReadableStream - Creates a stream that can push data over time
- controller.enqueue() - Pushes a new event to the client
- controller.close() - Closes the stream when done
File: app/api/chat/route.ts
import { NextRequest } from 'next/server';
import { Graphlit } from 'graphlit-client';
export const runtime = 'nodejs';
export async function POST(req: NextRequest) {
try {
const { prompt, conversationId, specificationId } = await req.json();
const client = new Graphlit(
process.env.GRAPHLIT_ORGANIZATION_ID!,
process.env.GRAPHLIT_ENVIRONMENT_ID!,
process.env.GRAPHLIT_JWT_SECRET!
);
// Create an SSE stream
const encoder = new TextEncoder();
const stream = new ReadableStream({
async start(controller) {
try {
await client.streamAgent(
prompt,
async (event) => {
// Send each event as Server-Sent Event
const data = `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
controller.enqueue(encoder.encode(data));
},
conversationId,
specificationId ? { id: specificationId } : undefined
);
// Close the stream when done
controller.close();
} catch (error) {
// Send error event
const errorData = `event: error\ndata: ${JSON.stringify({
error: error instanceof Error ? error.message : String(error)
})}\n\n`;
controller.enqueue(encoder.encode(errorData));
controller.close();
}
},
});
return new Response(stream, {
headers: {
'Content-Type': 'text/event-stream',
'Cache-Control': 'no-cache',
'Connection': 'keep-alive',
},
});
} catch (error) {
return new Response(
JSON.stringify({
error: error instanceof Error ? error.message : String(error)
}),
{
status: 500,
headers: { 'Content-Type': 'application/json' }
}
);
}
}
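Before building the UI, you can smoke-test the route with a short script (Node 18+, dev server running; the specification ID is a placeholder):

// smoke-test.mjs - prints the raw SSE frames from the endpoint
const res = await fetch('http://localhost:3000/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    prompt: 'Hello!',
    specificationId: 'your-specification-id', // replace with yours
  }),
});

const reader = res.body.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  process.stdout.write(decoder.decode(value)); // raw "event:" / "data:" frames
}

Run it with node smoke-test.mjs and you should see events scroll by as the response generates.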
Step 5: React Component (Client-Side)
Build a chat interface that connects to your streaming endpoint.
File: components/StreamingChat.tsx
'use client';
import { useState, useRef } from 'react';
import { fetchEventSource } from '@microsoft/fetch-event-source';
interface Message {
role: 'user' | 'assistant';
content: string;
isStreaming?: boolean;
tokens?: number;
throughput?: number;
}
export function StreamingChat({ specificationId }: { specificationId: string }) {
const [messages, setMessages] = useState<Message[]>([]);
const [input, setInput] = useState('');
const [isSubmitting, setIsSubmitting] = useState(false);
const [conversationId, setConversationId] = useState<string>();
const abortController = useRef<AbortController>();
const sendMessage = async () => {
if (!input.trim() || isSubmitting) return;
// Add user message
const userMessage: Message = { role: 'user', content: input };
setMessages((prev) => [...prev, userMessage]);
const userPrompt = input;
setInput('');
setIsSubmitting(true);
// Create abort controller for this request
// This lets us cancel the stream if user clicks "Stop"
abortController.current = new AbortController();
// Add placeholder for assistant message
// Index will be current length + 1 (we just added user message)
const assistantIndex = messages.length + 1;
setMessages((prev) => [
...prev,
{ role: 'assistant', content: '', isStreaming: true },
]);
try {
await fetchEventSource('/api/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
prompt: userPrompt,
conversationId,
specificationId,
}),
signal: abortController.current.signal,
onmessage(ev) {
if (!ev.data) return; // Ignore empty keep-alive messages
const event = JSON.parse(ev.data);
switch (event.type) {
case 'conversation_started':
// Save conversation ID for next messages
setConversationId(event.conversationId);
break;
case 'message_update':
// Update streaming message in real-time
setMessages((prev) => {
const updated = [...prev];
updated[assistantIndex] = {
role: 'assistant',
content: event.message.message,
isStreaming: true,
tokens: event.message.tokens,
throughput: event.message.throughput,
};
return updated;
});
break;
case 'message_completed':
// Mark message as complete
setMessages((prev) => {
const updated = [...prev];
updated[assistantIndex] = {
role: 'assistant',
content: event.message.message,
isStreaming: false,
tokens: event.message.tokens,
};
return updated;
});
setIsSubmitting(false);
break;
case 'error':
console.error('Stream error:', event.error);
// Show error in chat
setMessages((prev) => {
const updated = [...prev];
updated[assistantIndex] = {
role: 'assistant',
content: `Error: ${event.error}`,
isStreaming: false,
};
return updated;
});
setIsSubmitting(false);
break;
}
},
onerror(error) {
console.error('Connection error:', error);
setIsSubmitting(false);
throw error; // Stop retrying
},
});
} catch (error) {
console.error('Failed to send message:', error);
setIsSubmitting(false);
}
};
const stopStreaming = () => {
// Abort the ongoing fetch request, stopping the stream
abortController.current?.abort();
setIsSubmitting(false);
};
return (
<div className="flex flex-col h-screen max-w-2xl mx-auto p-4">
{/* Messages */}
<div className="flex-1 overflow-y-auto space-y-4 mb-4">
{messages.map((msg, idx) => (
<div
key={idx}
className={`p-4 rounded-lg ${
msg.role === 'user'
? 'bg-blue-100 ml-auto max-w-[80%]'
: 'bg-gray-100 mr-auto max-w-[80%]'
}`}
>
<div className="font-semibold text-sm mb-1">
{msg.role === 'user' ? 'You' : 'Assistant'}
</div>
<div className="whitespace-pre-wrap">{msg.content}</div>
{/* Show streaming metrics */}
{msg.isStreaming && msg.throughput && (
<div className="text-xs text-gray-500 mt-2">
⚡ Streaming at {msg.throughput.toFixed(0)} tokens/sec
</div>
)}
{!msg.isStreaming && msg.tokens && (
<div className="text-xs text-gray-500 mt-2">
📊 {msg.tokens} tokens
</div>
)}
</div>
))}
</div>
{/* Input */}
<div className="flex gap-2">
<input
type="text"
value={input}
onChange={(e) => setInput(e.target.value)}
onKeyDown={(e) => e.key === 'Enter' && !e.shiftKey && sendMessage()}
placeholder="Type a message..."
disabled={isSubmitting}
className="flex-1 p-3 border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-blue-500"
/>
<button
onClick={isSubmitting ? stopStreaming : sendMessage}
disabled={!input.trim() && !isSubmitting}
className="px-6 py-3 bg-blue-600 text-white rounded-lg hover:bg-blue-700 disabled:opacity-50 disabled:cursor-not-allowed font-medium"
>
{isSubmitting ? 'Stop' : 'Send'}
</button>
</div>
</div>
);
}
Using the Component
File: app/page.tsx
import { StreamingChat } from '@/components/StreamingChat';
export default function Home() {
// Replace with your specification ID
const specificationId = 'your-specification-id';
return (
<main>
<StreamingChat specificationId={specificationId} />
</main>
);
}
Step 6: Add Tool Calling (Optional)
Let the AI call functions during the conversation.
How Tool Calling Works
When you provide tools to streamAgent:
1. AI decides if it needs to call a tool based on the user's message
2. Graphlit pauses generation and fires the tool_call_started event
3. Your toolHandler executes - you call your API/function and return data
4. Graphlit feeds the result back to the AI
5. AI continues generating using the tool result
Example conversation flow:
User: "What's the weather in SF?"
AI: *thinks* → I need the get_weather tool
tool_call_started → { name: "get_weather", arguments: { location: "SF" } }
Your code executes → returns { temperature: 72, condition: "sunny" }
tool_call_completed
AI: *continues* → "The weather in San Francisco is 72°F and sunny!"
Important: The AI decides when to use tools. You can't force it. Make tool descriptions clear so the AI knows when to use them.
Define Tools
const tools: Types.ToolDefinition[] = [
{
name: 'get_weather',
description: 'Get current weather for a location',
parameters: {
type: 'object',
properties: {
location: {
type: 'string',
description: 'City name, e.g. "San Francisco"'
},
},
required: ['location'],
},
},
{
name: 'search_knowledge',
description: 'Search the user\'s knowledge base',
parameters: {
type: 'object',
properties: {
query: {
type: 'string',
description: 'Search query'
},
},
required: ['query'],
},
},
];
Implement Tool Handler
const toolHandler = async (toolCall: any) => {
console.log('🔧 Tool called:', toolCall.name, toolCall.arguments);
switch (toolCall.name) {
case 'get_weather':
const { location } = toolCall.arguments;
// Call your weather API here
return {
location,
temperature: 72,
condition: 'sunny',
humidity: 45,
};
case 'search_knowledge':
const { query } = toolCall.arguments;
// Search your database/API here
return {
results: [
{ title: 'Document 1', content: '...' },
{ title: 'Document 2', content: '...' },
],
};
default:
return { error: 'Unknown tool' };
}
};
Use Tools in streamAgent
await client.streamAgent(
"What's the weather in San Francisco?",
async (event) => {
switch (event.type) {
case 'tool_call_started':
console.log('🔧 Calling tool:', event.toolCall.name);
break;
case 'tool_call_completed':
console.log('✅ Tool result:', event.toolCall.result);
break;
case 'message_update':
console.log('💬 Response:', event.message.message);
break;
}
},
conversationId,
{ id: specificationId },
tools, // Pass tool definitions
toolHandler // Pass tool handler
);
Key Concepts
Conversation Continuity
Pass the same conversationId for multi-turn conversations:
// First message
await client.streamAgent("Hello", handler);
// conversationId is returned in 'conversation_started' event
// Second message (continues conversation)
await client.streamAgent("How are you?", handler, conversationId);
How it works: Graphlit stores all messages in the conversation. When you pass a conversationId, the full message history is automatically included in the context sent to the AI. The AI sees:
[Previous messages...]
User: "Hello"
Assistant: "Hi! How can I help?"
User: "How are you?"
This is why the AI can reference earlier parts of the conversation.
Token Metrics
Track token usage and streaming performance:
case 'message_update':
const { tokens, throughput, completionTime } = event.message;
console.log('Tokens generated:', tokens);
console.log('Speed:', throughput, 'tokens/sec');
console.log('Time elapsed:', completionTime, 'seconds');
break;
Context from Your Data
If you've indexed content in Graphlit (documents, emails, Slack, etc.), the AI automatically includes relevant context in responses. No additional configuration needed.
Error Handling
Always handle errors gracefully:
case 'error':
console.error('Stream error:', event.error);
// Show user-friendly messages based on error type
if (event.error.includes('401') || event.error.includes('Unauthorized')) {
alert('Authentication failed. Check your credentials.');
} else if (event.error.includes('credits') || event.error.includes('402')) {
alert('You\'ve reached your usage limit. Upgrade your plan.');
} else {
alert('Something went wrong. Please try again.');
}
break;
Use AbortController to cancel streaming:
const abortController = new AbortController();
// Start streaming
fetchEventSource('/api/chat', {
signal: abortController.signal,
// ...
});
// Cancel anytime
abortController.abort();
Complete Example Structure
Here's the complete project structure:
my-streaming-chat/
├── app/
│ ├── api/
│ │ └── chat/
│ │ └── route.ts # Streaming API endpoint
│ ├── page.tsx # Main page
│ └── layout.tsx
├── components/
│ └── StreamingChat.tsx # Chat UI component
├── .env.local # Environment variables
├── package.json
├── tsconfig.json
└── next.config.js
Environment Variables
.env.local:
GRAPHLIT_ORGANIZATION_ID=your_org_id
GRAPHLIT_ENVIRONMENT_ID=your_env_id
GRAPHLIT_JWT_SECRET=your_jwt_secret
Package.json
{
"dependencies": {
"next": "^15.0.0",
"react": "^18.0.0",
"graphlit-client": "^1.0.0",
"@microsoft/fetch-event-source": "^2.0.1"
}
}
Testing Your Implementation
1. Create a Specification
// Run this once to create your specification
const spec = await client.createSpecification({
type: Types.SpecificationTypes.Completion,
serviceType: Types.ModelServiceTypes.OpenAi,
name: 'Test Chat',
systemPrompt: 'You are a helpful assistant.',
openAI: {
model: Types.OpenAiModels.Gpt41Mini_1024K,
temperature: 0.5,
},
});
console.log('Specification ID:', spec.createSpecification?.id);
// Save this ID for your component
2. Run Your App
npm run dev
3. Test the Chat
Open http://localhost:3000 and try these prompts:
- "Hello, how are you?"
- "What's 2 + 2?"
- "Write me a haiku about coding"
Watch the response stream in real-time!
Advanced: Streaming with Context
If you've indexed documents in Graphlit, the AI automatically uses them as context:
// First, ingest some documents
await client.ingestUri('https://example.com/docs', {
workflow: { id: workflowId }
});
// Then chat - AI will reference your docs
await client.streamAgent(
"Summarize the main points from the documentation",
eventHandler,
conversationId,
{ id: specificationId }
);
The AI will automatically search your indexed content and include relevant excerpts in its response.
What You've Learned
In this tutorial, you built a complete streaming chat application:
✅ Created AI specifications for different models
✅ Managed conversations with context and history
✅ Implemented real-time streaming with streamAgent
✅ Built a Next.js API route with SSE
✅ Created a React chat component with streaming UI
✅ Added tool calling for function execution
✅ Handled errors and metrics properly
Next Steps
Now that you have streaming chat working, explore these features:
- Add file upload - Let users upload documents to discuss
- Implement conversation branching - Fork conversations at any point
- Build multi-modal chat - Support images and audio
- Add conversation search - Find past conversations
- Implement RAG - Search and cite sources in responses
Resources
- Graphlit Documentation - Complete API reference
- TypeScript SDK - NPM package
- Sample Projects - GitHub examples
- Discord Community - Get help from the team
Get Started
Ready to build your streaming chat?
- Sign up at portal.graphlit.dev
- Get your credentials (Organization ID, Environment ID, JWT Secret)
- Copy the code from this tutorial
- Start building
Questions? Join our Discord or check the docs.
Built something cool with Graphlit streaming? Share it on Twitter/X and tag @graphlit!