REST API Chat Integration
This guide covers everything you need to build a client that communicates with ByteBrew Engine over the REST API with SSE streaming.
SSE event types reference
When you send a message to POST /api/v1/schemas/{name}/chat, the engine responds with a stream of Server-Sent Events. Each event carries its type in the SSE event field and a JSON payload in the data field.
| Event | Data fields | Description |
|---|---|---|
| message_delta | content | Streaming token. A partial text chunk from the agent. Concatenate all message_delta events for the full response. |
| message | content, role | Complete message. Sent when a full message is available (non-streaming mode or final assembly). |
| thinking | content | Reasoning started. The agent is processing internally. Contains partial reasoning text (if the model supports it). |
| tool_call | tool, input | Tool execution started. Contains the tool name and the input parameters the agent provided. |
| tool_result | tool, output, error | Tool execution completed. Contains the tool output or error message. |
| structured_output | output_type, title, rows, actions, questions | Agent emitted structured data (table, info block, or form). Client renders the block; for form mode the user’s reply arrives as the next chat message. |
| confirmation | tool, input, call_id | Requires user approval. A tool with confirm_before is about to execute. Send approval via the confirmation endpoint. |
| done | session_id, tokens | Session completed. Contains the session ID for resuming and total token count. |
| error | message, code | Error occurred. The stream terminates after this event. |
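A client usually dispatches on the event name and folds each payload into some local state. The sketch below shows one way to do that for the events in the table. `createChatState` and `applyEvent` are hypothetical helper names, not part of ByteBrew. Treating a `message` event as replacing the accumulated deltas, and matching a `tool_result` to a call by tool name, are assumptions of this sketch.

```javascript
// Hypothetical helpers (not part of ByteBrew): fold SSE events into one state object.
function createChatState() {
  return {
    text: '',
    toolCalls: [],
    pendingConfirmation: null,
    sessionId: null,
    tokens: null,
    error: null,
    finished: false,
  };
}

function applyEvent(state, eventName, data) {
  switch (eventName) {
    case 'message_delta':
      // Deltas are partial chunks; concatenating them yields the full response.
      state.text += data.content;
      break;
    case 'message':
      // Assumption: a complete message supersedes any deltas collected so far.
      state.text = data.content;
      break;
    case 'tool_call':
      state.toolCalls.push({ tool: data.tool, input: data.input, output: null, error: null, done: false });
      break;
    case 'tool_result': {
      // Assumption: a result belongs to the oldest unfinished call of the same tool.
      const call = state.toolCalls.find((c) => c.tool === data.tool && !c.done);
      if (call) {
        call.output = data.output ?? null;
        call.error = data.error ?? null;
        call.done = true;
      }
      break;
    }
    case 'confirmation':
      state.pendingConfirmation = { tool: data.tool, input: data.input, callId: data.call_id };
      break;
    case 'done':
      state.sessionId = data.session_id;
      state.tokens = data.tokens;
      state.finished = true;
      break;
    case 'error':
      // The stream terminates after an error event.
      state.error = { message: data.message, code: data.code };
      state.finished = true;
      break;
    default:
      // thinking, structured_output, and unknown events are ignored in this sketch.
      break;
  }
  return state;
}
```

Keeping the reducer free of network code makes it easy to unit-test against recorded streams.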
Example: full event stream
    event: thinking
    data: {"content":"Let me search for that information..."}

    event: tool_call
    data: {"tool":"search_products","input":{"query":"laptops under 1000"}}

    event: tool_result
    data: {"tool":"search_products","output":"[{\"name\":\"ProBook 450\",\"price\":849}]"}

    event: message_delta
    data: {"content":"I found "}

    event: message_delta
    data: {"content":"several options for you:\n\n"}

    event: message_delta
    data: {"content":"1. **ProBook 450** - $849"}

    event: done
    data: {"session_id":"sess_abc123","tokens":156}

Handling confirmation events
When a tool has confirm_before configured, the stream pauses with a confirmation event:

    event: confirmation
    data: {"tool":"create_order","input":{"customer_id":"cust_123","items":"ProBook 450 x1"},"call_id":"conf_xyz"}

To approve or reject:

    # Approve
    curl -X POST http://localhost:8443/api/v1/sessions/{session_id}/respond \
      -H "Authorization: Bearer bb_your_token" \
      -H "Content-Type: application/json" \
      -d '{"call_id": "conf_xyz", "answers": ["approve"]}'

    # Reject
    curl -X POST http://localhost:8443/api/v1/sessions/{session_id}/respond \
      -H "Authorization: Bearer bb_your_token" \
      -H "Content-Type: application/json" \
      -d '{"call_id": "conf_xyz", "answers": ["reject: Customer changed their mind"]}'

Handling structured output events
When an agent calls show_structured_output, the engine emits a structured_output SSE event. The event is non-blocking: the agent’s turn ends immediately after emitting it.

    event: structured_output
    data: {"output_type":"summary_table","title":"Project Summary","rows":[{"label":"Name","value":"MyApp"},{"label":"Status","value":"Active"},{"label":"Users","value":"1,234"}],"actions":[{"label":"Deploy","type":"primary","value":"deploy"},{"label":"Cancel","type":"secondary","value":"cancel"}]}

| Field | Description |
|---|---|
| output_type | Type of structured output: summary_table, info, or form. |
| title | Optional title for the output block. |
| description | Optional description text. |
| rows | Array of {label, value} pairs for table display (summary_table mode). |
| actions | Array of {label, type, value} action buttons (type: primary or secondary). |
| questions | Array of input question objects in form mode (see below). |
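As an illustration of how a client might consume these fields, the sketch below turns a display-only block into plain text lines. `renderStructuredOutput` is a hypothetical helper, and the text layout (label, colon, value, then bracketed button labels) is an arbitrary choice for this example; a real UI would render native widgets instead.

```javascript
// Hypothetical helper: render a structured_output payload as plain text.
// Every field except output_type is treated as optional.
function renderStructuredOutput(block) {
  const lines = [];
  if (block.title) lines.push(block.title);
  if (block.description) lines.push(block.description);
  for (const row of block.rows ?? []) {
    lines.push(`${row.label}: ${row.value}`);
  }
  // Show each action button as its label in square brackets.
  const buttons = (block.actions ?? []).map((action) => `[${action.label}]`);
  if (buttons.length > 0) lines.push(buttons.join(' '));
  return lines.join('\n');
}
```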
Form mode: collecting user input
In form mode the agent emits a structured form and its turn ends. The client renders the form and the user’s answers arrive as the next chat message; no separate respond endpoint is needed.

    event: structured_output
    data: {"output_type":"form","title":"Leave request","questions":[{"id":"leave_type","label":"What type of leave?","type":"select","options":[{"label":"Vacation","value":"vacation"},{"label":"Sick","value":"sick"}]},{"id":"dates","label":"What dates? (start – end)","type":"text"}]}

Each question object:

| Field | Required | Description |
|---|---|---|
| id | Yes | Stable identifier returned with the answer. |
| label | Yes | Question text shown to the user. |
| type | Yes | text, select, or multiselect. |
| options | For select/multiselect | Array of 2–5 options with label (and optional value). |
| default | No | Default value pre-filled for the user. |
The client submits answers as the next user message. The agent receives the answers in the next turn and continues processing.
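The sketch below shows one possible way to validate form answers and package them as the next chat message. The wire format for answers is not specified in this guide, so serializing a JSON object keyed by question id into the message field is purely an assumption here, as is the helper name `buildFormReply`. An option's value is taken to fall back to its label when no value is given.

```javascript
// Hypothetical helper: validate answers against the form's questions and
// build the body of the follow-up chat request.
// Assumption: answers are sent as a JSON object keyed by question id.
function buildFormReply(questions, answers, sessionId) {
  const reply = {};
  for (const q of questions) {
    // Fall back to the question's default when the user gave no answer.
    const answer = answers[q.id] ?? q.default;
    if (answer === undefined || answer === '') {
      throw new Error(`Missing answer for question "${q.id}"`);
    }
    if (q.type === 'select' || q.type === 'multiselect') {
      const allowed = new Set(q.options.map((o) => o.value ?? o.label));
      const chosen = Array.isArray(answer) ? answer : [answer];
      for (const choice of chosen) {
        if (!allowed.has(choice)) {
          throw new Error(`Invalid option "${choice}" for question "${q.id}"`);
        }
      }
    }
    reply[q.id] = answer;
  }
  return { message: JSON.stringify(reply), session_id: sessionId };
}
```

The returned object can be used as the JSON body of the next chat request for the same session.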
Display-only output
For summary_table and info modes, the client renders the block but does not need to respond. If action buttons are present, clicking one sends the button’s value back as a regular chat message.
Session management
Creating a new session
Omit session_id to start a new conversation:

    curl -N http://localhost:8443/api/v1/schemas/{schema_id}/chat \
      -H "Authorization: Bearer bb_your_token" \
      -H "Content-Type: application/json" \
      -d '{"message": "Hello, I need help with my order"}'

The done event returns a session_id. Save it for continuations.
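In a JavaScript client, the same request can be assembled by a small helper that includes session_id only when one has been saved from an earlier done event. This is a minimal sketch; `buildChatRequest` and its parameter names are hypothetical, and the base URL and token are placeholders.

```javascript
// Hypothetical helper: build the URL and fetch options for a chat request.
// session_id is left out entirely when no session has been saved yet.
function buildChatRequest(baseUrl, schemaId, token, message, sessionId) {
  const body = { message };
  if (sessionId) body.session_id = sessionId;
  return {
    url: `${baseUrl}/api/v1/schemas/${encodeURIComponent(schemaId)}/chat`,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}
```

The result is meant to be passed straight to fetch as `fetch(request.url, request.options)`.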
Resuming a session
Pass session_id to continue a conversation with full history:

    curl -N http://localhost:8443/api/v1/schemas/{schema_id}/chat \
      -H "Authorization: Bearer bb_your_token" \
      -H "Content-Type: application/json" \
      -d '{"message": "Can you check order #12345?", "session_id": "sess_abc123"}'

Listing sessions

    curl "http://localhost:8443/api/v1/sessions?agent=my-agent&limit=20" \
      -H "Authorization: Bearer bb_your_token"

Deleting a session

    curl -X DELETE http://localhost:8443/api/v1/sessions/sess_abc123 \
      -H "Authorization: Bearer bb_your_token"

Non-streaming mode
For clients that cannot handle SSE, set stream: false in the request body:

    curl http://localhost:8443/api/v1/schemas/{schema_id}/chat \
      -H "Authorization: Bearer bb_your_token" \
      -H "Content-Type: application/json" \
      -d '{"message": "Hello", "stream": false}'

Response (standard JSON, not SSE):

    {
      "response": "Hello! How can I help you today?",
      "session_id": "sess_abc123",
      "tokens": 42,
      "tool_calls": []
    }

Authentication
Unless a section states otherwise (the model registry and /metrics endpoints need no authentication), endpoints require a Bearer token in the Authorization header:

    Authorization: Bearer bb_your_api_token

Tokens are created in Admin Dashboard -> API Keys. Each token has scopes that limit what it can access. For chat integrations, the chat scope is sufficient.
See API Reference: Authentication for details on scopes and token management.
Error handling
HTTP errors
| Status | Meaning |
|---|---|
| 400 | Bad request. Invalid JSON or missing required fields. |
| 401 | Unauthorized. Missing or invalid API token. |
| 403 | Forbidden. Token lacks the required scope. |
| 404 | Agent not found. Check the agent name in the URL. |
| 429 | Rate limited. Too many requests. Retry after the Retry-After header value. |
| 500 | Internal server error. Check engine logs. |
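One way to act on this table in client code is to map each status to a coarse category before deciding what to do next. The sketch below is an illustration only: `classifyStatus` and its category names are hypothetical, and it treats 429 and 500 as the retryable cases.

```javascript
// Hypothetical helper: map an HTTP status from the table above to a client action.
function classifyStatus(status) {
  if (status >= 200 && status < 300) return 'ok';
  if (status === 429 || status === 500) return 'retry'; // transient, back off and retry
  if (status === 401 || status === 403) return 'auth'; // fix the token or its scopes
  if (status === 400 || status === 404) return 'request'; // fix the request itself
  return 'unknown';
}
```

Only the 'retry' category is worth repeating unchanged; the others need a different token or a corrected request first.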
SSE error events
Errors during streaming are sent as error events:

    event: error
    data: {"message":"Model returned an error: context length exceeded","code":"model_error"}

The stream closes after an error event. Your client should reconnect or show the error to the user.
Retry strategy
For transient errors (429, 500), implement exponential backoff:

    // sendMessage is your own request function; it is expected to throw on failure.
    async function chatWithRetry(message, maxRetries = 3) {
      for (let attempt = 0; attempt < maxRetries; attempt++) {
        try {
          return await sendMessage(message);
        } catch (error) {
          if (attempt === maxRetries - 1) throw error; // out of attempts
          const delay = Math.pow(2, attempt) * 1000; // 1s, 2s, 4s, ...
          await new Promise(resolve => setTimeout(resolve, delay));
        }
      }
    }

JavaScript SSE client example
Do not use EventSource: it only supports GET requests. Use fetch + ReadableStream for POST-based SSE:

    // Replace {schema_id} with your schema ID.
    const response = await fetch('http://localhost:8443/api/v1/schemas/{schema_id}/chat', {
      method: 'POST',
      headers: {
        'Authorization': 'Bearer bb_your_token',
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ message: 'Hello', session_id: null }),
    });
    if (!response.ok) throw new Error(`Chat request failed: HTTP ${response.status}`);

    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let buffer = '';
    // Declared outside the read loop so the event name survives a chunk
    // boundary that falls between the event: and data: lines.
    let currentEvent = '';

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop() || ''; // keep the trailing partial line for the next chunk
      for (const line of lines) {
        if (line === '') { currentEvent = ''; continue; } // a blank line ends an event
        if (line.startsWith('event: ')) currentEvent = line.slice(7);
        if (line.startsWith('data: ')) {
          const data = JSON.parse(line.slice(6));
          if (currentEvent === 'message_delta') console.log(data.content);
          if (currentEvent === 'done') console.log('Session:', data.session_id);
        }
      }
    }

Rate limiting
The engine enforces rate limits per API token:
- Default: 60 requests per minute per token.
- Configurable in engine settings.
- Rate-limited responses return HTTP 429 with a Retry-After header.
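When a 429 arrives, the wait time can be derived from the Retry-After header instead of guessing. In HTTP generally, Retry-After may hold either a number of seconds or an HTTP date; which form the engine sends is not stated here, so this sketch accepts both. `retryAfterMs` and its fallback value are hypothetical.

```javascript
// Hypothetical helper: convert a Retry-After header value into milliseconds.
// Accepts either delta-seconds ("30") or an HTTP date; falls back when absent or unparsable.
function retryAfterMs(headerValue, nowMs = Date.now(), fallbackMs = 1000) {
  if (headerValue == null || headerValue.trim() === '') return fallbackMs;
  const value = headerValue.trim();
  if (/^\d+$/.test(value)) return Number(value) * 1000;
  const dateMs = Date.parse(value);
  if (Number.isNaN(dateMs)) return fallbackMs;
  // A date in the past means the client may retry immediately.
  return Math.max(0, dateMs - nowMs);
}
```

Passing the current time as a parameter keeps the function deterministic in tests.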
Rate limit headers
Every API response includes rate limit headers when configurable rate limiting is enabled (EE):

| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum number of requests allowed in the current window. |
| X-RateLimit-Remaining | Number of requests remaining in the current window. |
| X-RateLimit-Reset | Unix timestamp (seconds) when the current window resets. |

    HTTP/1.1 200 OK
    X-RateLimit-Limit: 500
    X-RateLimit-Remaining: 487
    X-RateLimit-Reset: 1711929600

See Configuration: Rate Limits for setup.
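A client can read these headers to throttle itself before it hits a 429. The sketch below is one way to do so; `readRateLimit` is a hypothetical helper that accepts any object with a `get(name)` method, such as the `headers` of a fetch response, and returns null when the headers are absent (for example, when configurable rate limiting is not enabled).

```javascript
// Hypothetical helper: read the X-RateLimit-* headers into a plain object.
// `headers` can be a fetch Headers instance or anything else with get(name).
function readRateLimit(headers, nowMs = Date.now()) {
  const limit = headers.get('X-RateLimit-Limit');
  const remaining = headers.get('X-RateLimit-Remaining');
  const reset = headers.get('X-RateLimit-Reset');
  if (limit == null || remaining == null || reset == null) return null;
  const resetMs = Number(reset) * 1000; // header is a Unix timestamp in seconds
  return {
    limit: Number(limit),
    remaining: Number(remaining),
    resetAt: new Date(resetMs),
    msUntilReset: Math.max(0, resetMs - nowMs),
    exhausted: Number(remaining) <= 0,
  };
}
```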
Additional API endpoints
Tool call audit log (EE)
Query tool call history for auditing and debugging. Requires admin scope.

    curl "http://localhost:8443/api/v1/audit/tool-calls?agent=sales-agent&tool=create_order&page=1&per_page=20" \
      -H "Authorization: Bearer bb_your_token"

Query parameters
| Parameter | Description |
|---|---|
| session_id | Filter by session ID. |
| agent | Filter by agent name. |
| tool | Filter by tool name. |
| status | Filter by status: completed or failed. |
| user_id | Filter by user ID. |
| from | Start date (RFC3339 or YYYY-MM-DD). |
| to | End date (RFC3339 or YYYY-MM-DD). |
| page | Page number (default: 1). |
| per_page | Results per page (default: 50, max: 100). |
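The query string for this endpoint can be built from a filter object so that values are URL-encoded and unknown keys are dropped. This is a minimal sketch; `buildAuditUrl` and `AUDIT_PARAMS` are hypothetical names, and clamping per_page to the documented maximum on the client side is a design choice of the example rather than documented engine behavior.

```javascript
// Parameter names taken from the table above.
const AUDIT_PARAMS = ['session_id', 'agent', 'tool', 'status', 'user_id', 'from', 'to', 'page', 'per_page'];

// Hypothetical helper: build the audit log URL from a filter object.
function buildAuditUrl(baseUrl, filters = {}) {
  const url = new URL('/api/v1/audit/tool-calls', baseUrl);
  for (const key of AUDIT_PARAMS) {
    let value = filters[key];
    if (value === undefined || value === null || value === '') continue;
    // Keep per_page within 1..100, the documented maximum.
    if (key === 'per_page') value = Math.min(100, Math.max(1, Number(value)));
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}
```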
Response
    {
      "data": [
        {
          "id": 42,
          "session_id": "sess_abc123",
          "agent_name": "sales-agent",
          "tool_name": "create_order",
          "input": "{\"customer_id\":\"cust_123\"}",
          "output": "{\"order_id\":\"ord_456\"}",
          "status": "completed",
          "duration_ms": 340,
          "user_id": "user_789",
          "created_at": "2026-03-20T14:30:00Z"
        }
      ],
      "total": 156,
      "page": 1,
      "per_page": 20,
      "total_pages": 8
    }

Model registry
Browse the built-in catalog of known models and providers. No authentication required.

    # List all models
    curl http://localhost:8443/api/v1/models/registry

    # Filter by provider
    curl "http://localhost:8443/api/v1/models/registry?provider=anthropic"

    # Filter by tier
    curl "http://localhost:8443/api/v1/models/registry?tier=1"

    # Filter by tool support
    curl "http://localhost:8443/api/v1/models/registry?supports_tools=true"

    # List all providers
    curl http://localhost:8443/api/v1/models/registry/providers

See Model Registry for full details.
Rate limit usage (EE)
Check current rate limit usage for a specific key. Requires admin scope.

    curl "http://localhost:8443/api/v1/rate-limits/usage?key_header=X-Org-Id&key_value=org-123" \
      -H "Authorization: Bearer bb_your_token"

Response:

    {
      "rule": "per-org",
      "key": "org-123",
      "tier": "pro",
      "used": 42,
      "limit": 500,
      "window": "24h0m0s",
      "resets_at": "2026-03-25T00:00:00Z"
    }

Prometheus metrics (EE)
The engine exposes Prometheus-compatible metrics at /metrics. No authentication required.

    curl http://localhost:8443/metrics

See Production: Prometheus Metrics for available metrics and Kubernetes integration.