API Reference
Brain API Endpoints
Endpoint Discovery
The API is self-documenting. Query the following endpoints to discover available functionality:
GET /v1/info — System capabilities and complete endpoint list
curl http://localhost:5000/api/info | jq
Returns:
Service version and Python version
Feature flags (RAG, specialists, streaming, etc.)
Active models (LLM and embedding)
Complete list of all available endpoints
Documentation URL
GET /v1/specialists — Available specialist capabilities
curl http://localhost:5000/api/specialists | jq
Returns specialist names, descriptions, schemas, priorities, and enabled status.
GET /v1/schema — Data model schemas
curl http://localhost:5000/api/schema | jq
Returns JSON Schema definitions for all document types (persona, faq, memory, turn, summary).
See Data Model Reference for complete schema documentation.
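As a rough Python equivalent of the curl commands above, the sketch below builds the discovery URLs and fetches them with the standard library. The helper names `discovery_url` and `fetch_json` are hypothetical. The base URL `http://localhost:5000/api` is copied from the curl examples; pass a different `base_url` if your deployment serves these routes under `/v1`, as the endpoint headings suggest.

```python
import json
from urllib.request import urlopen

# Assumption: base URL copied from the curl examples above.
BASE_URL = "http://localhost:5000/api"

# The three discovery endpoints described in this section.
DISCOVERY_ENDPOINTS = ("info", "specialists", "schema")


def discovery_url(name, base_url=BASE_URL):
    """Build the URL for one discovery endpoint (hypothetical helper)."""
    if name not in DISCOVERY_ENDPOINTS:
        raise ValueError(f"unknown discovery endpoint: {name!r}")
    return f"{base_url.rstrip('/')}/{name}"


def fetch_json(url, timeout=10.0):
    """Send a GET request and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With the service running, `fetch_json(discovery_url("info"))` should return the same document that the first curl command prints.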
Key Endpoints
The following endpoints are commonly used. For the complete list, query GET /v1/info.
For specialist system endpoints, see Specialist System. For data model schemas, see Data Model Reference.
Chat & Streaming
POST /v1/chat/stream — Streaming chat via Server-Sent Events
See Streaming for detailed SSE event structure.
Memory Management
GET /v1/memory — Search and list memories
POST /v1/memory — Create new memory
DELETE /v1/memory/{id} — Delete memory
See Memory for memory management patterns.
Health & Diagnostics
GET /v1/healthz — Service health and dependency status
GET /v1/debug/rag — RAG system diagnostics (when RAG_DEBUG=true)
GET /v1/debug/prompt — Inspect prompt construction
See Configuration Reference for debug configuration and Testing Guide for testing strategies.
Request/Response Models
Chat Request
{
  "prompt": "Your question here",
  "conversation_id": "uuid-optional",
  "include_thinking": false,
  "entity": "optional-topic",
  "save_memory": false,
  "memory_text": "optional-custom-text",
  "turns_k": 3,
  "faq_k": 3,
  "memory_k": 5
}
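A client might assemble this body with a small helper like the one below. This is only a sketch: `build_chat_request` is a hypothetical name, the default values mirror the example above, and whether the server applies the same defaults when a field is omitted is an assumption.

```python
import json

# Defaults mirror the example request; server-side defaults are an assumption.
CHAT_DEFAULTS = {
    "include_thinking": False,
    "save_memory": False,
    "turns_k": 3,
    "faq_k": 3,
    "memory_k": 5,
}
OPTIONAL_TEXT_FIELDS = ("conversation_id", "entity", "memory_text")


def build_chat_request(prompt, **options):
    """Return a JSON-serializable chat request body (hypothetical helper)."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    unknown = set(options) - set(CHAT_DEFAULTS) - set(OPTIONAL_TEXT_FIELDS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    body = {"prompt": prompt, **CHAT_DEFAULTS}
    # Only include options the caller actually set.
    for key, value in options.items():
        if value is not None:
            body[key] = value
    # The *_k fields are retrieval counts, so reject non-integers and negatives.
    for key in ("turns_k", "faq_k", "memory_k"):
        value = body[key]
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{key} must be a non-negative integer")
    return body
```

Calling `json.dumps(build_chat_request("Your question here", entity="optional-topic"))` yields a body ready to POST. Validating the prompt locally avoids a round trip that would otherwise end in a 400 response.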
Chat Response
{
  "response": "Assistant response text",
  "thinking": "Reasoning (if enabled)",
  "conversation_id": "uuid-here",
  "used_context": {
    "persona": {"included": true},
    "faqs": ["faq text snippets"],
    "turns": ["previous exchanges"],
    "memories": ["relevant memories"],
    "summaries": ["conversation summaries"],
    "entity": null
  },
  "user_timestamp": "2025-12-13T10:30:45.123456+00:00",
  "assistant_timestamp": "2025-12-13T10:30:48.456789+00:00",
  "request_id": "abc12345"
}
Stream Events (SSE)
Token Event:
{
  "type": "token",
  "content": "Hello "
}
Thinking Event:
{
  "type": "thinking",
  "content": "The user is asking..."
}
Done Event:
{
  "type": "done",
  "conversation_id": "uuid",
  "used_context": [],
  "user_timestamp": "2024-01-01T00:00:00Z",
  "assistant_timestamp": "2024-01-01T00:00:01Z",
  "request_id": "req-123"
}
Error Event:
{
  "type": "error",
  "error": "Error message"
}
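The four event types above can be folded into a single result on the client side. The sketch below assumes standard SSE framing in which each event arrives as one `data: {...}` line; that framing is not stated in this section, so check it against the Streaming documentation. The function names `parse_sse_lines` and `collect_stream` are hypothetical.

```python
import json


def parse_sse_lines(lines):
    """Yield decoded JSON events from an iterable of SSE text lines.

    Assumption: each event is a single `data: {...}` line. Blank
    separators, comments, and other SSE fields are skipped.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


def collect_stream(lines):
    """Accumulate token and thinking text until a done or error event."""
    tokens, thinking, done = [], [], None
    for event in parse_sse_lines(lines):
        kind = event.get("type")
        if kind == "token":
            tokens.append(event["content"])
        elif kind == "thinking":
            thinking.append(event["content"])
        elif kind == "done":
            done = event  # carries conversation_id, timestamps, request_id
            break
        elif kind == "error":
            raise RuntimeError(event.get("error", "unknown stream error"))
    return {
        "response": "".join(tokens),
        "thinking": "".join(thinking),
        "done": done,
    }
```

Any iterable of decoded text lines works as input, for example the lines of a streamed HTTP response body from `POST /v1/chat/stream`.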
Memory Request
{
  "text": "Memory content to store",
  "importance": 3,
  "scope": "global",
  "entity": "optional-topic"
}
Memory Response
{
  "id": "memory-uuid",
  "text": "Memory content",
  "meta": {
    "importance": 3,
    "timestamp": "2025-12-13T10:30:45.123456+00:00",
    "source": "chat",
    "scope": "global",
    "entity": null
  }
}
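On the client side, a memory response can be flattened into a typed record. The sketch below is illustrative only: the `Memory` dataclass and `parse_memory` function are hypothetical names, and it assumes every field shown in the example response is always present.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Memory:
    """Flattened view of a memory response (hypothetical client type)."""
    id: str
    text: str
    importance: int
    timestamp: datetime
    source: str
    scope: str
    entity: Optional[str] = None


def parse_memory(payload):
    """Convert a decoded memory response dict into a Memory record."""
    meta = payload["meta"]
    return Memory(
        id=payload["id"],
        text=payload["text"],
        importance=meta["importance"],
        # The example timestamp is ISO 8601 with an explicit UTC offset.
        timestamp=datetime.fromisoformat(meta["timestamp"]),
        source=meta["source"],
        scope=meta["scope"],
        entity=meta.get("entity"),
    )
```

Parsing the timestamp into a timezone-aware `datetime` keeps later comparisons and sorting unambiguous.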
Status Codes
- 200 OK
Request succeeded. Response in body.
- 201 Created
Resource created successfully.
- 400 Bad Request
Invalid parameters or missing required fields.
- 404 Not Found
Endpoint not found or feature disabled.
- 500 Internal Server Error
Server error (Ollama down, database error, etc.).
- 503 Service Unavailable
Critical dependency unavailable (RAG, database).
HTTP Status Code Reference
| Code | Meaning | Common Causes |
|---|---|---|
| 200 | Success | Valid request processed |
| 201 | Created | Memory created successfully |
| 400 | Bad Request | Missing prompt, invalid JSON |
| 404 | Not Found | Debug disabled (RAG_DEBUG=false) |
| 500 | Server Error | Ollama timeout, DB error |
| 503 | Unavailable | RAG disabled, Chroma down |
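
One way a client might act on these codes is sketched below. The function name and the category labels are hypothetical. Treating 503 as worth retrying and 500 as a failure to report is an assumption about sensible client policy, not documented server behavior.

```python
def classify_status(code):
    """Map an HTTP status code from this API to a client-side action label."""
    if code in (200, 201):
        return "success"
    if code in (400, 404):
        # Bad input, unknown route, or a disabled feature: retrying will not help.
        return "fix_request"
    if code == 503:
        # A dependency is unavailable; assumed to be worth retrying later.
        return "retry_later"
    if code == 500:
        return "server_error"
    return "unexpected"
```

For example, a 404 from `GET /v1/debug/rag` maps to `fix_request`, which matches the table: the likely cause is that `RAG_DEBUG` is not enabled, so the fix is a configuration change and not a retry.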