API Reference

Brain API Endpoints

Endpoint Discovery

The API is self-documenting. Query these endpoints to discover the available functionality:

GET /v1/info — System capabilities and complete endpoint list

curl http://localhost:5000/v1/info | jq

Returns:

  • Service version and Python version

  • Feature flags (RAG, specialists, streaming, etc.)

  • Active models (LLM and embedding)

  • Complete list of all available endpoints

  • Documentation URL
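The same discovery call can be made from a script. The following is a minimal sketch using only the Python standard library. The host and port come from the curl example and the path follows the endpoint heading (GET /v1/info); `BASE_URL`, `build_url`, and `fetch_json` are illustrative names, not part of the API. The payload is returned as a plain dictionary because its field names are not listed in this section.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: host and port from the curl example


def build_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def fetch_json(path: str, timeout: float = 10.0) -> dict:
    """GET an endpoint and decode its JSON body (requires a running service)."""
    with urllib.request.urlopen(build_url(path), timeout=timeout) as resp:
        return json.load(resp)
```

Against a running service, `fetch_json("/v1/info")` should return the decoded payload; printing `sorted(info.keys())` is a quick way to see which top-level fields your version exposes.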

GET /v1/specialists — Available specialist capabilities

curl http://localhost:5000/v1/specialists | jq

Returns specialist names, descriptions, schemas, priorities, and enabled status.

GET /v1/schema — Data model schemas

curl http://localhost:5000/v1/schema | jq

Returns JSON Schema definitions for all document types (persona, faq, memory, turn, summary).

See Data Model Reference for complete schema documentation.

Key Endpoints

The following endpoints are commonly used. For the complete list, query GET /v1/info.

For specialist system endpoints, see Specialist System. For data model schemas, see Data Model Reference.

Chat & Streaming

POST /v1/chat/stream — Streaming chat via Server-Sent Events

See Streaming for detailed SSE event structure.

Memory Management

GET /v1/memory — Search and list memories

POST /v1/memory — Create new memory

DELETE /v1/memory/{id} — Delete memory

See Memory for memory management patterns.

Health & Diagnostics

GET /v1/healthz — Service health and dependency status

GET /v1/debug/rag — RAG system diagnostics (when RAG_DEBUG=true)

GET /v1/debug/prompt — Inspect prompt construction

See Configuration Reference for debug configuration and Testing Guide for testing strategies.

Request/Response Models

Chat Request

{
  "prompt": "Your question here",
  "conversation_id": "uuid-optional",
  "include_thinking": false,
  "entity": "optional-topic",
  "save_memory": false,
  "memory_text": "optional-custom-text",
  "turns_k": 3,
  "faq_k": 3,
  "memory_k": 5
}
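A request body like the one above can be assembled programmatically. This sketch covers a subset of the fields; `build_chat_request` is a hypothetical helper, and its default values simply mirror the example (they are not confirmed server defaults). Treating `prompt` as the only required field is an assumption based on the status code reference, which lists a missing prompt as a cause of 400.

```python
from typing import Any, Dict, Optional


def build_chat_request(
    prompt: str,
    conversation_id: Optional[str] = None,
    include_thinking: bool = False,
    turns_k: int = 3,
    faq_k: int = 3,
    memory_k: int = 5,
) -> Dict[str, Any]:
    """Build a Chat Request body, leaving out optional fields that were not set."""
    if not prompt.strip():
        # A missing prompt is listed as a cause of 400 Bad Request.
        raise ValueError("prompt must not be empty")
    for name, value in (("turns_k", turns_k), ("faq_k", faq_k), ("memory_k", memory_k)):
        if value < 0:
            # Assumption: negative retrieval counts are never meaningful.
            raise ValueError(f"{name} must be non-negative")
    body: Dict[str, Any] = {
        "prompt": prompt,
        "include_thinking": include_thinking,
        "turns_k": turns_k,
        "faq_k": faq_k,
        "memory_k": memory_k,
    }
    if conversation_id is not None:
        body["conversation_id"] = conversation_id
    return body
```

Validating on the client side catches the most common 400 cause before a request is sent; serialize the result with `json.dumps` when posting it.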

Chat Response

{
  "response": "Assistant response text",
  "thinking": "Reasoning (if enabled)",
  "conversation_id": "uuid-here",
  "used_context": {
    "persona": {"included": true},
    "faqs": ["faq text snippets"],
    "turns": ["previous exchanges"],
    "memories": ["relevant memories"],
    "summaries": ["conversation summaries"],
    "entity": null
  },
  "user_timestamp": "2025-12-13T10:30:45.123456+00:00",
  "assistant_timestamp": "2025-12-13T10:30:48.456789+00:00",
  "request_id": "abc12345"
}
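The two timestamps make it easy to measure how long a reply took, and `used_context` shows how much retrieved material went into it. The helpers below are a sketch with hypothetical names (`parse_timestamp`, `response_latency_seconds`, `used_context_counts`); they assume the field layout shown in the example response.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' as UTC."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def response_latency_seconds(chat_response: dict) -> float:
    """Seconds between the user timestamp and the assistant timestamp."""
    user_ts = parse_timestamp(chat_response["user_timestamp"])
    assistant_ts = parse_timestamp(chat_response["assistant_timestamp"])
    return (assistant_ts - user_ts).total_seconds()


def used_context_counts(chat_response: dict) -> dict:
    """Count the items in each list-valued entry of used_context."""
    context = chat_response.get("used_context", {})
    return {key: len(value) for key, value in context.items() if isinstance(value, list)}
```

For the example response above, the latency works out to roughly 3.33 seconds, and each list-valued context entry has a count of 1.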

Stream Events (SSE)

Token Event:

{
  "type": "token",
  "content": "Hello "
}

Thinking Event:

{
  "type": "thinking",
  "content": "The user is asking..."
}

Done Event:

{
  "type": "done",
  "conversation_id": "uuid",
  "used_context": [],
  "user_timestamp": "2024-01-01T00:00:00Z",
  "assistant_timestamp": "2024-01-01T00:00:01Z",
  "request_id": "req-123"
}

Error Event:

{
  "type": "error",
  "error": "Error message"
}
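A client typically reads the stream line by line, decodes each event, and dispatches on its `type`. The sketch below assumes each event arrives as standard SSE `data:` lines carrying one of the JSON objects above, terminated by a blank line; that framing is an assumption, so check the Streaming page for the exact wire format. `iter_sse_events` and `collect_stream` are hypothetical helper names.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per SSE event (events end at a blank line)."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # SSE drops a single leading space after the colon
            data_parts.append(value)
        elif line == "" and data_parts:
            yield json.loads("\n".join(data_parts))
            data_parts = []


def collect_stream(events: Iterable[dict]) -> dict:
    """Accumulate token and thinking text until a done event; raise on an error event."""
    tokens, thinking, done = [], [], None
    for event in events:
        kind = event.get("type")
        if kind == "token":
            tokens.append(event["content"])
        elif kind == "thinking":
            thinking.append(event["content"])
        elif kind == "done":
            done = event
            break
        elif kind == "error":
            raise RuntimeError(event.get("error", "unknown stream error"))
    return {"response": "".join(tokens), "thinking": "".join(thinking), "done": done}
```

Any iterable of text lines works as input, for example the decoded lines of an HTTP response body from POST /v1/chat/stream.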

Memory Request

{
  "text": "Memory content to store",
  "importance": 3,
  "scope": "global",
  "entity": "optional-topic"
}

Memory Response

{
  "id": "memory-uuid",
  "text": "Memory content",
  "meta": {
    "importance": 3,
    "timestamp": "2025-12-13T10:30:45.123456+00:00",
    "source": "chat",
    "scope": "global",
    "entity": null
  }
}
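The create and delete calls can be prepared with the standard library alone. This sketch only builds the request objects and does not send them. `BASE_URL` reuses the host and port from the curl examples, the `Content-Type: application/json` header is an assumption, and both function names are hypothetical.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import quote

BASE_URL = "http://localhost:5000"  # assumption: host and port from the curl examples


def create_memory_request(
    text: str,
    importance: int = 3,
    scope: str = "global",
    entity: Optional[str] = None,
) -> urllib.request.Request:
    """Prepare POST /v1/memory with a Memory Request body."""
    body = {"text": text, "importance": importance, "scope": scope}
    if entity is not None:
        body["entity"] = entity
    return urllib.request.Request(
        BASE_URL + "/v1/memory",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumption: JSON body header
        method="POST",
    )


def delete_memory_request(memory_id: str) -> urllib.request.Request:
    """Prepare DELETE /v1/memory/{id}, percent-encoding the id."""
    url = BASE_URL + "/v1/memory/" + quote(memory_id, safe="")
    return urllib.request.Request(url, method="DELETE")
```

Pass either object to `urllib.request.urlopen` to send it; the `id` field of the Memory Response is the value to hand to `delete_memory_request` later.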

Status Codes

200 OK

Request succeeded. Response in body.

201 Created

Resource created successfully.

400 Bad Request

Invalid parameters or missing required fields.

404 Not Found

Endpoint not found or feature disabled.

500 Internal Server Error

Server error (Ollama down, database error, etc.).

503 Service Unavailable

Critical dependency unavailable (RAG, database).

HTTP Status Code Reference

| Code | Meaning | Common Causes |
|------|---------|---------------|
| 200 | Success | Valid request processed |
| 201 | Created | Memory created successfully |
| 400 | Bad Request | Missing prompt, invalid JSON |
| 404 | Not Found | Debug disabled (RAG_DEBUG=false) |
| 500 | Server Error | Ollama timeout, DB error |
| 503 | Unavailable | RAG disabled, Chroma down |
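A client can turn these codes into a small decision rule. The mapping below is one possible policy, not behavior prescribed by the API: treating 500 and 503 as worth retrying is an assumption (a 503 caused by a disabled feature will not recover on retry), and `classify_status` and its return labels are hypothetical.

```python
RETRYABLE = {500, 503}  # assumption: server-side failures may be transient


def classify_status(code: int) -> str:
    """Map a documented status code to a suggested client action."""
    if code in (200, 201):
        return "ok"
    if code == 400:
        return "fix_request"  # e.g. missing prompt or invalid JSON
    if code == 404:
        return "check_endpoint_or_feature"  # e.g. debug endpoints disabled
    if code in RETRYABLE:
        return "retry_later"
    return "unexpected"  # code not covered by this reference
```

For example, `classify_status(503)` returns `"retry_later"`, while a 400 signals that the request itself needs to change before it is resent.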