Getting Started
Get Ada running in 5 minutes!
Local Mode (Recommended)
The fastest way to start Ada - no Docker required!
Requirements
Python 3.13+
Don’t have Python 3.13? Use Nix! See Nix & Flakes Support for an instant Python 3.13 environment.
Ollama (LLM backend)
8GB+ RAM recommended
GPU optional (CUDA, ROCm, or Metal)
Warning
Ubuntu Users: If you have Docker installed via snap, it won’t work properly with Docker Compose.
Quick fix:
# Remove snap Docker
sudo snap remove docker
# Install official Docker (see https://docs.docker.com/engine/install/ubuntu/)
# Then add your user to docker group
sudo usermod -aG docker $USER
newgrp docker
Quick Setup
Install Ollama:
# Get from https://ollama.ai
curl -fsSL https://ollama.com/install.sh | sh
ollama serve
Clone and setup Ada:
git clone https://github.com/luna-system/ada.git
cd ada
python3 ada_main.py setup
The setup wizard will:
- Create a virtual environment
- Install dependencies
- Create .env configuration file
Pull a model:
ollama pull qwen2.5-coder:7b
Start Ada:
ada run
Ada will auto-detect your local Ollama and start at http://localhost:7000.
Verify health:
ada doctor
# or
curl http://localhost:7000/v1/healthz
That’s it! You’re running Ada locally. For complete local mode documentation, see the Running Ada Locally (No Docker Required!) guide.
Docker Mode (Optional)
Need isolated services or multi-container orchestration? Ada supports Docker too!
Requirements
Docker & Docker Compose
Docker BuildX (recommended)
20GB+ disk space
GPU optional (with proper passthrough)
Quick Setup
git clone https://github.com/luna-system/ada.git
cd ada
# Start with default services
docker compose up -d
# Or with web UI
docker compose --profile web up -d
# Or with Matrix bridge
docker compose --profile matrix up -d
# With GPU support
docker compose --profile cuda up -d # NVIDIA
docker compose --profile rocm up -d # AMD
Docker starts:
- Ollama LLM backend (port 11434)
- ChromaDB vector store (port 8000)
- Brain API (port 7000)
- Optional: Web UI (port 5000), Matrix bridge
See docs/external_ollama.md for hybrid setups (local Ollama + Docker services).
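To quickly confirm the published ports are reachable after docker compose up, here is a minimal sketch using only the Python standard library. The ports are the defaults listed above (the Web UI only runs with the web profile); adjust if you have remapped them.
import socket

# Default host ports from the compose stack; Web UI requires --profile web.
services = {"Ollama": 11434, "ChromaDB": 8000, "Brain API": 7000, "Web UI": 5000}

for name, port in services.items():
    try:
        with socket.create_connection(("localhost", port), timeout=2):
            print(f"{name}: port {port} open")
    except OSError:
        print(f"{name}: port {port} not reachable")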
Configuration
Environment Variables
Copy .env.example to .env and configure:
cp .env.example .env
Key variables:
OLLAMA_BASE_URL - LLM backend URL (default: http://localhost:11434)
OLLAMA_MODEL - Model name (default: qwen2.5-coder:7b)
CHROMA_URL - Vector DB URL (default: http://localhost:8000)
RAG_ENABLED - Enable RAG features (default: true)
RAG_ENABLE_PERSONA - Load persona/style guidelines (default: true)
RAG_ENABLE_FAQ - Include FAQ retrieval (default: true)
RAG_ENABLE_MEMORY - Include memory retrieval (default: true)
RAG_ENABLE_SUMMARY - Auto-generate conversation summaries (default: true)
RAG_DEBUG - Enable debug endpoints (default: false)
For complete configuration reference, see Configuration Reference.
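As an illustration, a minimal .env for a single-machine setup might look like the following. The values shown are simply the documented defaults; point them at remote services if your backends run elsewhere.
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=qwen2.5-coder:7b
CHROMA_URL=http://localhost:8000
RAG_ENABLED=true
RAG_DEBUG=false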
Tip
Runtime Configuration Discovery: Query GET /v1/info to see the current active configuration, enabled features, and models in use.
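For example, from Python (a minimal sketch assuming the requests package is installed; the exact fields returned depend on your Ada version):
import requests

# Ask the running Brain API which configuration, features, and models it is actually using.
info = requests.get("http://localhost:7000/v1/info", timeout=5).json()
print(info)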
Running the Service
Start Brain API
cd ada
source .venv/bin/activate
python brain/app.py
Or with Gunicorn (production):
gunicorn -b 0.0.0.0:7000 brain.app:app
Start Web Server
cd ada
source .venv/bin/activate
python webserver/app.py
Or with Gunicorn:
gunicorn -b 0.0.0.0:5000 webserver.app:app
Using Docker Compose
Start all services:
docker compose up
Stop services:
docker compose down
View logs:
docker compose logs -f brain
docker compose logs -f web
Your First Request
Health Check
curl http://localhost:7000/v1/healthz
Response:
{
"ok": true,
"service": "brain",
"python": "3.13.0",
"config": {
"OLLAMA_BASE_URL": "http://localhost:11434",
"OLLAMA_MODEL": "qwen2.5-coder:7b",
...
},
"persona": {"loaded": true},
"chroma": {"ok": true}
}
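If you are scripting against Ada, a small readiness check can replace the manual curl. This is a sketch assuming the requests package is installed; it relies only on the ok field shown above.
import time

import requests

# Poll the health endpoint until the Brain API reports ok, or give up after ~30 seconds.
for _ in range(30):
    try:
        if requests.get("http://localhost:7000/v1/healthz", timeout=2).json().get("ok"):
            print("Ada is up")
            break
    except requests.RequestException:
        pass
    time.sleep(1)
else:
    raise SystemExit("Ada did not become healthy in time")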
Simple Chat
curl -X POST http://localhost:7000/v1/chat \
-H "Content-Type: application/json" \
-d '{
"prompt": "What is the capital of France?",
"include_thinking": false
}'
Response:
{
"response": "The capital of France is Paris, located in northern France on the Seine River. It is the largest city in France and serves as the country's political, economic, and cultural center.",
"thinking": null,
"conversation_id": "550e8400-e29b-41d4-a716-446655440000",
"user_timestamp": "2025-12-13T10:30:45.123456+00:00",
"assistant_timestamp": "2025-12-13T10:30:48.456789+00:00",
"request_id": "abc12345",
"used_context": {
"persona": {"included": true, "version": null, "timestamp": null},
"faqs": [],
"turns": [],
"memories": [],
"summaries": [],
"entity": null
}
}
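The same request from Python (a minimal sketch assuming the requests package is installed; prompt and include_thinking are the fields used in the curl example above):
import requests

payload = {
    "prompt": "What is the capital of France?",
    "include_thinking": False,
}
resp = requests.post("http://localhost:7000/v1/chat", json=payload, timeout=60).json()

# The answer text, plus the conversation_id returned for this exchange.
print(resp["response"])
print("conversation id:", resp["conversation_id"])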
Streaming Response
For real-time token delivery, see Streaming.
Troubleshooting
Service Won’t Start
Issue: Connection refused to Ollama
Solution:
Ensure Ollama is running:
docker ps | grep ollama
Check Ollama API:
curl http://localhost:11434/api/tags
If not running, start with Docker Compose:
docker compose up -d ollama
Issue: ChromaDB connection failed
Solution:
Verify Chroma is running:
curl http://localhost:8000/api/v1/heartbeat
Restart if needed:
docker compose restart chroma
See Architecture for details on the RAG infrastructure.
Health Check Failing
Issue: GET /v1/healthz returns 503
Solution:
The health check probes dependencies. Check which one is failing:
Ollama:
curl http://localhost:11434/api/tags
ChromaDB:
curl http://localhost:8000/api/v1/heartbeat
RAG system:
curl "http://localhost:7000/v1/debug/rag"
Memory Not Saving
Issue: POST /v1/memory returns 503
Solution:
Ensure RAG is enabled in .env:
RAG_ENABLED=true
CHROMA_URL=http://localhost:8000
Then restart Brain API. See Configuration Reference for complete RAG configuration options and Memory for memory management patterns.
Debugging
Enable Debug Mode
# In .env
RAG_DEBUG=true
Then restart services.
View Debug Info
# Get RAG statistics
curl "http://localhost:7000/v1/debug/rag?conversation_id=your-uuid"
View Logs
With Docker:
docker compose logs -f brain
Direct:
# Set log level
export LOG_LEVEL=DEBUG
python brain/app.py
Check Imports
Verify all dependencies are installed:
source .venv/bin/activate
python -c "from brain.app import *; print('✓ All imports OK')"
Next Steps
Quick Wins:
Code Completion - NEW! Copilot-style autocomplete in Neovim (v2.6+)
Ada Log Intelligence - NEW! Minecraft crash analysis + DevOps log intelligence (v2.7+)
API Reference - Complete endpoint documentation
Examples - Code examples in multiple languages
Core Features:
Streaming - Real-time SSE streaming implementation
Memory - Long-term memory management
Specialist System - Plugin system for extended capabilities
Configuration & Development:
Configuration Reference - Full configuration reference
Testing Guide - Testing guide and best practices
Development Tools - Contributing guide