Getting Started
Get Ada running in 5 minutes!
Local Mode (Recommended)
The fastest way to start Ada - no Docker required!
Requirements
Python 3.13+
Don’t have Python 3.13? Use Nix! See Nix & Flakes Support for an instant Python 3.13 environment.
Ollama (LLM backend)
8GB+ RAM recommended
GPU optional (CUDA, ROCm, or Metal)
Warning
Ubuntu Users: If you have Docker installed via snap, it won’t work properly with Docker Compose.
Quick fix:
# Remove snap Docker
sudo snap remove docker
# Install official Docker (see https://docs.docker.com/engine/install/ubuntu/)
# Then add your user to docker group
sudo usermod -aG docker $USER
newgrp docker
Quick Setup
Install Ollama:
# Get from https://ollama.ai
curl -fsSL https://ollama.com/install.sh | sh
ollama serve
Clone and setup Ada:
git clone https://github.com/luna-system/ada.git
cd ada
python3 ada_main.py setup
The setup wizard will:
- Create a virtual environment
- Install dependencies
- Create .env configuration file
Pull a model:
ollama pull qwen2.5-coder:7b
Start Ada:
ada run
Ada will auto-detect your local Ollama and start at http://localhost:7000.
Verify health:
ada doctor
# or
curl http://localhost:7000/v1/healthz
That’s it! You’re running Ada locally. For complete local mode documentation, see the Running Ada Locally (No Docker Required!) guide.
Docker Mode (Optional)
Need isolated services or multi-container orchestration? Ada supports Docker too!
Requirements
Docker & Docker Compose
Docker BuildX (recommended)
20GB+ disk space
GPU optional (with proper passthrough)
Quick Setup
git clone https://github.com/luna-system/ada.git
cd ada
# Start with default services
docker compose up -d
# Or with web UI
docker compose --profile web up -d
# Or with Matrix bridge
docker compose --profile matrix up -d
# With GPU support
docker compose --profile cuda up -d # NVIDIA
docker compose --profile rocm up -d # AMD
Docker starts:
- Ollama LLM backend (port 11434)
- ChromaDB vector store (port 8000)
- Brain API (port 7000)
- Optional: Web UI (port 5000), Matrix bridge
See docs/external_ollama.md for hybrid setups (local Ollama + Docker services).
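To quickly confirm the published ports are reachable after docker compose up, here is a minimal sketch using only the Python standard library. The ports are the defaults listed above (the Web UI only runs with the web profile); adjust if you have remapped them.
import socket

# Default host ports from the compose stack; Web UI requires --profile web.
services = {"Ollama": 11434, "ChromaDB": 8000, "Brain API": 7000, "Web UI": 5000}

for name, port in services.items():
    try:
        with socket.create_connection(("localhost", port), timeout=2):
            print(f"{name}: port {port} open")
    except OSError:
        print(f"{name}: port {port} not reachable")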
Configuration
Environment Variables
Copy .env.example to .env and configure:
cp .env.example .env
Key variables:
OLLAMA_BASE_URL - LLM backend URL (default: http://localhost:11434)
OLLAMA_MODEL - Model name (default: qwen2.5-coder:7b)
CHROMA_URL - Vector DB URL (default: http://localhost:8000)
RAG_ENABLED - Enable RAG features (default: true)
RAG_ENABLE_PERSONA - Load persona/style guidelines (default: true)
RAG_ENABLE_FAQ - Include FAQ retrieval (default: true)
RAG_ENABLE_MEMORY - Include memory retrieval (default: true)
RAG_ENABLE_SUMMARY - Auto-generate conversation summaries (default: true)
RAG_DEBUG - Enable debug endpoints (default: false)
For complete configuration reference, see Configuration Reference.
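As an illustration, a minimal .env for a single-machine setup might look like the following. The values shown are simply the documented defaults; point them at remote services if your backends run elsewhere.
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=qwen2.5-coder:7b
CHROMA_URL=http://localhost:8000
RAG_ENABLED=true
RAG_DEBUG=false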
Tip
Runtime Configuration Discovery: Query GET /v1/info to see the current active configuration, enabled features, and models in use.
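For example, from Python (a minimal sketch assuming the requests package is installed; the exact fields returned depend on your Ada version):
import requests

# Ask the running Brain API which configuration, features, and models it is actually using.
info = requests.get("http://localhost:7000/v1/info", timeout=5).json()
print(info)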
Running the Service
Start Brain API
cd ada
source .venv/bin/activate
python brain/app.py
Or with Gunicorn (production):
gunicorn -b 0.0.0.0:7000 brain.app:app
Start Web Server
cd ada
source .venv/bin/activate
python webserver/app.py
Or with Gunicorn:
gunicorn -b 0.0.0.0:5000 webserver.app:app
Using Docker Compose
Start all services:
docker compose up
Stop services:
docker compose down
View logs:
docker compose logs -f brain
docker compose logs -f web
Your First Request
Health Check
curl http://localhost:7000/v1/healthz
Response:
{
"ok": true,
"service": "brain",
"python": "3.13.0",
"config": {
"OLLAMA_BASE_URL": "http://localhost:11434",
"OLLAMA_MODEL": "qwen2.5-coder:7b",
...
},
"persona": {"loaded": true},
"chroma": {"ok": true}
}
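If you are scripting against Ada, a small readiness check can replace the manual curl. This is a sketch assuming the requests package is installed; it relies only on the ok field shown above.
import time

import requests

# Poll the health endpoint until the Brain API reports ok, or give up after ~30 seconds.
for _ in range(30):
    try:
        if requests.get("http://localhost:7000/v1/healthz", timeout=2).json().get("ok"):
            print("Ada is up")
            break
    except requests.RequestException:
        pass
    time.sleep(1)
else:
    raise SystemExit("Ada did not become healthy in time")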
Simple Chat
curl -X POST http://localhost:7000/v1/chat \
-H "Content-Type: application/json" \
-d '{
"prompt": "What is the capital of France?",
"include_thinking": false
}'
Response:
{
"response": "The capital of France is Paris, located in northern France on the Seine River. It is the largest city in France and serves as the country's political, economic, and cultural center.",
"thinking": null,
"conversation_id": "550e8400-e29b-41d4-a716-446655440000",
"user_timestamp": "2025-12-13T10:30:45.123456+00:00",
"assistant_timestamp": "2025-12-13T10:30:48.456789+00:00",
"request_id": "abc12345",
"used_context": {
"persona": {"included": true, "version": null, "timestamp": null},
"faqs": [],
"turns": [],
"memories": [],
"summaries": [],
"entity": null
}
}
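The same request from Python (a minimal sketch assuming the requests package is installed; prompt and include_thinking are the fields used in the curl example above):
import requests

payload = {
    "prompt": "What is the capital of France?",
    "include_thinking": False,
}
resp = requests.post("http://localhost:7000/v1/chat", json=payload, timeout=60).json()

# The answer text, plus the conversation_id returned for this exchange.
print(resp["response"])
print("conversation id:", resp["conversation_id"])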
Streaming Response
For real-time token delivery, see Streaming.
Troubleshooting
Service Won’t Start
Issue: Connection refused to Ollama
Solution:
Ensure Ollama is running:
docker ps | grep ollama
Check Ollama API:
curl http://localhost:11434/api/tags
If not running, start with Docker Compose:
docker compose up -d ollama
Issue: ChromaDB connection failed
Solution:
Verify Chroma is running:
curl http://localhost:8000/api/v1/heartbeat
Restart if needed:
docker compose restart chroma
See Architecture for details on the RAG infrastructure.
Health Check Failing
Issue: GET /v1/healthz returns 503
Solution:
The health check probes dependencies. Check which one is failing:
Ollama:
curl http://localhost:11434/api/tags
ChromaDB:
curl http://localhost:8000/api/v1/heartbeat
RAG system:
curl "http://localhost:7000/v1/debug/rag"
Memory Not Saving
Issue: POST /v1/memory returns 503
Solution:
Ensure RAG is enabled in .env:
RAG_ENABLED=true
CHROMA_URL=http://localhost:8000
Then restart Brain API. See Configuration Reference for complete RAG configuration options and Memory for memory management patterns.
Debugging
Enable Debug Mode
# In .env
RAG_DEBUG=true
Then restart services.
View Debug Info
# Get RAG statistics
curl "http://localhost:7000/v1/debug/rag?conversation_id=your-uuid"
View Logs
With Docker:
docker compose logs -f brain
Direct:
# Set log level
export LOG_LEVEL=DEBUG
python brain/app.py
Check Imports
Verify all dependencies are installed:
source .venv/bin/activate
python -c "from brain.app import *; print('✓ All imports OK')"
Next Steps
Quick Wins:
Code Completion - NEW! Copilot-style autocomplete in Neovim (v2.6+)
Ada Log Intelligence - NEW! Minecraft crash analysis + DevOps log intelligence (v2.7+)
API Reference - Complete endpoint documentation
Examples - Code examples in multiple languages
Core Features:
Streaming - Real-time SSE streaming implementation
Memory - Long-term memory management
Specialist System - Plugin system for extended capabilities
Configuration & Development:
Configuration Reference - Full configuration reference
Testing Guide - Testing guide and best practices
Development Tools - Contributing guide