# Running Ada Locally (No Docker Required!)

Ada now supports running entirely locally without Docker! This is the **recommended** deployment mode for most users.

## Quick Start

### Prerequisites

1. **Python 3.13+**

   ```bash
   python3 --version  # Should show 3.13 or higher
   ```

2. **Ollama** (for LLM inference)

   ```bash
   # Install from https://ollama.ai
   curl -fsSL https://ollama.com/install.sh | sh

   # Start Ollama
   ollama serve

   # Pull a model
   ollama pull qwen2.5-coder:7b
   ```

### Installation

```bash
# Clone the repository
git clone https://github.com/luna-system/ada.git
cd ada

# Run setup wizard
python3 ada_cli.py setup

# Or manual setup:
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e .
cp .env.example .env
```

### Running Ada

```bash
# Activate virtual environment
source .venv/bin/activate

# Start Ada (auto-detects local Ollama)
ada run

# Or explicitly use local mode
ada run --local

# Development mode (hot reload)
ada run --local --dev
```

Ada will:

- ✅ Detect your local Ollama installation
- ✅ Use embedded ChromaDB (no separate database process)
- ✅ Start the brain API on http://localhost:7000
- ✅ Store all data in the `./data/` directory

### Quick Commands

```bash
# Check system health
ada doctor

# View running services
ada status

# Chat directly from terminal
ada chat "What's the weather today?"

# Stop Ada
ada stop
```
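When you drive Ada from scripts (CI jobs, cron, editor integrations), it helps to wait for the brain API to come up before sending requests. Here is a minimal sketch that polls the `/v1/healthz` endpoint (documented under "Accessing the API" below); the default port 7000 is assumed, and the `ADA_URL` variable and 30-second timeout are illustrative choices:

```bash
#!/usr/bin/env bash
# Poll the brain's health endpoint until it responds, or give up after 30 s.
set -euo pipefail

ADA_URL="${ADA_URL:-http://localhost:7000}"  # override if you started Ada with --port

for _ in $(seq 1 30); do
  if curl -fsS "$ADA_URL/v1/healthz" > /dev/null 2>&1; then
    echo "Brain API is up at $ADA_URL"
    exit 0
  fi
  sleep 1
done

echo "Brain API did not become healthy within 30 s" >&2
exit 1
```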
## Architecture: Local Mode

```
Terminal                      Local Ollama (port 11434)
   │                               │
   ├─ ada run ─────────────────────┤
   │                               │
   └─ Brain API ───────────────────┘
      (port 7000)
      │
      └─ Embedded ChromaDB (./data/chroma/)
```

**No Docker containers!** Everything runs as local processes.

## Comparison: Local vs Docker

| Feature | Local Mode | Docker Mode |
|---------|-----------|-------------|
| **Setup Time** | < 5 minutes | 10-15 minutes |
| **Memory Usage** | ~2GB (Ollama + Brain) | ~4GB (all containers) |
| **Startup Time** | < 5 seconds | ~30 seconds |
| **Model Storage** | Single copy | Separate container volume |
| **GPU Support** | Native (CUDA/ROCm/Metal) | Requires GPU passthrough |
| **Development** | Hot reload built-in | Requires rebuild |
| **Portability** | Requires Python/Ollama | Requires Docker |
| **Data Backup** | Simple `./data/` folder | Container volumes |

**Recommendation:** Use local mode unless you need multi-service orchestration.

## Configuration

### .env for Local Mode

```bash
# Minimal local configuration
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=qwen2.5-coder:7b
CHROMA_MODE=embedded
DATA_DIR=./data
```

### Advanced Options

```bash
# Use a different port
ada run --local --port 8080

# Bind to a specific interface
ada run --local --host 127.0.0.1

# Development mode (auto-reload)
ada run --local --dev
```

## Features

### All Ada features work in local mode:

- ✅ **RAG (Retrieval-Augmented Generation)** - Embedded ChromaDB
- ✅ **Conversation Memory** - Stored in `./data/chroma/`
- ✅ **Streaming Responses** - Full SSE support
- ✅ **Specialists** - All specialists work (OCR, web search, wiki, etc.)
- ✅ **Matrix Bridge** - Can connect to the local brain (the bridge itself runs in Docker)
- ✅ **CLI Adapter** - Works out of the box
- ✅ **Web UI** - Via the Astro dev server (the nginx proxy requires Docker)

### What requires Docker:

- ❌ **Web UI** (nginx) - Optional; use the Astro dev server instead
- ❌ **Matrix Bridge** - Optional; requires a containerized setup
- ❌ **Automated backups** - Optional maintenance scripts

## Accessing the API

### From Browser

```
http://localhost:7000/v1/healthz
```

### Using ada-cli

```bash
# Interactive mode
ada-cli

# One-shot query
ada-cli "Tell me about Python"

# JSON output
ada-cli --json "List 3 facts about Mars"
```

### Using curl

```bash
curl -X POST http://localhost:7000/v1/chat/stream \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Hello Ada!"}' \
  --no-buffer
```

## Data Management

### Data Directory Structure

```
./data/
├── chroma/             # Embedded vector database
│   ├── personas/       # Persona documents
│   ├── faqs/           # FAQ collection
│   ├── memories/       # Long-term memories
│   └── turns/          # Conversation history
├── brain/              # Brain service data
│   ├── notices.json    # System notifications
│   └── uploads/        # Uploaded files
└── backups/            # Optional manual backups
```

### Backing Up

```bash
# Simple backup
tar -czf ada-backup-$(date +%Y%m%d).tar.gz data/

# Restore
tar -xzf ada-backup-20250116.tar.gz
```
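For unattended backups, the one-liner above extends naturally to a small cron-driven script. A sketch, assuming it runs from the repository root and keeps the seven newest archives; the `backups/` location outside `./data/` is deliberate, so old archives don't get re-archived into new ones:

```bash
#!/usr/bin/env bash
# Archive ./data/ with a timestamp and prune all but the 7 newest backups.
# Example crontab entry (daily at 03:00):  0 3 * * * cd /path/to/ada && ./backup.sh
set -euo pipefail

mkdir -p backups
tar -czf "backups/ada-backup-$(date +%Y%m%d-%H%M%S).tar.gz" data/

# Delete everything except the 7 most recent archives.
ls -1t backups/ada-backup-*.tar.gz | tail -n +8 | xargs -r rm --
```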
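A backup you have never restored is a guess. A quick smoke test, extracting into a throwaway directory (the archive name is illustrative):

```bash
# Extract a backup into a temporary directory and check the expected layout.
tmp=$(mktemp -d)
tar -xzf ada-backup-20250116.tar.gz -C "$tmp"
test -d "$tmp/data/chroma" && echo "Backup looks intact"
rm -rf "$tmp"
```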
### Migrating from Docker

```bash
# 1. Stop Docker services
docker compose down

# 2. Copy data from Docker volumes
docker run --rm -v ada-v1_chroma-data:/data -v $(pwd)/data:/backup \
  alpine tar -czf /backup/chroma.tar.gz -C /data .

# 3. Extract to the local data directory
mkdir -p data/chroma
tar -xzf data/chroma.tar.gz -C data/chroma/

# 4. Update .env for local mode
cp .env.example .env
# Edit .env - use the local mode configuration

# 5. Start local mode
ada run --local
```

## Troubleshooting

### "Ollama not running"

**Solution:**

```bash
# Check if Ollama is running
curl http://localhost:11434/api/tags

# If not, start it
ollama serve

# Check if the model is available
ollama list
ollama pull qwen2.5-coder:7b  # If the model is missing
```

### "ModuleNotFoundError: No module named 'brain'"

**Solution:**

```bash
# Ensure you're in the virtual environment
source .venv/bin/activate

# Reinstall in editable mode
pip install -e .

# Verify the installation
ada --version
```

### "Port 7000 already in use"

**Solution:**

```bash
# Find the process using the port
lsof -i :7000

# Kill it
kill -9 <PID>

# Or use a different port
ada run --local --port 8000
```

### "Permission denied: ./data/chroma"

**Solution:**

```bash
# Fix permissions
chmod -R 755 data/

# Or run with sudo (not recommended)
sudo ada run --local
```

### Ollama connection refused from brain

**Issue:** Brain can't reach Ollama even though it's running.

**Solution:**

```bash
# Check that Ollama is listening on all interfaces (not just 127.0.0.1)
ss -tlnp | grep 11434

# If it's only on 127.0.0.1, restart with:
OLLAMA_HOST=0.0.0.0:11434 ollama serve
```

### ChromaDB errors

**Solution:**

```bash
# Clear ChromaDB data (WARNING: loses history!)
rm -rf data/chroma/

# Restart Ada
ada run --local
```

## Performance Tips

### Memory Optimization

```bash
# Use a smaller model for faster responses
# Edit .env: OLLAMA_MODEL=qwen2.5-coder:3b
ada run --local

# Retrieve less context per turn (in .env)
RAG_TURN_TOP_K=2
RAG_MEMORY_TOP_K=1
```

### CPU-Only Performance

If you don't have a GPU, use smaller models:

```bash
ollama pull qwen2.5-coder:3b
OLLAMA_MODEL=qwen2.5-coder:3b ada run --local
```

### GPU Acceleration

Ada automatically uses a GPU if one is available:

- **NVIDIA**: CUDA toolkit (automatically detected by Ollama)
- **AMD**: ROCm (Linux only)
- **Apple**: Metal (Apple silicon Macs)

No configuration needed - Ollama handles it!

## Development Workflow

### Hot Reload

```bash
# Start in development mode
ada run --local --dev

# Edit brain/*.py files
# Changes apply immediately (no restart needed)
```

### Running Tests

```bash
# Unit tests
pytest tests/

# Integration tests (requires the brain running)
ada run --local &
pytest tests/test_integration.py
```

### Adding Specialists

```bash
# 1. Create the specialist
vim brain/specialists/my_specialist.py

# 2. Register it in __init__.py
vim brain/specialists/__init__.py

# 3. Restart the brain (dev mode auto-reloads)
# Changes take effect immediately
```

## Hybrid Setups

### Local Brain + Docker Services

Run the brain locally but use Docker for optional services:

```bash
# Start only the web UI and matrix bridge
docker compose up -d frontend matrix-bridge

# Run the brain locally
ada run --local

# Access the web UI at http://localhost:5000
```

### Local Development + Docker Ollama

Use Docker for Ollama isolation:

```bash
# Start Ollama in Docker
docker compose up -d ollama

# Run the brain locally, pointing at the Docker Ollama
OLLAMA_BASE_URL=http://localhost:11434 ada run --local
```

## Next Steps

- **Web UI:** See [docs/adapters.rst](../docs/adapters.rst) for the Astro dev server
- **Matrix Bot:** See [matrix-bridge/QUICKSTART.md](../matrix-bridge/QUICKSTART.md)
- **Custom Persona:** Edit `persona.md` and restart
- **Specialists:** See [docs/specialists.rst](../docs/specialists.rst)

## Why Local Mode?

### Benefits

1. **Faster startup** - No container orchestration
2. **Less memory** - One Ollama instance shared across tools
3. **Native GPU** - Direct hardware access, no passthrough
4. **Easier development** - Hot reload, native debugging
5. **Simpler backups** - Just copy the `./data/` folder
6. **Lower disk usage** - No duplicate model storage

### When to use Docker instead

- **Production deployment** with multiple services
- **Isolated environments** (testing, CI/CD)
- **Teams** sharing a consistent setup
- **Remote deployment** without shell access

## Community & Support

- **Issues:** https://github.com/luna-system/ada/issues
- **Discussions:** https://github.com/luna-system/ada/discussions
- **Matrix:** `#ada-assistant:matrix.org`

---

**Made with 💜 by the Ada team**

*Docker optional, autonomy essential.*