Testing Guide
This guide covers testing Ada’s Brain service, including manual testing, automated tests with pytest, and API endpoint testing.
For API endpoint documentation, see API Reference. For configuration options, see Configuration Reference. For architecture overview, see Architecture.
Running Services for Testing
Option 1: Using Docker Compose (Recommended)
cd /home/luna/Code/ada-v1
docker compose up
Then open the UI in your browser at http://localhost:5000/.
Option 2: Local Development
If testing the FastAPI backend directly without Docker:
cd /home/luna/Code/ada-v1
source .venv/bin/activate
python -m uvicorn brain.app:app --host 0.0.0.0 --port 7000
The API will be available at http://localhost:7000/v1/* with interactive docs at http://localhost:7000/docs.
Manual Testing
Testing Markdown Rendering
With the frontend running (via Docker Compose or local dev server):
Try these messages in the chat composer and submit them (note: inline markdown is rendered; block-level markdown and code fences are rendered with syntax highlighting):
Hello bold text
italic and bold together
A link: OpenAI
Inline code: const a = 1;
Code block:
function test() { return 42; }
Verify:
Messages render with bold/italic/links/inline code formatting
Links open in a new tab and have rel="noopener noreferrer" set
Code fences (`...`) render with a monospace, sanitized <pre><code> block and basic syntax highlighting
The ‘Thinking’ bubble (if enabled) renders inline markdown similarly
If markdown does not render, check the small header status next to the brand for marked:✓ DOMPurify:✓ hljs:✓; these indicate the client-side libraries loaded successfully
If any are missing (✕), open the DevTools console to see errors and clear the browser cache (or refresh with Ctrl/Cmd+Shift+R)
Optional: Check that the chat input and memory list still work as before.
Testing API Endpoints Directly
To test API endpoints without the frontend:
# Health check
curl http://localhost:7000/v1/healthz
# Get recent conversations
curl http://localhost:7000/v1/conversations/recent
# Query memory
curl "http://localhost:7000/v1/memory?query=example"
# Create a memory
curl -X POST http://localhost:7000/v1/memory \
-H "Content-Type: application/json" \
-d '{"content": "Test memory", "memory_type": "important"}'
# Interactive API docs
# Open http://localhost:7000/docs in your browser
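If you prefer scripted checks to one-off curl commands, the same calls can be made from Python. The sketch below uses httpx and assumes the backend is reachable on port 7000 and that the endpoints behave as shown above; it is an ad-hoc check, not part of the test suite:
# quick_api_check.py -- ad-hoc sketch, not part of the test suite
import httpx

BASE = "http://localhost:7000/v1"

with httpx.Client(base_url=BASE, timeout=10.0) as client:
    # Health check
    print(client.get("/healthz").status_code)

    # Query memory
    print(client.get("/memory", params={"query": "example"}).json())

    # Create a memory
    created = client.post(
        "/memory",
        json={"content": "Test memory", "memory_type": "important"},
    )
    print(created.status_code, created.json())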
For complete endpoint documentation, see API Reference and API Usage Guide. For code examples, see Examples.
Testing with Frontend Proxy
All API calls from the frontend go through the Nginx reverse proxy:
# Same endpoints available via proxy
curl http://localhost:5000/api/health
curl http://localhost:5000/api/conversations/recent
# Stream endpoint (SSE)
curl http://localhost:5000/api/chat/stream -X POST \
-H "Content-Type: application/json" \
-d '{"message": "Hello", "conversation_id": "test"}'
The proxy remaps /api/* → /v1/* on the brain service with proper SSE handling.
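To watch the stream from Python rather than curl, httpx's streaming API works as well; a minimal sketch mirroring the curl call above (the payload shape is taken from that example):
# stream_via_proxy.py -- sketch for manually watching the SSE stream
import httpx

payload = {"message": "Hello", "conversation_id": "test"}

with httpx.Client(timeout=None) as client:
    with client.stream(
        "POST", "http://localhost:5000/api/chat/stream", json=payload
    ) as response:
        # SSE events arrive as "data: ..." lines separated by blank lines
        for line in response.iter_lines():
            if line.startswith("data: "):
                print(line[len("data: "):])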
Automated Testing with Pytest
Ada uses a hybrid testing approach combining traditional example-based tests with property-based testing using Hypothesis. All tests are located in the tests/ directory and run in a dedicated Docker container for consistency.
Test Infrastructure
The testing infrastructure includes:
Tests Container: Dedicated Docker service with Python 3.13 and all dependencies
Pytest Configuration: pytest.ini with sensible defaults
Fixtures: Shared test fixtures in tests/conftest.py and feature-specific conftest.py files (a sketch follows this list)
Hypothesis: Property-based testing for algorithmic correctness
Convenience Script: scripts/run.sh wrapper for common test commands
Organized Structure: Tests grouped by feature and type (traditional vs property-based)
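For orientation, the shared fixtures in tests/conftest.py look roughly like the sketch below (illustrative only, based on the fixture names listed in the test structure; the repo's actual file is authoritative):
# tests/conftest.py -- illustrative sketch of the shared fixtures
import uuid

import pytest

from brain.rag_store import RagStore  # import path as used in the Interactive Debugging section below

@pytest.fixture(scope="session")
def rag_store():
    """Shared RagStore instance for tests that exercise the Chroma-backed store."""
    return RagStore()

@pytest.fixture
def conversation_id():
    """Fresh conversation id per test so tests stay independent."""
    return f"test-{uuid.uuid4()}"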
Running Tests
Quick Way (Recommended)
# Run all tests
./scripts/run.sh test
# Run with verbose output
docker compose run --rm scripts pytest -vv
# Run specific test file
docker compose run --rm scripts pytest tests/test_rag.py
# Run tests matching a pattern
docker compose run --rm scripts pytest -k "memory"
# Skip slow tests
docker compose run --rm scripts pytest -m "not slow"
Direct Way (Full Control)
# Run all tests
docker compose run --rm scripts pytest
# Run specific file with verbose output
docker compose run --rm scripts pytest tests/test_prompt_builder.py -vv
# Run only async tests
docker compose run --rm scripts pytest -k "async"
Test Structure
The test suite uses an organized directory structure:
tests/
  conftest.py                    # Global fixtures (rag_store, conversation_id)
  prompt_builder/                # Traditional example-based tests
    conftest.py                  # Shared fixtures for prompt_builder
    test_context_retriever.py    # Context retrieval from RAG (14 tests)
    test_section_builder.py      # Prompt section formatting
    test_prompt_assembler.py     # Final prompt assembly
  property/                      # Property-based tests (Hypothesis)
    conftest.py                  # Hypothesis configuration
    test_token_properties.py     # Token counting invariants (11 tests)
    test_memory_properties.py    # Memory decay, ranking (v2.0)
  test_rag.py                    # RAG store tests (6 tests)
  test_specialists.py            # Specialist system tests
  test_ai_documentation.py       # Documentation validation tests
- Traditional Tests (example-based):
Test specific API behavior, integration workflows, and regression cases. Use parametrization to reduce duplication.
- Property Tests (Hypothesis):
Test mathematical properties that should ALWAYS hold (bounds, monotonicity, etc.). Hypothesis generates 100+ random test cases automatically.
Test Types
Traditional Tests verify specific behavior:
def test_context_retriever_initialization():
    """ContextRetriever initializes correctly."""
    retriever = ContextRetriever()
    assert retriever is not None

@pytest.mark.parametrize("method,kwargs,expected_len", [
    ("get_memories", {"query": "test", "k": 5}, 2),
    ("get_faqs", {"query": "test", "k": 3}, 2),
])
def test_retrieval_methods(retriever, method, kwargs, expected_len):
    """Test multiple similar methods with one parametrized test."""
    fn = getattr(retriever, method)
    result = fn(**kwargs)
    assert len(result) == expected_len
Property Tests verify universal invariants:
from hypothesis import given, strategies as st, example
@given(st.text(min_size=1, max_size=10000))
@example("🎵" * 100) # Always test this edge case
def test_positive_token_count(text):
"""Non-empty text ALWAYS produces positive tokens."""
monitor = TokenBudgetMonitor()
tokens = monitor.count_tokens(text)
assert tokens > 0
Hypothesis will generate random inputs to try to falsify your assertions!
Adding New Tests
When to use which pattern:
- Traditional Tests (example-based)
✅ Specific API behavior
✅ Integration tests
✅ Regression tests for known bugs
✅ Business logic and workflows
- Property Tests (Hypothesis)
✅ Mathematical invariants
✅ Algorithmic properties
✅ Edge case discovery
✅ Universal constraints
Example: Traditional Test
# tests/prompt_builder/test_my_feature.py
import pytest
from brain.my_module import my_function

def test_my_feature(rag_store):
    """Test description."""
    result = my_function(rag_store)
    assert result == expected_value
    assert len(result) > 0

@pytest.mark.parametrize("input,expected", [
    ("hello", 2),
    ("hello world", 3),
])
def test_multiple_cases(input, expected):
    """Test multiple similar cases with parametrization."""
    result = my_function(input)
    assert result == expected
Example: Property Test
# tests/property/test_my_properties.py
from hypothesis import given, strategies as st, example

@given(st.text(min_size=1, max_size=1000))
@example("edge case")  # Always test specific cases
def test_universal_property(text):
    """This should ALWAYS be true for ANY input."""
    result = my_function(text)
    assert result > 0  # Non-empty input always produces a positive result
    assert isinstance(result, int)  # Result is always an integer
No rebuild needed! Tests are volume-mounted, so you can add or edit tests and run them immediately.
# Run only property tests
docker compose run --rm scripts pytest tests/property/
# Run only traditional tests for a feature
docker compose run --rm scripts pytest tests/prompt_builder/
Hypothesis Configuration
Property tests use different profiles for different environments:
# Local development (100 examples, fast)
pytest tests/property/
# CI/thorough testing (1000 examples)
HYPOTHESIS_PROFILE=ci pytest tests/property/
# Debugging (10 examples, verbose)
HYPOTHESIS_PROFILE=debug pytest tests/property/ -v
Configuration is in tests/property/conftest.py.
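The profile names above map onto standard Hypothesis profile registration; a sketch of what that conftest.py plausibly contains (the example counts match the profiles listed above, but the repo's file is authoritative):
# tests/property/conftest.py -- sketch of Hypothesis profile registration
import os

from hypothesis import Verbosity, settings

settings.register_profile("default", max_examples=100)
settings.register_profile("ci", max_examples=1000)
settings.register_profile("debug", max_examples=10, verbosity=Verbosity.verbose)

# Select the profile via the HYPOTHESIS_PROFILE environment variable,
# falling back to the fast local default.
settings.load_profile(os.getenv("HYPOTHESIS_PROFILE", "default"))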
Test Markers
Use markers to categorize tests:
@pytest.mark.slow
def test_expensive_operation():
    """This test takes a while."""
    pass

@pytest.mark.integration
def test_full_workflow():
    """Tests multiple components together."""
    pass

@pytest.mark.unit
def test_isolated_function():
    """Tests a single function."""
    pass
Run specific markers:
# Skip slow tests
docker compose run --rm scripts pytest -m "not slow"
# Run only integration tests
docker compose run --rm scripts pytest -m "integration"
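Custom markers should also be registered so pytest does not emit unknown-marker warnings; one way is a pytest_configure hook (a sketch — the project may already declare these under markers in pytest.ini instead):
# tests/conftest.py -- registering the custom markers used above
def pytest_configure(config):
    config.addinivalue_line("markers", "slow: test takes a long time to run")
    config.addinivalue_line("markers", "integration: exercises multiple components together")
    config.addinivalue_line("markers", "unit: isolated single-function test")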
Health Check Script
Comprehensive health check for operational validation (not a test):
./scripts/run.sh health
# Or directly:
docker compose run --rm scripts python /app/scripts/health_check_chroma.py
Checks:
Chroma server connectivity
Collection existence and document count
Embedding generation
Memory/FAQ/turn query functionality
Specialist docs retrieval
Query consistency
Persona loading
Exit codes: 0 (healthy), 1 (unhealthy)
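For a sense of how such a check fits together, here is a compressed sketch in the same spirit (illustrative only; it reuses the RagStore calls shown in the Interactive Debugging section and is not the actual scripts/health_check_chroma.py):
# health check sketch -- not the real scripts/health_check_chroma.py
import sys

from brain.rag_store import RagStore

def main() -> int:
    try:
        store = RagStore()                                # connects to the Chroma server
        embedding = store.embed_text("health check")      # embedding generation works
        assert len(embedding) > 0
        memories = store.retrieve_memories("health check", k=1)  # query path works
        print(f"ok: embedding dim {len(embedding)}, {len(memories)} memory hit(s)")
        return 0                                          # healthy
    except Exception as exc:
        print(f"unhealthy: {exc}")
        return 1                                          # unhealthy

if __name__ == "__main__":
    sys.exit(main())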
CI/CD Integration
The pytest infrastructure is designed for easy CI/CD integration:
GitLab CI Example
# .gitlab-ci.yml
test:
  script:
    - docker compose build scripts
    - docker compose run --rm scripts pytest
    - docker compose run --rm scripts python /app/scripts/health_check_chroma.py
GitHub Actions Example
# .github/workflows/test.yml
name: Test
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run tests
        run: |
          docker compose build scripts
          docker compose run --rm scripts pytest
      - name: Health check
        run: docker compose run --rm scripts python /app/scripts/health_check_chroma.py
Pre-Deployment Health Check
#!/bin/bash
# pre-deploy.sh
if ./scripts/run.sh health && ./scripts/run.sh test; then
    echo "All checks passed, proceeding with deployment"
    docker compose up -d
else
    echo "Checks failed, aborting deployment"
    exit 1
fi
Interactive Debugging
Start an interactive Python session with all dependencies:
./scripts/run.sh shell
# Or directly:
docker compose run --rm scripts python
Then in Python:
>>> from brain.rag_store import RagStore
>>> from brain import config
>>>
>>> store = RagStore()
>>> results = store.retrieve_memories("test query", k=5)
>>> print(results)
>>>
>>> # Test embedding generation
>>> embedding = store.embed_text("sample text")
>>> print(len(embedding)) # Should be 768
Test Coverage Priorities
Current Coverage
✅ Well-Covered:
RAG retrieval (6 tests)
Prompt building (2 tests)
Specialist system (1 test)
⚠️ Needs Coverage:
API endpoints (streaming, non-streaming)
Error handling (Chroma down, Ollama down)
Edge cases (invalid input, timeouts)
Specialist execution
Memory consolidation
Recommended Next Tests
High Priority:
API endpoint tests (~10 tests needed)
Test /v1/chat/stream endpoint
Test /v1/chat/completions endpoint
Test health check endpoint variations
Test error responses (400, 500, 503)
Error handling tests (~5-10 tests needed)
Test Chroma down scenarios
Test Ollama down scenarios
Test invalid input handling
Test timeout handling
Medium Priority:
Specialist system tests
Test specialist activation logic
Test specialist execution
Test bidirectional specialist communication
Memory consolidation tests
Test nightly consolidation logic
Test summary generation
Test memory archival
Example API Endpoint Test
# tests/test_api.py
import pytest
import httpx
import json

@pytest.mark.asyncio
async def test_chat_stream_endpoint():
    """Test streaming chat endpoint."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "http://localhost:7000/v1/chat/stream",
            json={"messages": [{"role": "user", "content": "hi"}]}
        )
        assert response.status_code == 200

        chunks = []
        async for line in response.aiter_lines():
            if line.startswith("data: "):
                chunks.append(json.loads(line[6:]))
        assert len(chunks) > 0
@pytest.mark.asyncio
async def test_handles_chroma_down(monkeypatch):
    """Test graceful handling when Chroma is unavailable."""
    def mock_fail(*args, **kwargs):
        raise ConnectionError("Chroma unavailable")

    # Note: a monkeypatch only affects this test process. Against a live
    # server, simulate the outage by stopping the chroma container (or run
    # the app in-process, e.g. with httpx's ASGITransport).
    monkeypatch.setattr("chromadb.HttpClient", mock_fail)

    # Test that the API returns 503 instead of crashing
    async with httpx.AsyncClient() as client:
        response = await client.get("http://localhost:7000/v1/healthz")
        assert response.status_code == 503
Troubleshooting Tests
Tests Fail Locally but Pass in Container
Solution: Always run tests in the scripts container, not on your host machine. The container has the correct environment:
# ✅ Correct
./scripts/run.sh test
# ❌ Wrong
pytest # Don't run on host
Import Errors
Problem: ModuleNotFoundError: No module named 'brain'
Solution: Use the scripts container. The PYTHONPATH is configured correctly:
docker compose run --rm scripts pytest
Can’t Reach Services
Problem: httpx.ConnectError: Connection refused
Solution:
Ensure services are running: docker compose ps
Use service names (chroma, ollama), not localhost
Check environment variables in compose.yaml
Test Discovery Issues
Problem: Pytest doesn’t find your tests
Solution:
Name files test_*.py or *_test.py
Name functions test_*
Place files in the tests/ directory
Check pytest.ini for testpaths configuration
Best Practices
Always use scripts container: Consistent environment across machines
Exit codes matter: Tests should return 0 (success) or 1 (failure)
Use fixtures: Share common setup via conftest.py
Test in isolation: Each test should be independent
Mock external services: Use pytest-mock or monkeypatch for external APIs
Document test purpose: Clear docstrings for each test function
Run tests before commits: Catch issues early
Add tests for bug fixes: Prevent regression