Specialist System
The specialist plugin system provides extensible AI capabilities through a standardized interface. Drop a new *_specialist.py file into brain/specialists/ and it’s automatically discovered and integrated.
Note
Live Specialist List: See GET /v1/specialists for the current list of active specialists with their capabilities, activation triggers, and priority levels. This endpoint reflects runtime state including any custom specialists you’ve added.
Overview
Architecture
Core Components:
- Protocol (`protocol.py`): Defines the Specialist interface and base types
- Registry (`__init__.py`): Auto-discovers and manages specialist plugins
- Specialists (`*_specialist.py`): Individual capability modules
Data Flow
```
User Request → Registry.execute_for_context() → Specialists (filtered by should_activate)
                        ↓
               Parallel execution
                        ↓
            Results sorted by priority
                        ↓
         prompt_builder injects contexts
                        ↓
          LLM receives enriched prompt
```
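Conceptually, the registry's job is small: filter, fan out, sort. A minimal sketch of that flow, assuming asyncio for the parallel step (the real implementation lives in brain/specialists/__init__.py and may differ in detail):

```python
import asyncio

# Conceptual sketch only -- the actual Registry.execute_for_context()
# may differ in signature and detail.
async def execute_for_context(specialists, request_context: dict):
    # 1. Filter: only specialists that opt in for this request run.
    active = [s for s in specialists if s.should_activate(request_context)]

    # 2. Fan out: run all active specialists in parallel.
    results = await asyncio.gather(*(s.process(**request_context) for s in active))

    # 3. Sort: lower priority values are injected earlier by prompt_builder.
    paired = sorted(zip(active, results), key=lambda p: p[0].capability.context_priority)
    return [result for _, result in paired]
```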
Creating a New Specialist
Step 1: Create the File
Create brain/specialists/your_name_specialist.py:
```python
from .protocol import (
    BaseSpecialist,
    SpecialistCapability,
    SpecialistResult,
    SpecialistPriority
)


class YourSpecialist(BaseSpecialist):
    """Your specialist description."""

    def __init__(self):
        capability = SpecialistCapability(
            name="your_name",
            description="What your specialist does",
            version="1.0.0",
            context_priority=SpecialistPriority.MEDIUM,
            context_icon="🔧",
            tags=["tag1", "tag2"],
        )
        super().__init__(capability)

    def should_activate(self, request_context: dict) -> bool:
        """Return True if your specialist should process this request."""
        # Check request_context for relevant data
        return request_context.get('your_key') is not None

    async def process(self, **kwargs) -> SpecialistResult:
        """Execute your specialist logic."""
        your_data = kwargs.get('your_key')
        if not your_data:
            return self.error_result("Missing data", "missing_data")

        try:
            # Your processing logic here
            result_text = f"Processed: {your_data}"

            # Format for LLM
            context_text = self.format_context(
                title="Your Specialist Output",
                content=result_text,
                metadata={'key': 'value'}
            )

            return self.success_result(
                context_text=context_text,
                data={'raw': your_data},
                metadata={'processed': True}
            )
        except Exception as e:
            return self.error_result(f"Error: {e}", "processing_error")
```
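You can exercise the class directly before relying on auto-discovery. A minimal sketch, assuming SpecialistResult exposes the success and context_text fields used above:

```python
import asyncio

specialist = YourSpecialist()

context = {'your_key': 'hello'}
assert specialist.should_activate(context)

# process() is async, so drive it with asyncio.run outside an event loop.
result = asyncio.run(specialist.process(**context))
print(result.success)       # assumed field (see Test Execution below)
print(result.context_text)  # assumed field, populated via success_result()
```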
Step 2: That’s It!
The registry automatically discovers your specialist on next startup. No registration code needed.
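Discovery of this kind is usually a package scan for the *_specialist.py naming convention. A sketch of that pattern, not the registry's actual code:

```python
import importlib
import inspect
import pkgutil

import brain.specialists as pkg
from brain.specialists.protocol import BaseSpecialist

def discover_specialists():
    """Sketch: import every *_specialist module and instantiate its specialists."""
    found = []
    for info in pkgutil.iter_modules(pkg.__path__):
        if not info.name.endswith('_specialist'):
            continue
        module = importlib.import_module(f'{pkg.__name__}.{info.name}')
        for _, cls in inspect.getmembers(module, inspect.isclass):
            if issubclass(cls, BaseSpecialist) and cls is not BaseSpecialist:
                found.append(cls())
    return found
```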
Discovering Available Specialists
The specialist system is self-documenting! Query the introspection endpoint:
```bash
curl http://localhost:5000/v1/specialists | jq
```
Returns for each specialist:
Name and description
Icon (for UI display)
Version
Priority level (critical/high/medium/low)
Enabled status
Tags for categorization
Input/output JSON schemas
Example Response:
```json
{
  "specialists": [
    {
      "name": "web_search",
      "description": "Search the web for current information",
      "icon": "🔍",
      "version": "1.0.0",
      "priority": "high",
      "enabled": true,
      "tags": ["web", "search", "current-events"],
      "input_schema": {
        "type": "object",
        "properties": {
          "query": {"type": "string"}
        },
        "required": ["query"]
      }
    }
  ],
  "count": 3
}
```
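The same endpoint is easy to consume programmatically, e.g. for dashboards or health checks. A sketch using the third-party requests package and the URL from the curl example above:

```python
import requests  # third-party; pip install requests

resp = requests.get("http://localhost:5000/v1/specialists", timeout=10)
resp.raise_for_status()

payload = resp.json()
for spec in payload["specialists"]:
    print(f"{spec['icon']} {spec['name']} v{spec['version']} (priority={spec['priority']})")
print(f"{payload['count']} specialists registered")
```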
Currently Available Specialists
As of this writing, Ada includes:
OCR (📄) - Text extraction from images via Tesseract
Media (🎧) - ListenBrainz music context integration
Web Search (🔍) - Real-time web search via SearxNG
Use GET /v1/specialists for the current list and their schemas.
Request Context Structure
When specialists execute, they receive a request_context dict:
```python
{
    'prompt': str,               # User's message
    'conversation_id': str,      # Current conversation
    'entity': str | None,        # Optional entity filter
    'media': dict | None,        # ListenBrainz data
    'ocr_context': dict | None,  # OCR extraction result
    'user_timestamp': str,       # Request timestamp
    # ... extensible
}
```
Specialists check this context in should_activate() to determine if they’re relevant.
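For example, an activation check can combine several context keys. An illustrative heuristic (not taken from any shipped specialist):

```python
def should_activate(self, request_context: dict) -> bool:
    # Activate only when ListenBrainz data is attached AND the prompt
    # sounds music-related. Purely illustrative.
    media = request_context.get('media')
    prompt = request_context.get('prompt', '').lower()
    return media is not None and any(
        word in prompt for word in ('song', 'track', 'album', 'listening')
    )
```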
Priority System
Controls injection order in the prompt:
| Priority | Value | Usage |
|---|---|---|
| CRITICAL | 0 | System notices, identity |
| HIGH | 10 | User-provided context (OCR, vision) |
| MEDIUM | 50 | External data (media, APIs) |
| LOW | 100 | Supplementary info |
Lower numbers appear earlier in the prompt.
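The values in the table suggest a simple integer enum; a sketch consistent with them (the real definition lives in protocol.py and may differ):

```python
from enum import IntEnum

class SpecialistPriority(IntEnum):  # sketch; see brain/specialists/protocol.py
    CRITICAL = 0
    HIGH = 10
    MEDIUM = 50
    LOW = 100

# An ascending sort therefore puts CRITICAL context first in the prompt.
print(sorted([SpecialistPriority.LOW, SpecialistPriority.CRITICAL]))
# [<SpecialistPriority.CRITICAL: 0>, <SpecialistPriority.LOW: 100>]
```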
Advanced Topics
For detailed information on these advanced specialist features, see:
Bidirectional Specialist System - LLM ↔ Specialist communication
Web Search Specialist - Web search specialist configuration and usage
Specialist RAG Documentation System - RAG-based dynamic specialist documentation
Testing Specialists
Test Discovery
```python
from brain.specialists import get_registry, list_specialists

registry = get_registry()
specialists = list_specialists()

for s in specialists:
    print(f"{s.capability.name}: {s.capability.description}")
```
Test Execution
```python
import asyncio
from brain.specialists import execute_specialists

context = {'ocr_context': {'text': 'Hello'}}
results = asyncio.run(execute_specialists(context))

for r in results:
    print(f"{r.specialist_name}: {r.success}")
```
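For repeatable checks, the same calls fit naturally into pytest. A sketch against the YourSpecialist example from the walkthrough, assuming error_result() marks results as unsuccessful:

```python
# test_your_specialist.py
import asyncio

from brain.specialists.your_name_specialist import YourSpecialist

def test_activation_requires_key():
    s = YourSpecialist()
    assert not s.should_activate({})
    assert s.should_activate({'your_key': 'data'})

def test_process_without_data_fails_gracefully():
    result = asyncio.run(YourSpecialist().process())
    assert not result.success  # assumes error_result() sets success=False

def test_process_success():
    result = asyncio.run(YourSpecialist().process(your_key='hello'))
    assert result.success
    assert 'hello' in result.context_text
```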
Future Enhancements
Phase 2: Pause & Resume
Currently, specialists inject results mid-stream. For better quality (see the sketch after this list):
1. Detect specialist request
2. Pause LLM generation
3. Execute specialist
4. Build new prompt with result
5. Resume generation from enriched context
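A conceptual sketch of that loop; every name below is hypothetical, since none of this is implemented yet:

```python
# Hypothetical pause-and-resume loop -- nothing here exists yet.
async def generate_with_pause(llm, prompt, registry):
    while True:
        chunk = await llm.generate(prompt)         # hypothetical streaming call
        request = parse_specialist_request(chunk)  # hypothetical detector
        if request is None:
            return chunk                           # no specialist needed: done
        # Pause, execute the specialist, rebuild the prompt, resume.
        result = await registry.execute_one(request.name, **request.args)
        prompt = prompt + chunk + result.context_text
```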
Phase 3: Multi-turn Planning
Enable agentic workflows:
```
User: Analyze these 3 images
LLM: I'll analyze each image systematically.

Image 1: SPECIALIST_REQUEST[vision:{"image_id":1}]
[Result 1...]

Image 2: SPECIALIST_REQUEST[vision:{"image_id":2}]
[Result 2...]

Comparing the three images, I notice...
```
Phase 4: Function Calling API
If the selected LLM gains native tool calling:
```python
# Auto-generate tool definitions from specialists
tools = [s.capability.to_openai_tool() for s in specialists]

response = ollama.chat(
    model="qwen2.5-coder:7b",
    messages=messages,
    tools=tools
)
```
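to_openai_tool() does not exist yet; a sketch of what it could emit, reusing the name, description, and input_schema fields the introspection endpoint already exposes:

```python
# Hypothetical method on SpecialistCapability -- not implemented yet.
def to_openai_tool(self) -> dict:
    return {
        "type": "function",
        "function": {
            "name": self.name,
            "description": self.description,
            "parameters": self.input_schema,  # already JSON Schema (see above)
        },
    }
```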
Service-Based Specialists
For GPU-intensive models (e.g., Phi-3.5-Vision), the same protocol supports remote specialists:
```python
class RemoteVisionSpecialist(BaseSpecialist):
    async def process(self, **kwargs):
        # Call separate service via HTTP/gRPC
        response = await http_client.post('http://vision-service:8080/analyze', ...)
        return self.success_result(...)
```
No changes to registry or prompt_builder needed!
Best Practices
- Single Responsibility: Each specialist does one thing well
- Fail Gracefully: Return error results; don't raise exceptions
- Metadata Rich: Include useful metadata for debugging/logging
- Format Consistently: Use format_context() for prompt injection
- Document Activation: Clearly state when your specialist activates
- Version Carefully: Bump the version on breaking changes
Debugging
Check Synced FAQs
# View all specialist FAQs
docker exec ada-v1-brain-1 python -c "
from brain.rag_store import RagStore
store = RagStore()
result = store.col.query(
query_texts=['specialists'],
n_results=10,
where={'type': 'faq', 'topic': 'specialists'}
)
for doc in result['documents'][0]:
print(doc)
print('---')
"
Test Retrieval
Query the debug endpoint (if RAG_DEBUG=true):
curl "http://localhost:7000/v1/debug/prompt?prompt=analyze%20image&faq_k=5"
Resources
See API Usage Guide for API integration details
See Testing Guide for specialist testing patterns
See Development Tools for adding new specialists