# Specialist RAG Documentation System

## Overview
Instead of using static specialist instructions in the system prompt, Ada now uses RAG-based dynamic documentation that retrieves relevant specialist guidance based on the user’s query context.
## How It Works

### 1. Automatic Sync on Startup

When the brain service starts:

```
[BRAIN] Discovered 2 specialists: 🎧 media, 📄 ocr
[BRAIN] Synced 8 specialist FAQ entries to RAG
```
The system:

- Discovers all registered specialists via the plugin registry
- Generates FAQ entries from specialist capabilities
- Stores them in Chroma with `type="faq"` and `topic="specialists"`
- Removes old entries to ensure idempotency
### 2. Context-Aware Retrieval

During prompt building (the `PromptAssembler` in the `brain/prompt_builder/` package, with caching):

- The user's query is embedded
- RAG retrieves the top-K most relevant specialist FAQs (with caching for frequent queries)
- Retrieved docs are injected into the prompt before specialist execution

This provides just-in-time specialist guidance instead of static instructions.
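A plausible sketch of the retrieval helper (the function name and signature come from the implementation files below; the query filter and output formatting are assumptions modeled on the example injection shown later):

```python
def get_relevant_specialist_docs(user_prompt: str, store, k: int = 2) -> str:
    """Sketch only -- filter and formatting details are assumed."""
    result = store.col.query(
        query_texts=[user_prompt],  # Chroma embeds the query text itself
        n_results=k,
        where={"$and": [{"type": "faq"}, {"topic": "specialists"}]},
    )
    docs = result["documents"][0] if result["documents"] else []
    if not docs:
        return ""
    # Header mirrors the "📚 Specialist Capabilities Reference" example below
    return "\n".join(["📚 Specialist Capabilities Reference:"] + [f"• {d}" for d in docs])
```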
### 3. FAQ Entry Types

The system generates multiple FAQ types:

**Overview FAQ:**

```
Q: What specialist capabilities are available?
A: Ada has 2 specialist capabilities integrated: 🎧 media, 📄 ocr.
These can be invoked mid-conversation using SPECIALIST_REQUEST[name:{params}] syntax...
```

**Per-Specialist Capability FAQ:**

```
Q: What does the ocr specialist do?
A: Extract text from images using Tesseract OCR (Priority: HIGH, Icon: 📄)
```

**Per-Specialist Syntax FAQ:**

```
Q: How do I invoke the ocr specialist?
A: Use the syntax SPECIALIST_REQUEST[ocr:{}] in your response. I will detect this pattern,
pause generation, execute the specialist, and resume with enriched context.
```

**Usage Pattern FAQs:**

- When to use specialists
- Chaining multiple specialists
- How pause/resume works
## Configuration

### Enable/Disable RAG Docs

```bash
SPECIALIST_RAG_DOCS=true   # Use dynamic RAG retrieval (default)
SPECIALIST_RAG_DOCS=false  # Use static SPECIALIST_INSTRUCTIONS only
```

Set in `brain/config.py`:

```python
SPECIALIST_RAG_DOCS = os.getenv("SPECIALIST_RAG_DOCS", "true").lower() == "true"
```
### Retrieval Count

The prompt builder retrieves 2 specialist FAQs by default:

```python
specialist_docs = get_relevant_specialist_docs(user_prompt, rag_store, k=2)
```
## Benefits

### 1. Context-Aware Guidance

- User asks "can you analyze this image?" → OCR/vision docs retrieved
- User asks "what's playing?" → Media specialist docs retrieved
- Irrelevant specialists don't clutter the prompt

### 2. Automatic Updates

- Add a new specialist → FAQ entries auto-generated on next startup
- Modify a specialist capability → Updated in RAG automatically
- No manual prompt engineering required

### 3. Token Efficiency

- Static `SPECIALIST_INSTRUCTIONS` = ~300 tokens, always present
- Dynamic RAG retrieval = ~100-200 tokens, only when relevant
- Reduces prompt bloat for non-specialist queries

### 4. Semantic Matching

- User query: "What can you see in this photo?"
- RAG retrieves: OCR + vision specialist documentation
- LLM learns specialist syntax contextually
## Implementation Files

### brain/specialists/specialist_docs.py

- `sync_specialist_docs_to_faq()` - Generate FAQ entries from all specialists
- `get_relevant_specialist_docs()` - Retrieve relevant docs for a query

### brain/app.py (lifespan)

```python
# Sync specialist documentation to FAQ system
if rag_store is not None:
    doc_count = sync_specialist_docs_to_faq(rag_store)
    print(f"[BRAIN] Synced {doc_count} specialist FAQ entries to RAG")
```

### brain/prompt_builder/ (modular package) and brain/_legacy_prompt_builder.py (legacy compatibility)
```python
# --- Dynamic Specialist Documentation (RAG-based) ---
if SPECIALIST_RAG_DOCS and rag_store is not None:
    specialist_docs = get_relevant_specialist_docs(user_prompt, rag_store, k=2)
    if specialist_docs:
        sections.append(specialist_docs)
        used_context['specialist_docs'] = True
```
## Example Workflow

**User Query:** "Can you read the text in this image?"

**Prompt Building Phase:**

1. Embed query: "Can you read the text in this image?"
2. RAG retrieves from `type="faq", topic="specialists"`:
   - "How do I invoke the ocr specialist?"
   - "What does the ocr specialist do?"
3. Format and inject into prompt:

```
📚 Specialist Capabilities Reference:
• Use the syntax SPECIALIST_REQUEST[ocr:{}]...
• Extract text from images using Tesseract OCR...
```
**LLM Generation:**

- Ada sees relevant OCR documentation in context
- Generates: "I can extract the text using OCR. SPECIALIST_REQUEST[ocr:{}]"

**Specialist Execution (Pause/Resume):**

- Generation pauses
- OCR specialist extracts: "Annual Report 2024…"
- Generation resumes with the OCR result injected

**Final Response:** "The image contains: Annual Report 2024…"
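The detection mechanics aren't spelled out here; as a minimal, non-streaming sketch, the `SPECIALIST_REQUEST[name:{params}]` pattern could be matched and substituted as below. In the real system generation pauses at the match and resumes afterwards; `execute` is a hypothetical callable.

```python
import json
import re

# Matches SPECIALIST_REQUEST[name:{...}] as described in this doc
SPECIALIST_PATTERN = re.compile(r"SPECIALIST_REQUEST\[(\w+):(\{.*?\})\]")

def resolve_specialist_requests(text: str, execute) -> str:
    """Replace each specialist request with its result.

    execute(name, params) is a hypothetical callable that runs the named
    specialist and returns its output as a string.
    """
    def run(match: re.Match) -> str:
        name, params = match.group(1), json.loads(match.group(2))
        return execute(name, params)  # e.g. extracted OCR text

    return SPECIALIST_PATTERN.sub(run, text)
```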
## Future Enhancements

### Phase 1: Embedding-Based Specialist Discovery

Instead of the LLM explicitly requesting specialists, the system could:

1. Embed the user query
2. Check similarity to specialist capabilities
3. Auto-suggest specialists in the prompt
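This phase is speculative; one way the similarity check could look, assuming capability embeddings are precomputed (the threshold is a placeholder that would need tuning):

```python
import numpy as np

def suggest_specialists(query_vec, capability_vecs, names, threshold=0.75):
    """Return names of specialists whose capability embedding is close to the query.

    query_vec: 1-D query embedding; capability_vecs: one row per specialist.
    The 0.75 threshold is a placeholder, not a tuned value.
    """
    # Cosine similarity between the query and each capability embedding
    sims = capability_vecs @ query_vec / (
        np.linalg.norm(capability_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    return [name for name, sim in zip(names, sims) if sim >= threshold]
```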
### Phase 2: Example-Based Learning

Store successful specialist invocations as FAQs:

```
Q: User uploaded a diagram and asked "what's the architecture?"
A: I used SPECIALIST_REQUEST[vision:{"focus":"architecture"}] to analyze...
```

### Phase 3: Failure Case Documentation

Track failed specialist calls and add FAQ warnings:

```
Q: Can OCR read handwritten text?
A: OCR works best with printed text. Handwritten text may have lower accuracy.
```
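Also speculative; a recorded failure could reuse the same collection so ordinary retrieval surfaces the warning (the id scheme and field names are assumptions):

```python
def record_specialist_failure(store, name: str, question: str, warning: str) -> None:
    # Stored as a normal FAQ entry so context-aware retrieval picks it up
    store.col.add(
        documents=[f"Q: {question}\nA: {warning}"],
        ids=[f"faq-failure-{name}-{abs(hash(question))}"],  # assumed id scheme
        metadatas=[{"type": "faq", "topic": "specialists"}],
    )
```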
## Debugging

### Check Synced FAQs

```bash
# View all specialist FAQs
docker exec ada-v1-brain-1 python -c "
from rag_store import RagStore
store = RagStore()
result = store.col.query(
    query_texts=['specialists'],
    n_results=10,
    where={'\$and': [{'type': 'faq'}, {'topic': 'specialists'}]}
)
for doc in result['documents'][0]:
    print(doc)
    print('---')
"
```
### Test Retrieval

Query the debug endpoint (if `RAG_DEBUG=true`):

```bash
curl "http://localhost:7000/v1/debug/prompt?prompt=analyze%20image&faq_k=5"
```

Check the `used_context.specialist_docs` field to see whether docs were injected.
### Monitor Logs

```bash
docker logs ada-v1-brain-1 | grep "Synced.*FAQ"
# [BRAIN] Synced 8 specialist FAQ entries to RAG
```
## Configuration Reference

| Variable | Default | Description |
|---|---|---|
| `SPECIALIST_RAG_DOCS` | `true` | Enable dynamic RAG-based specialist documentation |
| | | Enable pause/resume for specialist execution |
| | | Maximum specialist calls per conversation |
| | | Number of FAQ entries to retrieve (includes specialist docs) |
## Migration Notes

### Before (Static Instructions)

- All specialist syntax in `SPECIALIST_INSTRUCTIONS` (300+ tokens)
- Present in every prompt regardless of relevance
- Manual updates required for new specialists

### After (Dynamic RAG)

- Specialist syntax retrieved on-demand from the FAQ system
- Only relevant specialists injected based on the query
- Automatic updates when specialists are added or changed
- Backward compatible: static instructions still present as a fallback
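One plausible reading of that fallback, mirroring the prompt-builder snippet above (whether the static text is appended on every retrieval miss or only when RAG is disabled is not specified here):

```python
# Sketch: fall back to static instructions when RAG docs are off or empty
if SPECIALIST_RAG_DOCS and rag_store is not None:
    specialist_docs = get_relevant_specialist_docs(user_prompt, rag_store, k=2)
    sections.append(specialist_docs or SPECIALIST_INSTRUCTIONS)
else:
    sections.append(SPECIALIST_INSTRUCTIONS)
```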