Code Completion

Copilot-style autocomplete in Neovim with Ada! (v2.6+)

Ada provides native code completion with full context awareness, running entirely on your local machine. Press <C-x><C-a> in insert mode to get intelligent code suggestions.

Overview 

What You Get:

10.6x lower mean latency than a general-purpose model - Uses a code-specific model prompted in FIM (fill-in-the-middle) format
Context-aware completion - Sees code before AND after cursor
Language-agnostic - Works with Python, JavaScript, Lua, Rust, Go, and more
Privacy-first - Runs entirely on your machine, no cloud dependencies
Copilot-style experience - Aims for quality and speed comparable to commercial offerings, depending on model and hardware

Performance:

Mean latency: 2.6s
Quality score: 77%
Success rate: 100% across 24 test scenarios
Model size: 4.7GB (qwen2.5-coder:7b)

Quick Setup 

Time: 5 minutes

1. Pull the Code Model 

ollama pull qwen2.5-coder:7b

This code-specific model supports the FIM (fill-in-the-middle) prompt format, which is what Ada uses for completion.

2. Install ada.nvim 

Add to your Neovim config:

Lazy.nvim:

{
  dir = "~/Code/ada-v1/ada.nvim",
  config = function()
    require("ada").setup({
      ada_url = "http://localhost:7000",
      completion = {
        enabled = true,
        model = "qwen2.5-coder:7b",
      },
    })
  end,
}

Packer.nvim:

use {
  "~/Code/ada-v1/ada.nvim",
  config = function()
    require("ada").setup({
      ada_url = "http://localhost:7000",
      completion = {
        enabled = true,
        model = "qwen2.5-coder:7b",
      },
    })
  end,
}

3. Test It!

cd ~/Code/ada-v1/ada.nvim
./test.sh

Open a code file and press <C-x><C-a> in insert mode.

Usage 

Basic Completion 

1. Open a code file in Neovim
2. Type some code and position your cursor
3. Press <C-x><C-a> in insert mode
4. Wait ~2-3 seconds for the completion
5. Accept or reject the suggestion

The completion appears in a floating window with:

Completed code
Language detection
Latency information

Advanced Features 

Context Awareness:

Ada sees code both before AND after your cursor, enabling:

Completing function bodies when the signature already exists
Filling in the middle of loops with context from both sides
Smart import suggestions based on how names are used further down
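To make "before AND after your cursor" concrete, here is a minimal Python sketch of how a buffer could be split into a prefix and a suffix at the cursor. The function name `split_at_cursor` and the 0-based row/column convention are assumptions for illustration; the plugin itself does this inside Neovim and may handle details differently.

```python
def split_at_cursor(lines, row, col):
    """Split buffer lines into (prefix, suffix) around a 0-based cursor position.

    Hypothetical helper: everything left of the cursor becomes the prefix,
    everything from the cursor onward becomes the suffix.
    """
    current = lines[row]
    before = lines[:row] + [current[:col]]
    after = [current[col:]] + lines[row + 1:]
    return "\n".join(before), "\n".join(after)


# Cursor sits at the end of the indented, still-empty function body.
buffer = [
    "def area(radius):",
    "    ",
    "",
    "print(area(2.0))",
]
prefix, suffix = split_at_cursor(buffer, row=1, col=4)

# The model sees the signature in the prefix and the call site in the suffix.
print(repr(prefix))
print(repr(suffix))
```

Because the suffix contains the `print(area(2.0))` call, a fill-in-the-middle model has evidence that the body should return a value rather than just print one.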

Language Support:

Tested and working with:

Python (functions, classes, error handling)
JavaScript/TypeScript (async/await, promises, React)
Lua (Neovim configs, functions)
Rust (lifetimes, traits, error handling)
Go (interfaces, error handling)
And more! (Any language the model knows)

Quality Scoring:

Each completion is scored on:

Syntax correctness (30%)
Context relevance (40%)
Completeness (30%)

Scores above 70% are considered high-quality.
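The weighting above amounts to a simple weighted sum. The following Python sketch shows the arithmetic; the function names and the 0.0-1.0 scale for each component are assumptions, and the input values are made up for illustration rather than taken from Ada's benchmark.

```python
# Weights as listed above; how Ada measures each component is not shown here.
WEIGHTS = {"syntax": 0.30, "relevance": 0.40, "completeness": 0.30}
HIGH_QUALITY_THRESHOLD = 0.70


def quality_score(syntax, relevance, completeness):
    """Combine three component scores (each in 0.0-1.0) into one weighted score."""
    parts = {"syntax": syntax, "relevance": relevance, "completeness": completeness}
    for name, value in parts.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return sum(WEIGHTS[name] * value for name, value in parts.items())


def is_high_quality(score):
    """Scores above 70% count as high-quality."""
    return score > HIGH_QUALITY_THRESHOLD


# Hypothetical completion: clean syntax, good fit, somewhat incomplete.
score = quality_score(syntax=0.9, relevance=0.8, completeness=0.6)
print(round(score, 2), is_high_quality(score))  # 0.77 True
```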

Configuration 

Customize in your setup() call:

require("ada").setup({
  ada_url = "http://localhost:7000",
  completion = {
    enabled = true,
    model = "qwen2.5-coder:7b",  -- Which model to use
    max_tokens = 150,              -- Max completion length
    temperature = 0.2,             -- Lower = more focused
    timeout = 10000,               -- 10s timeout
  },
})
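For readers curious how these settings might reach the model, here is a minimal Python sketch. It assumes the values map onto Ollama's /api/generate request body: max_tokens to options.num_predict, temperature to options.temperature, and the after-cursor text to the suffix field. The mapping, the helper names, and the fallback defaults for max_tokens and timeout are assumptions for illustration, not ada-mcp's actual code.

```python
import json


def build_generate_request(prefix, suffix, completion_config):
    """Build a JSON-serializable body for Ollama's /api/generate (assumed mapping)."""
    return {
        "model": completion_config["model"],
        "prompt": prefix,   # code before the cursor
        "suffix": suffix,   # code after the cursor
        "stream": False,    # ask for a single response instead of a token stream
        "options": {
            "num_predict": completion_config.get("max_tokens", 150),
            "temperature": completion_config.get("temperature", 0.2),
        },
    }


def timeout_seconds(completion_config):
    """The plugin config uses milliseconds; most HTTP clients expect seconds."""
    return completion_config.get("timeout", 10000) / 1000.0


config = {
    "model": "qwen2.5-coder:7b",
    "max_tokens": 150,
    "temperature": 0.2,
    "timeout": 10000,
}
body = build_generate_request("def add(a, b):\n    ", "\n\nprint(add(1, 2))", config)
print(json.dumps(body, indent=2))
print(timeout_seconds(config))  # 10.0
```

Note that the timeout is a client-side setting in this sketch: it limits how long the caller waits and is not sent to the model.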

Available Models:

qwen2.5-coder:7b (recommended) - 4.7GB, fast and accurate
qwen2.5-coder:3b - 2.0GB, faster but less accurate
qwen2.5-coder:14b - 8.9GB, slower but more capable
deepseek-coder-v2 - Alternative, similar performance

Custom Keybinding:

Change the default <C-x><C-a>:

vim.keymap.set('i', '<C-space>', function()
  require('ada.completion').complete()
end, { desc = 'Ada code completion' })

Architecture 

How It Works 

1. Cursor position detected - Neovim plugin captures context
2. Code extracted - Before cursor (prefix) and after cursor (suffix)
3. Language detected - From filetype (e.g., python, javascript)
4. MCP tool invoked - complete_code tool in ada-mcp
5. Ollama queried directly - Bypasses RAG overhead for speed
6. FIM format used - Model fills in the middle between prefix and suffix
7. Result returned - Completion appears in floating window

Data Flow:

Neovim → ada.nvim → MCP complete_code → Ollama → qwen2.5-coder
                                                  ↓
                                                FIM format
                                                  ↓
                                       Completion ← Model
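The "FIM format" step can be illustrated with a small Python sketch. It uses the special tokens published for Qwen2.5-Coder (`<|fim_prefix|>`, `<|fim_suffix|>`, `<|fim_middle|>`). Whether ada-mcp assembles this string itself or lets Ollama's model template do it is not stated here, so treat the helper `build_fim_prompt` as hypothetical.

```python
# FIM special tokens for Qwen2.5-Coder; other code models use different markers.
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"


def build_fim_prompt(prefix, suffix):
    """Arrange prefix and suffix so the model generates the missing middle."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


prompt = build_fim_prompt(
    prefix="for item in items:\n    ",
    suffix="\nprint(total)",
)
print(prompt)
```

The prompt ends with the middle marker, so whatever the model generates next is the text that belongs between the prefix and the suffix.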

Why It’s Fast:

Direct Ollama access - Skips RAG, prompt building, specialists
FIM format - Code models trained specifically for this task
Low temperature - Focused, more deterministic output
Capped length - max_tokens of 150 bounds generation time while leaving room for a useful completion

Performance 

Benchmarks (24 Test Cases)

Comparing a general-purpose model (example: DeepSeek-R1) to a specialized code model (Qwen2.5-Coder):

Metric           DeepSeek-R1   Qwen2.5-Coder   Improvement
Mean Latency     27.7s         2.6s            10.6x faster
Median Latency   19.3s         3.0s            6.4x faster
Best Case        12.8s         396ms           32x faster
Worst Case       1049s         3.7s            283x faster
Success Rate     100%          100%            ✅ Maintained
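The Improvement column is the baseline latency divided by the code-model latency. A short Python sketch of that arithmetic, using the latencies from the table (converted to seconds) and truncating to one decimal place; the helper names are illustrative:

```python
import math


def speedup(baseline_s, candidate_s):
    """How many times faster the candidate is than the baseline."""
    if candidate_s <= 0:
        raise ValueError("candidate latency must be positive")
    return baseline_s / candidate_s


def truncate_1dp(value):
    """Truncate (not round) to one decimal place."""
    return math.floor(value * 10) / 10


# (DeepSeek-R1 seconds, Qwen2.5-Coder seconds) from the table above.
latencies = {
    "mean": (27.7, 2.6),
    "median": (19.3, 3.0),
    "best": (12.8, 0.396),   # 396ms expressed in seconds
    "worst": (1049.0, 3.7),
}
factors = {name: truncate_1dp(speedup(b, c)) for name, (b, c) in latencies.items()}
print(factors)  # {'mean': 10.6, 'median': 6.4, 'best': 32.3, 'worst': 283.5}
```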

Quality Scores:

Mean: 77% (high-quality)
Range: 65-85% across test scenarios
Syntax: 92% correctness
Context relevance: 81%
Completeness: 72%

Real-World Examples 

See ada.nvim/COMPLETION_QUICKSTART.md for 13 detailed examples including:

Python function completion
JavaScript async/await
Lua Neovim configs
Error handling patterns
Loop completion
Class methods
And more!

Troubleshooting 

Completion Not Appearing 

Check Ada is running:

curl http://localhost:7000/v1/healthz

Check the model is installed:

ollama list | grep qwen2.5-coder

Check Neovim logs:

:messages
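The first check above can also be scripted, for example in a setup or CI script. This is a minimal Python sketch using only the standard library; it assumes the /v1/healthz endpoint answers with a 2xx status when Ada is up, and the function name `ada_is_healthy` is made up for illustration.

```python
import urllib.error
import urllib.request


def ada_is_healthy(base_url="http://localhost:7000", timeout=2.0):
    """Return True if GET <base_url>/v1/healthz answers with a 2xx status."""
    url = base_url.rstrip("/") + "/v1/healthz"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP error.
        return False


if __name__ == "__main__":
    print("Ada reachable:", ada_is_healthy())
```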

Slow Completions 

Use a smaller model:

ollama pull qwen2.5-coder:3b

Update config:

-- Inside your require("ada").setup({ ... }) call:
completion = {
  model = "qwen2.5-coder:3b",
}

Check GPU:

ollama ps  # The PROCESSOR column should show GPU, not CPU

See Hardware & GPU Guide for GPU setup.

Poor Quality Completions 

Use a larger model:

ollama pull qwen2.5-coder:14b

Check context:

Make sure you have meaningful code before and after the cursor. Completion works best with:

Clear function signatures
Established patterns in surrounding code
Comments describing intent

Adjust temperature:

-- Inside your require("ada").setup({ ... }) call:
completion = {
  temperature = 0.1,  -- More focused (default 0.2)
}

Error: MCP Tool Not Found 

Reinstall MCP tools:

cd ~/Code/ada-v1/ada-mcp
npm install
npm run build

Check MCP config:

Verify ada-mcp is in your editor’s MCP settings.

Comparison to Copilot 

What’s Similar:

Context-aware completion
Multi-language support
Inline suggestions
Quality and speed (with the right model)

What’s Different:

Fully local - No cloud dependencies, runs on your hardware
Privacy-first - Your code never leaves your machine
Free - No subscription ($0 vs $10-20/month)
Hackable - Modify behavior, swap models, add features
Transparent - See exactly how it works

Trade-offs:

Requires local compute (GPU recommended)
Needs model download (4.7GB for qwen2.5-coder:7b)
Slightly slower on CPU (use GPU for best performance)
Quality depends on model choice (but very competitive!)

Next Steps 

Try different models - Experiment with qwen2.5-coder variants
Build custom prompts - Modify FIM format for your use case
Add to CI/CD - Use Ada for code review automation
Extend MCP tools - Add refactoring, documentation generation, etc.

See Development Tools for contributing!