Ollama Integration

Use Buildable with Ollama for local AI assistance and MCP integration.

Overview

Ollama lets you run large language models on your own machine. Combined with Buildable's MCP integration, you get AI assistance for your development tasks while your code and project data stay private and local.

Prerequisites

  • Ollama installed and running
  • Buildable account and project
  • MCP-compatible client with Ollama support

Setup

1. Install Ollama

Download and install Ollama from ollama.ai:

# macOS
brew install ollama

# Linux
curl -fsSL https://ollama.ai/install.sh | sh

# Windows - download from website
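
After installation, a quick sanity check confirms the CLI is on your PATH (the exact version string will differ):

# Verify the installation
ollama --version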

2. Pull a Model

Choose and download a model:

# Recommended models for development
ollama pull codellama:13b
ollama pull mistral:7b
ollama pull llama2:13b

# Smaller models for faster responses
ollama pull codellama:7b
ollama pull mistral:latest
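
To confirm the downloads and give a model a quick smoke test (the model names here assume the pulls above):

# List downloaded models
ollama list

# Run a one-off prompt against a model
ollama run codellama:7b "Write a hello world function in Python"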

3. Start Ollama Server

ollama serve

The server will run on http://localhost:11434 by default.
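
Before wiring up MCP, you can check that the server is reachable; a running instance returns a small JSON payload:

# Should respond with the server version if Ollama is up
curl http://localhost:11434/api/version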

MCP Configuration

Method 1: Direct Ollama + Buildable MCP

Configure your MCP client to use both Ollama and Buildable:

{
  "mcpServers": {
    "buildable": {
      "command": "npx",
      "args": ["@buildable/mcp-server"],
      "env": {
        "BUILDABLE_API_KEY": "your-api-key",
        "BUILDABLE_PROJECT_ID": "your-project-id"
      }
    },
    "ollama": {
      "command": "npx",
      "args": ["@ollama/mcp-server"],
      "env": {
        "OLLAMA_BASE_URL": "http://localhost:11434"
      }
    }
  }
}

Method 2: Buildable MCP with Ollama Backend

Configure Buildable to use Ollama as the AI backend:

{
  "mcpServers": {
    "buildable": {
      "command": "npx",
      "args": ["@buildable/mcp-server"],
      "env": {
        "BUILDABLE_API_KEY": "your-api-key",
        "BUILDABLE_PROJECT_ID": "your-project-id",
        "AI_PROVIDER": "ollama",
        "OLLAMA_BASE_URL": "http://localhost:11434",
        "OLLAMA_MODEL": "codellama:13b"
      }
    }
  }
}

Supported Clients

Claude Desktop with Ollama

While Claude Desktop doesn't directly support Ollama, you can use Buildable's MCP server to bridge the connection:

{
  "mcpServers": {
    "buildable-ollama": {
      "command": "npx",
      "args": ["@buildable/mcp-server-ollama"],
      "env": {
        "BUILDABLE_API_KEY": "your-api-key",
        "BUILDABLE_PROJECT_ID": "your-project-id",
        "OLLAMA_MODEL": "codellama:13b"
      }
    }
  }
}
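
Claude Desktop reads this configuration from claude_desktop_config.json; the usual locations are shown below (paths may vary with your installation):

# macOS
open ~/Library/Application\ Support/Claude/claude_desktop_config.json

# Windows (PowerShell)
notepad $env:APPDATA\Claude\claude_desktop_config.json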

Continue + Ollama

Configure Continue to use Ollama with Buildable context:

{
  "models": [
    {
      "title": "Ollama CodeLlama",
      "provider": "ollama",
      "model": "codellama:13b",
      "apiBase": "http://localhost:11434"
    }
  ],
  "mcpServers": {
    "buildable": {
      "command": "npx",
      "args": ["@buildable/mcp-server"],
      "env": {
        "BUILDABLE_API_KEY": "your-api-key",
        "BUILDABLE_PROJECT_ID": "your-project-id"
      }
    }
  }
}
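
Continue typically keeps this configuration in ~/.continue/config.json (assuming a default install); open it with your editor of choice, for example:

# Edit Continue's configuration (VS Code shown as an example editor)
code ~/.continue/config.json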

Features

Local AI Processing

  • Complete privacy - models run locally
  • No internet required for AI responses
  • Faster responses (with sufficient hardware)
  • No API costs for model usage

Buildable Integration

  • Access to your project tasks and context
  • Real-time task updates
  • Code generation based on current work
  • Project-aware suggestions

Model Flexibility

  • Choose models based on your hardware
  • Specialized models for different tasks
  • Easy model switching
  • Custom fine-tuned models

Recommended Models

For Development Tasks

CodeLlama

# Best for code generation and explanation
ollama pull codellama:13b # Better quality
ollama pull codellama:7b # Faster responses

Mistral

# Good general-purpose model
ollama pull mistral:7b
ollama pull mistral:latest

Llama 2

# Strong reasoning capabilities
ollama pull llama2:13b
ollama pull llama2:7b

For Specific Use Cases

Code Review

  • codellama:13b - Best for detailed code analysis
  • mistral:7b - Good for quick reviews

Documentation

  • llama2:13b - Excellent for writing docs
  • mistral:latest - Good for technical writing

Debugging

  • codellama:13b - Superior debugging assistance
  • codellama:7b - Faster debugging help

Performance Tuning

Hardware Requirements

Minimum (7B models)

  • 8GB RAM
  • 4-core CPU
  • Integrated graphics OK

Recommended (13B models)

  • 16GB RAM
  • 8-core CPU
  • Dedicated GPU (optional but helpful)

Optimal (34B+ models)

  • 32GB+ RAM
  • High-end CPU
  • GPU with 16GB+ VRAM
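
If you're unsure what your machine has, these commands report memory and GPU capacity (the GPU query applies to NVIDIA cards only):

# Total RAM
free -h                 # Linux
sysctl hw.memsize       # macOS, reported in bytes

# GPU VRAM (NVIDIA only)
nvidia-smi --query-gpu=memory.total --format=csv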

Ollama Configuration

Create ~/.ollama/config.json:

{
  "num_ctx": 4096,
  "num_predict": 512,
  "temperature": 0.3,
  "top_k": 40,
  "top_p": 0.9
}
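
If your Ollama version doesn't pick up a global config file, the same parameters can be baked into a local model variant via a Modelfile; this is a sketch, and the name buildable-codellama is just an example:

# Write a Modelfile with the desired parameters
cat > Modelfile <<'EOF'
FROM codellama:13b
PARAMETER num_ctx 4096
PARAMETER num_predict 512
PARAMETER temperature 0.3
PARAMETER top_k 40
PARAMETER top_p 0.9
EOF

# Build the variant, then reference it as OLLAMA_MODEL in your MCP config
ollama create buildable-codellama -f Modelfile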

Memory Management

# Limit concurrent models
export OLLAMA_NUM_PARALLEL=1

# Set memory limits
export OLLAMA_MAX_LOADED_MODELS=1

# GPU memory settings (if applicable)
export OLLAMA_GPU_MEMORY_FRACTION=0.8
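
On Linux installs where Ollama runs as a systemd service, variables exported in your shell won't reach the server process; they need to be set on the service itself (a sketch, assuming the default ollama.service unit):

# Add environment variables to the Ollama service
sudo systemctl edit ollama.service
# In the editor, add:
#   [Service]
#   Environment="OLLAMA_NUM_PARALLEL=1"
#   Environment="OLLAMA_MAX_LOADED_MODELS=1"

# Apply the change
sudo systemctl daemon-reload
sudo systemctl restart ollama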

Usage Examples

Task-Aware Code Generation

You: "Generate a React component for the user profile task"
Ollama + Buildable: "Based on task #123 'Create user profile page', here's a React component that matches your project structure..."

Code Review with Context

You: "Review this authentication function"
Ollama + Buildable: "Looking at your security requirements from task #456, this function should also include..."

Project-Specific Help

You: "How should I implement the payment system?"
Ollama + Buildable: "Based on your project's tech stack (Next.js + Stripe) and current payment tasks, I recommend..."

Troubleshooting

Common Issues

Ollama Not Starting

# Check if Ollama is running
ollama list

# Restart Ollama
ollama serve
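
If the server still won't start, checking whether the default port is already in use or reading the service logs (on Linux) usually points to the cause:

# See whether something is already bound to the default port
lsof -i :11434

# Inspect service logs on Linux installs
journalctl -u ollama --no-pager | tail -n 50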

Model Loading Errors

# Check available models
ollama list

# Pull model if missing
ollama pull codellama:7b

MCP Connection Issues

  • Verify Ollama is accessible at http://localhost:11434 (see the check below)
  • Check Buildable API credentials
  • Ensure MCP server can reach both services
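
A quick way to check the first point is to hit the local API directly; a running server responds with the list of locally available models:

# Should return the locally available models as JSON
curl http://localhost:11434/api/tags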

Performance Issues

Slow Responses

  • Use smaller models (7B instead of 13B)
  • Reduce context window
  • Close other applications
  • Consider GPU acceleration

Memory Issues

  • Reduce num_ctx parameter
  • Use smaller models
  • Increase system RAM
  • Enable swap if needed

Best Practices

  1. Choose appropriate models based on your hardware
  2. Start with smaller models and upgrade as needed
  3. Monitor resource usage during development
  4. Use specific prompts that reference your tasks
  5. Keep Ollama updated for the latest features (see the update commands below)
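
For the last point, updating Ollama follows whichever install method you used (the Linux script reinstalls in place):

# macOS (Homebrew)
brew upgrade ollama

# Linux - re-run the install script
curl -fsSL https://ollama.ai/install.sh | sh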

Security Benefits

Complete Privacy

  • All AI processing happens locally
  • No data sent to external services
  • Code never leaves your machine
  • Full control over model behavior

Compliance Friendly

  • Meets strict data requirements
  • No external dependencies for AI
  • Audit-friendly setup
  • Custom model training possible

Getting Help

  • Check Ollama documentation for model-specific issues
  • Review Buildable MCP setup guides
  • Join our Discord #ollama channel
  • Report integration bugs on GitHub