Ollama Integration

Use Buildable with Ollama for local AI assistance and MCP integration.

Overview

Ollama lets you run large language models on your own machine. Combined with Buildable's MCP integration, you get AI assistance for your development tasks while your code and project data stay private and local.

Prerequisites

  • Ollama installed and running
  • Buildable account and project
  • MCP-compatible client with Ollama support

Setup

1. Install Ollama

Download and install Ollama from ollama.ai:

# macOS
brew install ollama

# Linux
curl -fsSL https://ollama.ai/install.sh | sh

# Windows - download from website
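
After installation, a quick sanity check confirms the CLI is on your PATH (the exact version string will differ):

# Verify the installation
ollama --version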

2. Pull a Model

Choose and download a model:

# Recommended models for development
ollama pull codellama:13b
ollama pull mistral:7b
ollama pull llama2:13b

# Smaller models for faster responses
ollama pull codellama:7b
ollama pull mistral:latest
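
To confirm the downloads and give a model a quick smoke test (the model names here assume the pulls above):

# List downloaded models
ollama list

# Run a one-off prompt against a model
ollama run codellama:7b "Write a hello world function in Python"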

3. Start Ollama Server

ollama serve

The server will run on http://localhost:11434 by default.
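
Before wiring up MCP, you can check that the server is reachable; a running instance returns a small JSON payload:

# Should respond with the server version if Ollama is up
curl http://localhost:11434/api/version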

MCP Configuration

Method 1: Direct Ollama + Buildable MCP

Configure your MCP client to use both Ollama and Buildable:

{
  "mcpServers": {
    "buildable": {
      "command": "npx",
      "args": ["@buildable/mcp-server"],
      "env": {
        "BUILDABLE_API_KEY": "your-api-key",
        "BUILDABLE_PROJECT_ID": "your-project-id"
      }
    },
    "ollama": {
      "command": "npx",
      "args": ["@ollama/mcp-server"],
      "env": {
        "OLLAMA_BASE_URL": "http://localhost:11434"
      }
    }
  }
}

Method 2: Buildable MCP with Ollama Backend

Configure Buildable to use Ollama as the AI backend:

{
  "mcpServers": {
    "buildable": {
      "command": "npx",
      "args": ["@buildable/mcp-server"],
      "env": {
        "BUILDABLE_API_KEY": "your-api-key",
        "BUILDABLE_PROJECT_ID": "your-project-id",
        "AI_PROVIDER": "ollama",
        "OLLAMA_BASE_URL": "http://localhost:11434",
        "OLLAMA_MODEL": "codellama:13b"
      }
    }
  }
}

Supported Clients

Claude Desktop with Ollama

While Claude Desktop doesn't directly support Ollama, you can use Buildable's MCP server to bridge the connection:

{
  "mcpServers": {
    "buildable-ollama": {
      "command": "npx",
      "args": ["@buildable/mcp-server-ollama"],
      "env": {
        "BUILDABLE_API_KEY": "your-api-key",
        "BUILDABLE_PROJECT_ID": "your-project-id",
        "OLLAMA_MODEL": "codellama:13b"
      }
    }
  }
}
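
Claude Desktop reads this configuration from claude_desktop_config.json; the usual locations are shown below (paths may vary with your installation):

# macOS
open ~/Library/Application\ Support/Claude/claude_desktop_config.json

# Windows (PowerShell)
notepad $env:APPDATA\Claude\claude_desktop_config.json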

Continue + Ollama

Configure Continue to use Ollama with Buildable context:

{
  "models": [
    {
      "title": "Ollama CodeLlama",
      "provider": "ollama",
      "model": "codellama:13b",
      "apiBase": "http://localhost:11434"
    }
  ],
  "mcpServers": {
    "buildable": {
      "command": "npx",
      "args": ["@buildable/mcp-server"],
      "env": {
        "BUILDABLE_API_KEY": "your-api-key",
        "BUILDABLE_PROJECT_ID": "your-project-id"
      }
    }
  }
}
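
Continue typically keeps this configuration in ~/.continue/config.json (assuming a default install); open it with your editor of choice, for example:

# Edit Continue's configuration (VS Code shown as an example editor)
code ~/.continue/config.json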

Features

Local AI Processing

  • Complete privacy - models run locally
  • No internet required for AI responses
  • Faster responses (with sufficient hardware)
  • No API costs for model usage

Buildable Integration

  • Access to your project tasks and context
  • Real-time task updates
  • Code generation based on current work
  • Project-aware suggestions

Model Flexibility

  • Choose models based on your hardware
  • Specialized models for different tasks
  • Easy model switching
  • Custom fine-tuned models

Recommended Models

For Development Tasks

CodeLlama

# Best for code generation and explanation
ollama pull codellama:13b # Better quality
ollama pull codellama:7b # Faster responses

Mistral

# Good general-purpose model
ollama pull mistral:7b
ollama pull mistral:latest

Llama 2

# Strong reasoning capabilities
ollama pull llama2:13b
ollama pull llama2:7b

For Specific Use Cases

Code Review

  • codellama:13b - Best for detailed code analysis
  • mistral:7b - Good for quick reviews

Documentation

  • llama2:13b - Excellent for writing docs
  • mistral:latest - Good for technical writing

Debugging

  • codellama:13b - Superior debugging assistance
  • codellama:7b - Faster debugging help

Performance Tuning

Hardware Requirements

Minimum (7B models)

  • 8GB RAM
  • 4-core CPU
  • Integrated graphics OK

Recommended (13B models)

  • 16GB RAM
  • 8-core CPU
  • Dedicated GPU (optional but helpful)

Optimal (34B+ models)

  • 32GB+ RAM
  • High-end CPU
  • GPU with 16GB+ VRAM
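
If you're unsure what your machine has, these commands report memory and GPU capacity (the GPU query applies to NVIDIA cards only):

# Total RAM
free -h                 # Linux
sysctl hw.memsize       # macOS, reported in bytes

# GPU VRAM (NVIDIA only)
nvidia-smi --query-gpu=memory.total --format=csv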

Ollama Configuration

Create ~/.ollama/config.json:

{
  "num_ctx": 4096,
  "num_predict": 512,
  "temperature": 0.3,
  "top_k": 40,
  "top_p": 0.9
}
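
If your Ollama version doesn't pick up a global config file, the same parameters can be baked into a local model variant via a Modelfile; this is a sketch, and the name buildable-codellama is just an example:

# Write a Modelfile with the desired parameters
cat > Modelfile <<'EOF'
FROM codellama:13b
PARAMETER num_ctx 4096
PARAMETER num_predict 512
PARAMETER temperature 0.3
PARAMETER top_k 40
PARAMETER top_p 0.9
EOF

# Build the variant, then reference it as OLLAMA_MODEL in your MCP config
ollama create buildable-codellama -f Modelfile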

Memory Management

# Limit concurrent models
export OLLAMA_NUM_PARALLEL=1

# Set memory limits
export OLLAMA_MAX_LOADED_MODELS=1

# GPU memory settings (if applicable)
export OLLAMA_GPU_MEMORY_FRACTION=0.8
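
On Linux installs where Ollama runs as a systemd service, variables exported in your shell won't reach the server process; they need to be set on the service itself (a sketch, assuming the default ollama.service unit):

# Add environment variables to the Ollama service
sudo systemctl edit ollama.service
# In the editor, add:
#   [Service]
#   Environment="OLLAMA_NUM_PARALLEL=1"
#   Environment="OLLAMA_MAX_LOADED_MODELS=1"

# Apply the change
sudo systemctl daemon-reload
sudo systemctl restart ollama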

Usage Examples

Task-Aware Code Generation

You: "Generate a React component for the user profile task"
Ollama + Buildable: "Based on task #123 'Create user profile page', here's a React component that matches your project structure..."

Code Review with Context

You: "Review this authentication function"
Ollama + Buildable: "Looking at your security requirements from task #456, this function should also include..."

Project-Specific Help

You: "How should I implement the payment system?"
Ollama + Buildable: "Based on your project's tech stack (Next.js + Stripe) and current payment tasks, I recommend..."

Troubleshooting

Common Issues

Ollama Not Starting

# Check if Ollama is running
ollama list

# Restart Ollama
ollama serve
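
If the server still won't start, checking whether the default port is already in use or reading the service logs (on Linux) usually points to the cause:

# See whether something is already bound to the default port
lsof -i :11434

# Inspect service logs on Linux installs
journalctl -u ollama --no-pager | tail -n 50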

Model Loading Errors

# Check available models
ollama list

# Pull model if missing
ollama pull codellama:7b

MCP Connection Issues

  • Verify Ollama is accessible at http://localhost:11434 (see the check below)
  • Check Buildable API credentials
  • Ensure MCP server can reach both services
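
A quick way to check the first point is to hit the local API directly; a running server responds with the list of locally available models:

# Should return the locally available models as JSON
curl http://localhost:11434/api/tags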

Performance Issues

Slow Responses

  • Use smaller models (7B instead of 13B)
  • Reduce context window
  • Close other applications
  • Consider GPU acceleration

Memory Issues

  • Reduce num_ctx parameter
  • Use smaller models
  • Increase system RAM
  • Enable swap if needed

Best Practices

  1. Choose appropriate models based on your hardware
  2. Start with smaller models and upgrade as needed
  3. Monitor resource usage during development
  4. Use specific prompts that reference your tasks
  5. Keep Ollama updated for the latest features (see the update commands below)
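
For the last point, updating Ollama follows whichever install method you used (the Linux script reinstalls in place):

# macOS (Homebrew)
brew upgrade ollama

# Linux - re-run the install script
curl -fsSL https://ollama.ai/install.sh | sh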

Security Benefits

Complete Privacy

  • All AI processing happens locally
  • No data sent to external services
  • Code never leaves your machine
  • Full control over model behavior

Compliance Friendly

  • Meets strict data requirements
  • No external dependencies for AI
  • Audit-friendly setup
  • Custom model training possible

Getting Help

  • Check Ollama documentation for model-specific issues
  • Review Buildable MCP setup guides
  • Join our Discord #ollama channel
  • Report integration bugs on GitHub