🦞 OpenClaw Bootcamp
DAY 03 / 16
🦞
OpenClaw Bootcamp · Day 3

Models & Ollama
The Intelligence Layer

OpenClaw is model-agnostic. Today you learn every provider it supports, when to use each model, and how to run Llama 3 locally with zero API cost using Ollama.

Cloud · Local · Hybrid
Step-by-Step Ollama Setup
Zero-Cost Option
What We Cover Today

Day 3 Agenda

🧠

Model-Agnostic

Why OpenClaw decouples the intelligence layer and what that means for you.

☁️

Cloud Providers

Anthropic, OpenAI, Google, and Chinese models. Capabilities and trade-offs.

💻

Ollama Setup

Install Ollama, pull Llama 3.2, connect it to OpenClaw. Full walkthrough.

💰

Cost & Strategy

Pricing comparison, hardware guide, and hybrid local + cloud strategies.

📊

Model Selection

Which model for which task. Decision framework you can actually use.

⚙️

Configuration

How to switch models in your config file. One change, no code.

🔨

Hardware Guide

Raspberry Pi to Mac Studio. What runs where and at what speed.

🎯

Homework

Get Ollama running, test a local model, compare it to your cloud model.

Today's Goal

By the end you will have three things

01
A clear understanding of every model provider OpenClaw supports
02
Ollama installed and a local model responding to your agent
03
A strategy for which model to use for which type of task
60-Second Recap

Day 2 Recap

What You Built
  • Node.js installed, OpenClaw running
  • Gateway on port 18789, dashboard open
  • First message sent, memory working
  • Telegram connected via BotFather
What Carries Forward
  • Config at ~/.openclaw/openclaw.json
  • Model set under agents.defaults.model.primary
  • You picked one model during onboarding
  • Today we explore every other option
Core Architecture

Why Model-Agnostic Matters

OpenClaw's Gateway abstracts the model layer completely. The agent issues a request and receives a response; which provider processes it is entirely configurable.

🔓

No Vendor Lock-In

Switching from Claude to GPT to a local Llama model requires a single config change and a restart. No code changes.

📈

Intelligent Routing

Use different models for different tasks. Cheap model for heartbeat. Powerful model for analysis. Local model for sensitive data.

🛡️

Future-Proof

Pricing changes. APIs evolve. New models launch. Your agent infrastructure, memory, and Skills survive all of it.

Provider Overview

The Model Landscape

☁️
Anthropic
Opus 4.6, Sonnet 4.6, Haiku 4.5
🤖
OpenAI
GPT-5.4, GPT-4o, GPT-4o Mini
💡
Google
Gemini 3.1 Pro, Gemini 3 Flash
💻
Local (Ollama)
100+ models, zero API cost
Plus

Chinese models: DeepSeek V3.2, Kimi K2.5, GLM-5. Competitive performance at roughly 1/10th the cost of US cloud models.

Cloud Providers

Anthropic Claude

OpenClaw's most community-tested integration. Current models as of March 2026:

🧠
Most Capable

Claude Opus 4.6

Exceptional complex reasoning, precise instruction following, sophisticated writing. 1M token context window. Use for high-value tasks where quality justifies spend.

$5 / $25 per 1M tokens
Balanced

Claude Sonnet 4.6

Strong quality without the premium pricing of Opus. Many users run Sonnet for interactive conversations and Haiku for automated tasks.

$3 / $15 per 1M tokens
💫
Fast & Cheap

Claude Haiku 4.5

Fast, cost-effective, performs well on structured tasks, data extraction, summarization, and routine decision-making. Excellent for heartbeat cycles.

$1 / $5 per 1M tokens
Cloud Providers

OpenAI GPT

🌟
Latest

GPT-5.4

Released March 2026. Most capable and efficient frontier model. 33% fewer errors than GPT-5.2. Thinking and Pro variants available.

$2.50 / $15.00 per 1M tokens
🧠
Established

GPT-5

Released Aug 2025. Strong reasoning, reliable tool use, multimodal. Routes between fast and deep reasoning models automatically.

$1.25 / $10.00 per 1M tokens
💪
Workhorse

GPT-4o

Strong general reasoning, reliable tool use, good code generation, 128K context. Multimodal via the vision Skill.

$2.50 / $10.00 per 1M tokens
Budget

GPT-4o Mini

~16x cheaper than GPT-4o. Retains strong performance on structured tasks, summarization, and instruction following.

$0.15 / $0.60 per 1M tokens
Cloud Providers

Google Gemini

💎
Frontier

Gemini 3.1 Pro

Released Feb 2026. Reasoning-first model optimized for complex agentic workflows and coding. 1M token context window. Adaptive thinking. SWE-bench 80.6%.

Fast & Capable

Gemini 3 Flash

Pro-grade reasoning with Flash-level latency and cost efficiency. Built for speed. Also available: Gemini 2.5 Pro, Gemini 2.5 Flash for older integrations.

Note

The Gemini integration in OpenClaw was added through community contributions. Available through direct API or Vertex AI for enterprise. Test thoroughly before production use.

Cloud Providers

Chinese Models

Competitive reasoning quality at roughly 1/10th the cost of US cloud models.

Model         | Origin           | SWE-bench | Strength
GLM-5         | Zhipu AI (China) | 77.8%     | 744B MoE, MIT license, $1/$3.20
Kimi K2.5     | Moonshot (China) | 76.8%     | Strong reasoning, 1T params
DeepSeek V3.2 | DeepSeek (China) | 73.0%     | Extreme cost efficiency
Context

GLM-5's 77.8% and Kimi K2.5's 76.8% on SWE-bench approach Claude Opus 4.6 (80.8%) and GPT-5.2 (80.0%). For many agent tasks the gap is negligible. GLM-5 API: $1/$3.20 per 1M tokens. US users should verify latency and compliance.

Decision Framework

Choosing the Right Model

Use Case                          | Recommended Model
Complex reasoning & analysis      | GPT-5.4 or Claude Opus 4.6
Heartbeat / background monitoring | GPT-4o Mini or Claude Haiku 4.5
Privacy-sensitive tasks           | Llama 3.2 8B or Mistral 7B (local)
Code generation & debugging       | GPT-5.4 or Claude Opus 4.6
Zero cost constraint              | Llama 3.1 70B (local, high-end hardware)
Multilingual tasks                | Gemini 3.1 Pro or Qwen 2.5 (local)

Start with a single model for simplicity. Once stable, introduce model routing to optimize cost and quality across different task types.

Cost Comparison

API Pricing

Model             | Input / 1M tokens | Output / 1M tokens
Claude Opus 4.6   | $5.00             | $25.00
Claude Sonnet 4.6 | $3.00             | $15.00
Claude Haiku 4.5  | $1.00             | $5.00
GPT-5.4           | $2.50             | $15.00
GPT-4o            | $2.50             | $10.00
GPT-4o Mini       | $0.15             | $0.60
Local (Ollama)    | $0                | $0
Cautionary Tale

A power user reported burning through 180 million tokens in weeks after enabling aggressive heartbeat monitoring with an expensive model and accidentally creating a feedback loop. With frontier pricing, that was hundreds of dollars. Always set spending limits.
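As a sanity check on the table, here is a back-of-envelope heartbeat estimate. The per-cycle token counts are assumptions for illustration, not measured values; substitute your own usage:

```shell
# Illustrative heartbeat cost estimate -- token counts are assumptions.
# Assume: one heartbeat every 30 min, ~3,000 input + 300 output tokens per cycle.
CYCLES=$((48 * 30))              # 48 cycles/day * 30 days = 1440 cycles/month
IN_TOKENS=$((CYCLES * 3000))     # 4,320,000 input tokens/month
OUT_TOKENS=$((CYCLES * 300))     # 432,000 output tokens/month

# Claude Haiku 4.5 pricing from the table above: $1 in / $5 out per 1M tokens
awk -v i="$IN_TOKENS" -v o="$OUT_TOKENS" \
  'BEGIN { printf "Haiku heartbeat: ~$%.2f/month\n", i/1e6*1 + o/1e6*5 }'
# Prints: Haiku heartbeat: ~$6.48/month
```

Swap in Opus pricing ($5/$25) and the same usage comes to ~$32.40/month: the heartbeat model choice dominates the bill.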

What You Will Actually Pay

Real-World Cost Examples

Light User
$5–15
/month

Morning briefing heartbeat. Occasional tasks. GPT-4o Mini for most, GPT-4o for complex. Heartbeat every 60 min.

Power User
$30–60
/month

Active heartbeat monitoring. Regular interactive use. Claude Haiku 4.5 for heartbeat, Claude Opus 4.6 for complex work.

Full Local (Ollama)
$0
API cost

Mac Mini with 16GB RAM. Llama 3.2 for everything. Only cost is electricity: roughly $1–2/month.

OpenClaw itself is free and open source (MIT license). You only pay for the intelligence layer.

Local Models

What Is Ollama

An open-source tool for running large language models locally. It presents a clean API compatible with OpenAI's API spec, so OpenClaw connects to it using the same interface it uses for cloud providers.

📦

Simple Model Management

ollama pull llama3.2 downloads a model. ollama list shows what you have. No manual GGUF downloads.

Optimized Performance

Built on llama.cpp. GPU acceleration on NVIDIA, AMD, and Apple Silicon. Often 2–3x faster than less optimized stacks.

🔒

Complete Privacy

Zero data leaves your machine. No internet required for inference. Your conversations stay on your hardware.
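The compatibility claim above is easy to check by hand: Ollama serves an OpenAI-style chat endpoint alongside its native API. A minimal sketch, assuming Ollama is running on the default port with llama3.2 already pulled:

```shell
# OpenAI-compatible chat completion against a local Ollama server
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2",
    "messages": [{"role": "user", "content": "Say hello in five words."}]
  }'
```

Because the request shape matches OpenAI's, any client built for the OpenAI API — OpenClaw included — can point at localhost instead of the cloud.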

Step 1

Installing Ollama

macOS & Linux
# One-command install
curl -fsSL https://ollama.com/install.sh | sh

# Verify installation
ollama --version

On macOS, Ollama installs as a menu bar application that manages the server lifecycle automatically.

Windows
# Download installer from ollama.com
# Or use winget
winget install Ollama.Ollama

Ollama runs as a background service on Windows. Same API, same models, same port.

After Install

Ollama starts an API server on http://localhost:11434. Verify it: curl http://localhost:11434/api/tags — you should see a JSON response.

Step 2

Pull Your First Model

# Download Llama 3.2 8B (~5GB)
ollama pull llama3.2

# Test it interactively
ollama run llama3.2 "Hello, are you working?"

# Optionally grab more models
ollama pull mistral:7b-instruct
ollama pull phi4-mini:latest

# See everything you have
ollama list

If you see a response from ollama run llama3.2, Ollama is working. The API server is live on localhost:11434.
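OpenClaw talks to this same server programmatically. You can hit the native endpoint yourself to see the raw request/response shape; the prompt here is just an example, and the server must be running with llama3.2 pulled:

```shell
# Native Ollama API call -- single non-streaming response
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Summarize in one sentence: what is a local LLM?",
  "stream": false
}'
```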

Step 3

Configure OpenClaw for Ollama

OpenClaw treats Ollama as just another LLM provider. One config change is all it takes.

// ~/.openclaw/openclaw.json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/llama3.2",
        "base_url": "http://localhost:11434"
      }
    }
  }
}
Or use the CLI

Run openclaw configure and select Ollama as your provider. It writes the config for you. Restart the Gateway and test with a message.

Tested & Recommended

Local Model Recommendations

Model                    | Size   | RAM  | Best For
Llama 3.2 8B Instruct    | ~5GB   | 8GB  | Balanced performance, good tool use
Mistral 7B Instruct v0.3 | ~4GB   | 8GB  | Fast responses, good instruction following
Qwen 2.5 14B Instruct    | ~9GB   | 16GB | Strong reasoning, excellent multilingual
Llama 3.1 70B Instruct   | ~40GB  | 64GB | Near-GPT-4 quality, high-end hardware
Phi-4 Mini (3.8B)        | ~2.5GB | 4GB  | Raspberry Pi and low-power devices
Tool Use Matters

Local models vary in their ability to generate well-formatted tool calls. Models with "-instruct" or "-chat" suffixes perform better. Llama 3.2 Instruct and Mistral 7B Instruct are community favorites for reliable tool use in OpenClaw.

What Runs Where

Hardware Guide

Hardware                | Recommended Model        | Speed
Raspberry Pi 5 (8GB)    | Phi-4 Mini or Gemma 2 2B | 3–6 tokens/sec
Mac Mini M2 (8GB)       | Llama 3.2 8B             | 25–40 tokens/sec
Mac Mini M4 (16GB)      | Qwen 2.5 14B             | 20–35 tokens/sec
Mac Studio M4 (64GB)    | Llama 3.1 70B            | 15–25 tokens/sec
PC with RTX 4090 (24GB) | Llama 3.1 70B Q4         | 40–60 tokens/sec
Apple Silicon Advantage

Apple Silicon Macs benefit from unified memory architecture. The GPU and CPU share the same memory pool, meaning an M4 Mac Mini with 24GB RAM can run a 20B parameter model with the GPU fully utilized — something impossible on a discrete GPU with only 12GB VRAM.
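A rough rule of thumb behind these RAM figures: quantized model memory ≈ parameter count × bytes per weight, plus roughly 20% overhead for KV cache and runtime. The 0.5 bytes/weight for Q4 and the 20% overhead are approximations, not exact values:

```shell
# Rough memory estimate for a quantized model (approximation only).
# Q4 ~ 0.5 bytes/weight, Q8 ~ 1.0, FP16 ~ 2.0; +20% for KV cache/overhead.
estimate() {
  awk -v p="$1" -v bpw="$2" 'BEGIN { printf "%.0f GB\n", p * bpw * 1.2 }'
}
estimate 70 0.5   # Llama 3.1 70B at Q4 -> ~42 GB, consistent with the table
estimate 8  0.5   # Llama 3.2 8B at Q4  -> ~5 GB
```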

Get More From Local Models

Performance Tips

  • 01

    Use Q5_K_M Quantization

    When multiple quantization levels are available, Q5_K_M provides a good balance of quality and speed. Roughly Q8 quality at Q4 speed.

  • 02

    Limit Context Window Size

    Local models run slower with larger context. For heartbeat tasks that don't need extensive history, set a smaller num_ctx to improve throughput.

  • 03

    Keep Ollama Running Continuously

    Model loading takes 10–30 seconds. Set OLLAMA_KEEP_ALIVE to keep models loaded in memory between calls. Once loaded, responses are fast.

  • 04

    Reserve System RAM

    Close memory-intensive applications when running large models. More RAM for Ollama means less disk paging and dramatically faster inference.
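Tips 02 and 03 translate directly into an environment variable and an API option. A sketch — OLLAMA_KEEP_ALIVE and num_ctx are standard Ollama knobs, but the values here are illustrative:

```shell
# Tip 03: keep models loaded between calls (duration string: 30m, 1h, etc.)
export OLLAMA_KEEP_ALIVE=1h

# Tip 02: shrink the context window for a lightweight heartbeat call
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Any alerts in the last hour? Answer yes or no.",
  "options": { "num_ctx": 2048 },
  "stream": false
}'
```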

Best of Both Worlds

Hybrid: Local + Cloud

Many experienced OpenClaw users settle on a hybrid approach. Use the right model for the right job.

🕒

Heartbeat Tasks

Use a local model. Structured, repetitive tasks where an 8B model performs fine. Zero API cost for the most frequent token consumption.

🔒

Sensitive Data

Use a local model. Legal documents, health data, financial analysis. Route to local regardless of quality considerations.

🧠

Complex Interactive

Use a cloud model. When you need the best reasoning, nuanced writing, or complex code generation, route to GPT-5.4 or Claude Opus 4.6.

Keep Costs Predictable

Cost-Saving Strategies

  • 01

    Use Cheap Models for Heartbeat

    Most automated monitoring doesn't need frontier intelligence. Claude Opus 4.6 heartbeat: ~$15–30/month. Claude Haiku 4.5 for the same tasks: ~$3–5/month. GPT-4o Mini: even less.

  • 02

    Set API Spending Limits Immediately

    Both OpenAI and Anthropic allow monthly caps. Set one at $5–$10 before your agent runs unattended. Protects you from runaway tasks.

  • 03

    Extend the Heartbeat Interval

Moving from a 30-minute to a 60-minute cycle cuts background API usage in half. For most monitoring, hourly checks are sufficient.

  • 04

    Optimize Prompt Efficiency

    Long system prompts and bloated memory files increase every request's token count. Periodically prune memory and tighten your system prompt.

One Change, No Code

Switching Models

Three ways to change your model. All take effect without rebuilding anything.

CLI
openclaw configure

Interactive menu. Select your new provider and model.

Config File
// openclaw.json
"primary": "ollama/llama3.2"

Edit directly. Gateway watches the file and applies changes.

Dashboard
openclaw dashboard
# Config tab → Model

Form UI or raw JSON editor. Changes hot-reload automatically.

Before Day 4

Day 3 Homework

  • 01

    Install Ollama and Pull Llama 3.2

    Run the install command for your platform. Pull llama3.2. Run ollama run llama3.2 "Hello" and confirm a response.

  • 02

    Switch OpenClaw to Ollama

    Run openclaw configure or edit your config. Set Ollama as the provider. Send a message through the dashboard and confirm it works.

  • 03

    Compare Cloud vs Local

    Ask your agent the same question with your cloud model and your local model. Notice the difference in speed, quality, and cost.

  • 04

    Set a Spending Limit

    Go to your API provider's dashboard and set a monthly cap. $5–$10 is a good starting point. Do this before Day 4.

  • 05

    Check ollama list

    Verify your downloaded models. Try pulling one more model that fits your hardware. Experiment with ollama run to test it.

🦞
Next Up

Your agent now runs on any model

You understand the full model landscape. You have Ollama running locally. You know which model to use for which task and how much it costs.

100+
Local models via Ollama
$0
with local models
1
Config change to switch
Day 4 Preview

Cost Optimization — Now that you know the models, we go deep on controlling what you spend. Two-tier processing, heartbeat tuning, and strategies that cut your bill by 70% or more.