OpenClaw is model-agnostic. Today you learn every provider it supports, when to use each model, and how to run Llama 3 locally with zero API cost using Ollama.
- Why OpenClaw decouples the intelligence layer and what that means for you.
- Anthropic, OpenAI, Google, and Chinese models: capabilities and trade-offs.
- Install Ollama, pull Llama 3.2, connect it to OpenClaw. Full walkthrough.
- Pricing comparison, hardware guide, and hybrid local + cloud strategies.
- Which model for which task: a decision framework you can actually use.
- How to switch models in your config file. One change, no code.
- Raspberry Pi to Mac Studio: what runs where and at what speed.
- Get Ollama running, test a local model, compare it to your cloud model.
The active model lives in `~/.openclaw/openclaw.json` under `agents.defaults.model.primary`. OpenClaw's Gateway abstracts the model layer completely: the agent issues a request and receives a response. Which provider processes it is entirely configurable.
Switching from Claude to GPT to a local Llama model requires a single config change and a restart. No code changes.
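Pointing the agent at a different model is one edit to `agents.defaults.model.primary`. A minimal sketch of `~/.openclaw/openclaw.json` — the surrounding structure and the exact `provider/model` identifier format shown here are assumptions for illustration, not a documented schema:

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-opus-4.6"
      }
    }
  }
}
```

Change the `primary` value, restart the Gateway, and every subsequent request routes to the new model.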
Use different models for different tasks. Cheap model for heartbeat. Powerful model for analysis. Local model for sensitive data.
Pricing changes. APIs evolve. New models launch. Your agent infrastructure, memory, and Skills survive all of it.
Chinese models: DeepSeek V3.2, Kimi K2.5, GLM-5. Competitive performance at roughly 1/10th the cost of US cloud models.
OpenClaw's most community-tested integration. Current models as of March 2026:
Exceptional complex reasoning, precise instruction following, sophisticated writing. 1M token context window. Use for high-value tasks where quality justifies spend.
Strong quality without the premium pricing of Opus. Many users run Sonnet for interactive conversations and Haiku for automated tasks.
Fast, cost-effective, performs well on structured tasks, data extraction, summarization, and routine decision-making. Excellent for heartbeat cycles.
Released March 2026. Most capable and efficient frontier model. 33% fewer errors than GPT-5.2. Thinking and Pro variants available.
Released Aug 2025. Strong reasoning, reliable tool use, multimodal. Routes between fast and deep reasoning models automatically.
Strong general reasoning, reliable tool use, good code generation, 128K context. Multimodal via the vision Skill.
~16x cheaper than GPT-4o. Retains strong performance on structured tasks, summarization, and instruction following.
Released Feb 2026. Reasoning-first model optimized for complex agentic workflows and coding. 1M token context window. Adaptive thinking. SWE-bench 80.6%.
Pro-grade reasoning with Flash-level latency and cost efficiency. Built for speed. Also available: Gemini 2.5 Pro, Gemini 2.5 Flash for older integrations.
The Gemini integration in OpenClaw was added through community contributions. Available through direct API or Vertex AI for enterprise. Test thoroughly before production use.
Competitive reasoning quality at roughly 1/10th the cost of US cloud models.
| Model | Origin | SWE-bench | Strength |
|---|---|---|---|
| GLM-5 | Zhipu AI (China) | 77.8% | 744B MoE, MIT license, $1/$3.20 per 1M tokens (in/out) |
| Kimi K2.5 | Moonshot (China) | 76.8% | Strong reasoning, 1T params |
| DeepSeek V3.2 | DeepSeek (China) | 73.0% | Extreme cost efficiency |
GLM-5's 77.8% and Kimi K2.5's 76.8% on SWE-bench approach Claude Opus 4.6 (80.8%) and GPT-5.2 (80.0%). For many agent tasks the gap is negligible. GLM-5 API: $1/$3.20 per 1M tokens. US users should verify latency and compliance.
| Use Case | Recommended Model |
|---|---|
| Complex reasoning & analysis | GPT-5.4 or Claude Opus 4.6 |
| Heartbeat / background monitoring | GPT-4o Mini or Claude Haiku 4.5 |
| Privacy-sensitive tasks | Llama 3.2 8B or Mistral 7B (local) |
| Code generation & debugging | GPT-5.4 or Claude Opus 4.6 |
| Zero cost constraint | Llama 3.1 70B (local, high-end hardware) |
| Multilingual tasks | Gemini 3.1 Pro or Qwen 2.5 (local) |
Start with a single model for simplicity. Once stable, introduce model routing to optimize cost and quality across different task types.
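Once you do introduce routing, the decision table above reduces to a small lookup. A minimal sketch — the task labels and the routing function are illustrative, not OpenClaw's actual API:

```python
# Model routing sketch mirroring the decision table above.
# Task categories and model identifiers are assumptions for the example.
ROUTES = {
    "complex_reasoning": "claude-opus-4.6",
    "heartbeat": "gpt-4o-mini",
    "privacy_sensitive": "llama3.2",   # local via Ollama
    "code": "gpt-5.4",
    "multilingual": "gemini-3.1-pro",
}

def pick_model(task_type: str) -> str:
    """Return the model for a task type, falling back to a cheap default."""
    return ROUTES.get(task_type, "gpt-4o-mini")
```

For example, `pick_model("heartbeat")` returns the cheap model, while anything unrecognized also falls back to it — a deliberate bias toward low cost for unclassified background work.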
| Model | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 |
| Claude Sonnet 4.6 | $3.00 | $15.00 |
| Claude Haiku 4.5 | $1.00 | $5.00 |
| GPT-5.4 | $2.50 | $15.00 |
| GPT-4o | $2.50 | $10.00 |
| GPT-4o Mini | $0.15 | $0.60 |
| Local (Ollama) | $0 | $0 |
A power user reported burning through 180 million tokens in weeks after enabling aggressive heartbeat monitoring with an expensive model and accidentally creating a feedback loop. With frontier pricing, that was hundreds of dollars. Always set spending limits.
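A few lines of arithmetic against the pricing table is enough to sanity-check your exposure before enabling a heartbeat. A simple sketch (the 120M/60M input/output split is an assumed example, not the incident's actual breakdown):

```python
def token_cost(tokens_in: int, tokens_out: int,
               in_per_m: float, out_per_m: float) -> float:
    """API cost in dollars, given per-1M-token input/output prices."""
    return tokens_in / 1e6 * in_per_m + tokens_out / 1e6 * out_per_m

# 180M total tokens (say 120M in / 60M out) at Claude Haiku 4.5 rates:
monthly = token_cost(120_000_000, 60_000_000, 1.00, 5.00)
# 120 * $1 + 60 * $5 = $420 — and several times that at frontier rates
```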
Morning briefing heartbeat. Occasional tasks. GPT-4o Mini for most, GPT-4o for complex. Heartbeat every 60 min.
Active heartbeat monitoring. Regular interactive use. Claude Haiku 4.5 for heartbeat, Claude Opus 4.6 for complex work.
Mac Mini with 16GB RAM. Llama 3.2 for everything. Only cost is electricity: roughly $1–2/month.
OpenClaw itself is free and open source (MIT license). You only pay for the intelligence layer.
An open-source tool for running large language models locally. It presents a clean API compatible with OpenAI's API spec, so OpenClaw connects to it using the same interface it uses for cloud providers.
`ollama pull llama3.2` downloads a model. `ollama list` shows what you have. No manual GGUF downloads.
Built on llama.cpp. GPU acceleration on NVIDIA, AMD, and Apple Silicon. Often 2–3x faster than less optimized stacks.
Zero data leaves your machine. No internet required for inference. Your conversations stay on your hardware.
On macOS, Ollama installs as a menu bar application that manages the server lifecycle automatically.
Ollama runs as a background service on Windows. Same API, same models, same port.
Ollama starts an API server on `http://localhost:11434`. Verify it: `curl http://localhost:11434/api/tags` — you should see a JSON response.
If you see a response from `ollama run llama3.2`, Ollama is working. The API server is live on `localhost:11434`.
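You can also exercise the server programmatically. A sketch against Ollama's `/api/chat` endpoint, with the payload builder separated out so it can be inspected without a running server (the helper names here are ours, not Ollama's):

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> dict:
    """Payload for Ollama's /api/chat endpoint (non-streaming)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def chat(prompt: str, model: str = "llama3.2",
         host: str = "http://localhost:11434") -> str:
    """Send one chat turn. Requires a running Ollama server on `host`."""
    req = urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(build_chat_request(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]
```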
OpenClaw treats Ollama as just another LLM provider. One config change is all it takes.
Run `openclaw configure` and select Ollama as your provider. It writes the config for you. Restart the Gateway and test with a message.
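If you prefer to edit by hand, a sketch of what the resulting file might look like — the `providers.ollama.baseUrl` key and the `ollama/llama3.2` identifier format are assumptions for illustration, not OpenClaw's documented schema:

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/llama3.2"
      }
    }
  },
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434"
    }
  }
}
```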
| Model | Size | RAM | Best For |
|---|---|---|---|
| Llama 3.2 8B Instruct | ~5GB | 8GB | Balanced performance, good tool use |
| Mistral 7B Instruct v0.3 | ~4GB | 8GB | Fast responses, good instruction following |
| Qwen 2.5 14B Instruct | ~9GB | 16GB | Strong reasoning, excellent multilingual |
| Llama 3.1 70B Instruct | ~40GB | 64GB | Near-GPT-4 quality, high-end hardware |
| Phi-4 Mini (3.8B) | ~2.5GB | 4GB | Raspberry Pi and low-power devices |
Local models vary in their ability to generate well-formatted tool calls. Models with "-instruct" or "-chat" suffixes perform better. Llama 3.2 Instruct and Mistral 7B Instruct are community favorites for reliable tool use in OpenClaw.
| Hardware | Recommended Model | Speed |
|---|---|---|
| Raspberry Pi 5 (8GB) | Phi-4 Mini or Gemma 2 2B | 3–6 tokens/sec |
| Mac Mini M2 (8GB) | Llama 3.2 8B | 25–40 tokens/sec |
| Mac Mini M4 (16GB) | Qwen 2.5 14B | 20–35 tokens/sec |
| Mac Studio M4 (64GB) | Llama 3.1 70B | 15–25 tokens/sec |
| PC with RTX 4090 (24GB) | Llama 3.1 70B Q4 | 40–60 tokens/sec |
Apple Silicon Macs benefit from unified memory architecture. The GPU and CPU share the same memory pool, meaning an M4 Mac Mini with 24GB RAM can run a 20B parameter model with the GPU fully utilized — something impossible on a discrete GPU with only 12GB VRAM.
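The tokens/sec column translates directly into wall-clock expectations. A rough sketch that ignores prompt processing and model-load time:

```python
def response_seconds(output_tokens: int, tokens_per_sec: float) -> float:
    """Rough generation time; ignores prompt processing and model load."""
    return output_tokens / tokens_per_sec

# A ~500-token reply at the table's extremes:
pi = response_seconds(500, 4)     # Raspberry Pi 5: 125.0 seconds
gpu = response_seconds(500, 50)   # RTX 4090:       10.0 seconds
```

That gap is why low-power hardware is best reserved for short, structured outputs rather than long-form generation.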
When multiple quantization levels are available, Q5_K_M provides a good balance of quality and speed. Roughly Q8 quality at Q4 speed.
Local models run slower with larger context. For heartbeat tasks that don't need extensive history, set a smaller `num_ctx` to improve throughput.
Model loading takes 10–30 seconds. Set `OLLAMA_KEEP_ALIVE` to keep models loaded in memory between calls. Once loaded, responses are fast.
Close memory-intensive applications when running large models. More RAM for Ollama means less disk paging and dramatically faster inference.
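The context and keep-alive knobs above can also be set per request in Ollama's API body (`keep_alive` is additionally controllable globally via the `OLLAMA_KEEP_ALIVE` environment variable). A request-body fragment:

```json
{
  "model": "llama3.2",
  "keep_alive": "30m",
  "options": { "num_ctx": 2048 }
}
```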
Many experienced OpenClaw users settle on a hybrid approach. Use the right model for the right job.
Use a local model. Structured, repetitive tasks where an 8B model performs fine. Zero API cost for the most frequent token consumption.
Use a local model. Legal documents, health data, financial analysis. Route to local regardless of quality considerations.
Use a cloud model. When you need the best reasoning, nuanced writing, or complex code generation, route to GPT-5.4 or Claude Opus 4.6.
Most automated monitoring doesn't need frontier intelligence. Claude Opus 4.6 heartbeat: ~$15–30/month. Claude Haiku 4.5 for the same tasks: ~$3–5/month. GPT-4o Mini: even less.
Both OpenAI and Anthropic allow monthly caps. Set one at $5–$10 before your agent runs unattended. Protects you from runaway tasks.
Moving from 30-minute to 60-minute cycles cuts background API usage by 50%. For most monitoring, hourly checks are sufficient.
Long system prompts and bloated memory files increase every request's token count. Periodically prune memory and tighten your system prompt.
Three ways to change your model. All take effect without rebuilding anything.
1. `openclaw configure` — interactive menu. Select your new provider and model.
2. Edit `~/.openclaw/openclaw.json` directly. The Gateway watches the file and applies changes.
3. Dashboard — form UI or raw JSON editor. Changes hot-reload automatically.
Run the install command for your platform. Pull a model with `ollama pull llama3.2`. Run `ollama run llama3.2 "Hello"` and confirm a response.
Run `openclaw configure` or edit your config. Set Ollama as the provider. Send a message through the dashboard and confirm it works.
Ask your agent the same question with your cloud model and your local model. Notice the difference in speed, quality, and cost.
Go to your API provider's dashboard and set a monthly cap. $5–$10 is a good starting point. Do this before Day 4.
`ollama list` — verify your downloaded models. Try pulling one more model that fits your hardware. Experiment with `ollama run` to test it.
You understand the full model landscape. You have Ollama running locally. You know which model to use for which task and how much it costs.
Cost Optimization — Now that you know the models, we go deep on controlling what you spend. Two-tier processing, heartbeat tuning, and strategies that cut your bill by 70% or more.