🦞 OpenClaw Bootcamp
DAY 04 / 16
🦞
OpenClaw Bootcamp · Day 4

Cost Optimization
Make Your Agent Work for You, Not Against Your Wallet

The difference between a $70 monthly bill and a $17 one is not quality — it’s configuration. Same agent, same tasks, same results. Just smarter about where every token goes.

Two-Tier Models · Heartbeat Math · 70% Cost Reduction · Prompt Caching
What We Cover Today

Day 4 Agenda

⚖️

Two-Tier Processing

Cheap model for background tasks, premium model for conversations. The single biggest cost lever.

💓

Heartbeat Math

Where most of the money actually goes — and why most people don’t realize it until the bill arrives.

📜

Prompt Bloat

System prompts and memory silently inflate every API call. Learn to audit and prune them.

😎

Spending Limits

Guardrails before your agent runs unattended. The cautionary tale you don’t want to live through.

📈

Cost Monitoring

Dashboard Cost tab and CLI stats. You can’t optimize what you can’t measure.

Prompt Caching

90% discount on repeated system prompts. A powerful feature most people don’t know exists.

The 70% Checklist

Six concrete actions that cut your bill by over seventy percent. Do all of them today.

🎯

Homework

Configure dual-model, extend heartbeat, prune SOUL.md, check spending. Save money this week.

Today’s Goal

By the end you will have three things

01
A dual-model config where your agent automatically uses cheap models for routine work and premium models for real conversations
02
Specific numbers: exactly what you’re spending and where, tied to specific behaviors in your configuration
03
A 70%+ reduction in your monthly bill. Your heartbeat is probably running on a model that costs at least 5× more than it needs to (Opus at $5 per million input tokens vs Haiku at $1)
60-Second Recap

Day 3 Recap

Model Landscape You Learned
  • Claude Opus 4.6: $5 input / $25 output per million tokens
  • Claude Sonnet 4.6: $3 / $15
  • Claude Haiku 4.5: $1 / $5
  • GPT-4o Mini: $0.15 / $0.60
  • Ollama: $0
  • Pulled Llama 3.2 and ran a model locally on your hardware
Key Insight Carrying Forward
  • Model selection is the single largest variable in your monthly cost
  • You set a spending limit with your API provider
  • Today we operationalize that insight into real savings
The Problem

What an Unoptimized Deployment Actually Costs

Typical Unoptimized Setup
  • → Claude Opus 4.6 as the only model
  • → Heartbeat every 30 minutes = 48 cycles/day
  • → ~1,000 input tokens + 500 output tokens per cycle
  • → 20 interactive messages/day @ ~2,000 tokens each
The Arithmetic
  • Heartbeat input: 48 × 1k × $5/M = $0.24/day
  • Heartbeat output: 48 × 0.5k × $25/M = $0.60/day
  • Heartbeat total: $0.84/day → $25.20/month
  • Interactive use: ~$45–50/month additional
$70 per month, before you’ve optimized anything
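
To see where the $70 comes from, here is a minimal sketch of the arithmetic above. The 30-day month is an assumption, and the $45–50 interactive figure is taken from the slide as-is rather than derived.

```python
def heartbeat_monthly(cycles_per_day, input_tokens, output_tokens,
                      input_per_m, output_per_m, days=30):
    """Monthly heartbeat cost in USD; prices are per million tokens."""
    per_cycle = (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000
    return cycles_per_day * per_cycle * days

# Opus 4.6 at $5 input / $25 output, 48 cycles/day, 1,000 in + 500 out per cycle.
heartbeat = heartbeat_monthly(48, 1_000, 500, 5.00, 25.00)

# Interactive estimate quoted on the slide (not computed here).
interactive_low, interactive_high = 45.0, 50.0

print(f"heartbeat: ${heartbeat:.2f}/month")
print(f"total:     ${heartbeat + interactive_low:.2f} to "
      f"${heartbeat + interactive_high:.2f}/month")
```

Change the cycle count or the prices and the total moves with them, which is the whole point of the rest of today.
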
The Core Insight

Most of that $70 is going to background tasks that don’t need a frontier model. Your heartbeat is doing structured, repetitive checks. It doesn’t need Opus. That’s the problem. Now let’s fix it.

Foundations

What You’re Actually Paying For

Every API Call Has Two Costs
  • Input tokens: system prompt + memory context + conversation history + your message
  • Output tokens: the model’s response, priced at 4–5× the input rate for every paid model covered in this bootcamp
What Catches People

Your SOUL.md and memory context get sent with every single request. A 2,000-token SOUL.md plus 1,000 tokens of memory = 3,000 overhead tokens before the model even sees your question. At 48 heartbeat cycles/day that’s 144,000 overhead tokens daily that have nothing to do with the monitoring task.

Why This Matters for Agents

In a normal chatbot, the system prompt is a rounding error. In an agent running automated cycles every 30 minutes, it’s the main cost driver. Prompt efficiency is not a nice-to-have — it’s essential.
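
A quick sketch of how much of each heartbeat call is pure overhead. The 200-token task prompt is a hypothetical figure chosen for illustration; the SOUL.md and memory sizes come from the example above.

```python
def daily_overhead_tokens(soul_tokens, memory_tokens, calls_per_day):
    """Tokens sent per day that are not part of the actual task."""
    return (soul_tokens + memory_tokens) * calls_per_day

def overhead_share(soul_tokens, memory_tokens, task_tokens):
    """Fraction of one call's input that is system prompt plus memory."""
    overhead = soul_tokens + memory_tokens
    return overhead / (overhead + task_tokens)

overhead = daily_overhead_tokens(2_000, 1_000, 48)
share = overhead_share(2_000, 1_000, task_tokens=200)  # 200 is hypothetical

print(f"{overhead:,} overhead tokens/day")
print(f"{share:.0%} of each heartbeat's input is overhead")
```
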

Core Strategy

Two Models Instead of One

🌟 Secondary Model — Background

Claude Haiku 4.5 at $1/$5 or GPT-4o Mini at $0.15/$0.60. Handles heartbeat cycles, RSS scanning, server checks, calendar monitoring. More than enough capability for structured, repetitive tasks.

$0.15–$1.00 / million input tokens
👑 Primary Model — Interactive

Claude Opus 4.6 or Sonnet 4.6. When you’re actively talking to your agent, you get premium reasoning. The price is worth it for real conversations. Not for checking if your server is up.

$3–$5 / million input tokens
The Result

That $25.20/month heartbeat cost on Opus drops to $5.04/month on Haiku. Same checks. Same frequency. 80% savings. Add the interactive savings from Sonnet vs Opus (40% cheaper per token at $3/$15 vs $5/$25) and the total bill becomes a fraction of what you started with.

Configuration

Set Up Dual-Model in openclaw.json

// ~/.openclaw/openclaw.json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "claude-sonnet-4-6",
        "secondary": "claude-haiku-4-5"
      }
    }
  }
}

// Budget alternative for secondary:
//   "secondary": "gpt-4o-mini"
How Routing Works

OpenClaw automatically routes heartbeat cycles and background tasks to secondary, and direct messages from you to primary. No manual switching.

After This Change

Save the file and restart the Gateway. Every heartbeat cycle now uses your cheap model. Every direct message uses your premium one. One config change — immediate savings from this moment forward.
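
If it helps to picture the routing, here is a rough sketch of the idea. This is not OpenClaw's actual implementation: the task labels, the `pick_model` helper, and the fallback to primary are all assumptions made for illustration. The embedded config omits the `//` comments because Python's `json` module does not accept them.

```python
import json

CONFIG_TEXT = """
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "claude-sonnet-4-6",
        "secondary": "claude-haiku-4-5"
      }
    }
  }
}
"""

# Hypothetical task labels, used only in this sketch.
BACKGROUND_TASKS = {"heartbeat", "rss_scan", "server_check", "calendar_check"}

def pick_model(config, task_type):
    """Return the model name a two-tier setup would use for this task."""
    models = config["agents"]["defaults"]["model"]
    if task_type in BACKGROUND_TASKS:
        # Assumed behavior: fall back to primary if no secondary is set.
        return models.get("secondary", models["primary"])
    return models["primary"]

config = json.loads(CONFIG_TEXT)
print(pick_model(config, "heartbeat"))
print(pick_model(config, "direct_message"))
```
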

The Numbers

Heartbeat Math: Before vs After

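| Model     | Cycles/Day  | Input Cost/Day | Output Cost/Day | Total/Day | Total/Month |
|-----------|-------------|----------------|-----------------|-----------|-------------|
| Opus 4.6  | 48 (30-min) | $0.24          | $0.60           | $0.84     | $25.20      |
| Haiku 4.5 | 48 (30-min) | $0.048         | $0.12           | $0.168    | $5.04       |
| Haiku 4.5 | 24 (60-min) | $0.024         | $0.06           | $0.084    | $2.52       |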

Assumes 1,000 input tokens and 500 output tokens per cycle. Haiku at $1/$5 per million tokens.

Switching to Haiku saves 80% on heartbeat costs. Extending to 60-minute cycles saves another 50% on top of that. The quality difference for “check if my server is responding” is negligible. Haiku handles structured, repetitive monitoring perfectly.
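
The comparison can be reproduced with a few lines, which also makes it easy to plug in your own token counts. A sketch under the same assumptions as the table (1,000 input and 500 output tokens per cycle, 30-day month); the function name is just illustrative.

```python
# (input, output) price in USD per million tokens.
PRICES = {"Opus 4.6": (5.00, 25.00), "Haiku 4.5": (1.00, 5.00)}

def monthly_heartbeat_cost(model, interval_minutes,
                           input_tokens=1_000, output_tokens=500, days=30):
    """Monthly heartbeat cost in USD for a model and interval."""
    input_per_m, output_per_m = PRICES[model]
    cycles_per_day = 24 * 60 // interval_minutes
    per_cycle = (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000
    return cycles_per_day * per_cycle * days

baseline = monthly_heartbeat_cost("Opus 4.6", 30)
for model, interval in [("Opus 4.6", 30), ("Haiku 4.5", 30), ("Haiku 4.5", 60)]:
    cost = monthly_heartbeat_cost(model, interval)
    saving = 1 - cost / baseline
    print(f"{model:<10} every {interval}m: ${cost:6.2f}/month ({saving:.0%} below baseline)")
```

Stacking both changes takes the heartbeat from $25.20 to $2.52 a month, a 90% cut on that line item.
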

Configuration

Adjusting Your Heartbeat Interval

// ~/.openclaw/openclaw.json
{
  "agents": {
    "defaults": {
      "heartbeat": {
        "every": "60m"
      }
    }
  }
}

// Valid formats:
"every": "30m"   // 30 minutes (default)
"every": "60m"   // 60 minutes (recommended)
"every": "2h"    // 2 hours
Be Honest About What You Need

Server uptime? An hour between checks is fine. RSS feeds? News doesn’t go stale in 30 minutes. Calendar? Your meetings aren’t reshuffling every half hour. Most users run 30-minute heartbeats because that was the default — not because their use case demands it.

Recommendation

Start at 60 minutes. You still get timely monitoring and you’ve halved your heartbeat costs. Move to 30 minutes only if you find a specific task that genuinely needs it.
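
To sanity-check an interval before you commit to it, you can convert the setting into calls per day. This small parser is my own sketch and only understands the minute and hour formats shown above; it is not OpenClaw's parser and may not match every format OpenClaw accepts.

```python
import re

def cycles_per_day(every):
    """Convert an interval like "30m" or "2h" into heartbeat cycles per day."""
    match = re.fullmatch(r"(\d+)([mh])", every)
    if match is None:
        raise ValueError(f"unsupported interval: {every!r}")
    value, unit = int(match.group(1)), match.group(2)
    minutes = value * (60 if unit == "h" else 1)
    if minutes == 0:
        raise ValueError("interval must be greater than zero")
    return 24 * 60 / minutes

for every in ("30m", "60m", "2h"):
    print(f"{every:>4}: {cycles_per_day(every):.0f} cycles/day")
```
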

Hidden Cost Driver

Prompt Bloat: The Invisible Tax

The Problem

A SOUL.md that’s grown to 3,000 tokens over time means 3,000 tokens of input on every single call. At 48 heartbeat cycles/day that’s 144,000 overhead tokens daily. On Opus: $0.72/day or $21.60/month. Just from a bloated system prompt.

The Fix
  • → Remove instructions that are redundant or that the model already follows without being told
  • → Cut verbose explanations to concise directives
  • → Audit periodically — SOUL.md grows by default
  • → Every 100 tokens trimmed saves money on every call
Mental Model

Your SOUL.md is a tax on every API call. Keep it as lean as possible. The goal isn’t a short SOUL.md — it’s an efficient one. Every word should earn its token cost.
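
To put a price on that tax, here is a short sketch using the numbers from this slide (48 calls a day, Opus input at $5 per million, 30-day month). The function name is illustrative.

```python
def monthly_prompt_cost(prompt_tokens, calls_per_day, input_per_m, days=30):
    """Monthly cost in USD of sending prompt_tokens on every call."""
    return prompt_tokens * calls_per_day * input_per_m / 1_000_000 * days

full = monthly_prompt_cost(3_000, 48, 5.00)    # the bloated SOUL.md
per_100 = monthly_prompt_cost(100, 48, 5.00)   # value of trimming 100 tokens

print(f"3,000-token prompt on Opus: ${full:.2f}/month")
print(f"each 100 tokens trimmed:    ${per_100:.2f}/month")
```

At heartbeat volume alone, every 100 tokens you cut is worth about 72 cents a month on Opus; interactive calls add to that.
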

Memory Efficiency

Prune Your Agent’s Memory Context

OpenClaw’s memory accumulates over time by design. But old, outdated entries still cost tokens on every call. Run openclaw memory stats to see what your agent is carrying.

🔁

Remove Duplicates

Agent learned the same preference twice? Delete one. Memory doesn’t need redundancy.

📅

Delete Outdated Info

Old project status, past events, preferences you’ve changed. Gone from memory = gone from every call’s cost.

✂️

Condense Verbose Entries

“Prefers bullet-point format with technical detail and code examples” → “Prefers: bullets, technical, code examples.” Same info, fewer tokens.
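
A rough sketch of the three cleanup steps on a toy memory list. The list-of-strings format and the sample entries are hypothetical (OpenClaw's real memory store may look different), and the token estimate is a crude 4-characters-per-token heuristic, not a real tokenizer.

```python
def estimate_tokens(text):
    """Very rough token estimate (about 4 characters per token)."""
    return max(1, len(text) // 4)

def dedupe(entries):
    """Drop entries that match an earlier one, ignoring case and spacing."""
    seen = set()
    kept = []
    for entry in entries:
        key = " ".join(entry.lower().split())
        if key not in seen:
            seen.add(key)
            kept.append(entry)
    return kept

memory = [
    "Prefers bullet-point format with technical detail and code examples",
    "prefers bullet-point format with  technical detail and code examples",
    "Project Alpha status: waiting on review",  # hypothetical outdated entry
]

cleaned = dedupe(memory)                                         # remove duplicates
cleaned = [e for e in cleaned if not e.startswith("Project Alpha")]  # delete outdated
cleaned[0] = "Prefers: bullets, technical, code examples."       # condense

before = sum(estimate_tokens(e) for e in memory)
after = sum(estimate_tokens(e) for e in cleaned)
print(f"estimated tokens: {before} before, {after} after")
```
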

Guardrails

Set Spending Limits Before Your Agent Runs Unattended

Anthropic Console

Set a monthly usage limit at console.anthropic.com. When you hit it, API calls stop. Hard cap, no surprises. Start at $10–$20 while you’re learning. Raise it when you understand your actual patterns.

OpenAI Platform

Set a monthly spend threshold. OpenAI will email you when you cross it, but as of 2025 these limits are alert-only and do not hard-stop API usage. Start at $10–$15 and monitor closely.

Both providers offer per-project API keys with individual limits. Create separate keys if running multiple agents so one runaway process can’t drain everything.

⚠️ Cautionary Tale

Someone set up aggressive heartbeat monitoring with Opus, accidentally created a feedback loop in their task config, and burned through 180 million tokens in a few weeks. Hundreds of dollars. A $5 spending limit would have caught it in hours.

Set the limit now. Raise it later.
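
To show how quickly a cap stops a runaway loop, here is a client-side sketch. `SpendGuard` and `BudgetExceeded` are hypothetical names for this illustration; they are not part of OpenClaw or any provider SDK, and a guard like this complements the provider-side limit rather than replacing it.

```python
class BudgetExceeded(RuntimeError):
    """Raised when the next call would push spending past the limit."""

class SpendGuard:
    def __init__(self, limit_usd):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def record(self, input_tokens, output_tokens, input_per_m, output_per_m):
        """Add one call's estimated cost, or refuse if it exceeds the limit."""
        cost = (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000
        if self.spent_usd + cost > self.limit_usd:
            raise BudgetExceeded(f"limit of ${self.limit_usd:.2f} reached")
        self.spent_usd += cost
        return cost

guard = SpendGuard(limit_usd=5.00)
calls = 0
try:
    while True:  # simulated feedback loop of Opus heartbeat-sized calls
        guard.record(1_000, 500, 5.00, 25.00)
        calls += 1
except BudgetExceeded:
    print(f"stopped after {calls} calls, ${guard.spent_usd:.2f} spent")
```
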

Reference

Full Model Comparison

| Model                 | Input / M tokens | Output / M tokens | Best For                                              |
|-----------------------|------------------|-------------------|-------------------------------------------------------|
| Claude Opus 4.6       | $5.00            | $25.00            | High-value interactive tasks requiring deep reasoning |
| Claude Sonnet 4.6     | $3.00            | $15.00            | Daily interactive use, best default for conversations |
| Claude Haiku 4.5      | $1.00            | $5.00             | ★ Heartbeat & background automation                   |
| GPT-4o Mini           | $0.15            | $0.60             | Budget background tasks, cheaper than Haiku           |
| Gemini 2.5 Flash-Lite | $0.10            | $0.40             | High-volume automation, Google’s ultra-cheap option   |
| Ollama (local)        | $0.00            | $0.00             | Free forever, good for heartbeat if hardware allows   |

Llama 3.2 via Ollama comes in 1B and 3B text models plus 11B and 90B vision variants. The 3B (default with ollama pull llama3.2) is surprisingly capable for structured monitoring tasks.

Real Numbers

Before & After Optimization

Before — Unoptimized
  • Claude Opus 4.6 for everything
  • 30-minute heartbeat interval
  • Bloated SOUL.md and memory
  • $70.80 / month
After — Optimized
  • Haiku for heartbeat, Sonnet for chat
  • 60-minute heartbeat interval
  • Pruned prompts and memory
  • $17.00 / month
76% reduction. Same agent. Same tasks. Same quality where it actually matters.
Ongoing Visibility

Monitoring Your Spending

📈

Dashboard Cost Tab

Token usage broken down by model, task type, and time period. Check weekly. Look for patterns — is one heartbeat task consuming disproportionate tokens?

💻

CLI Stats

Run openclaw stats --this-month for a quick summary. Total tokens, estimated cost by provider, heartbeat vs interactive breakdown.

✈️

Telegram Alerts

Set up cost alerts so your agent messages you when you’ve hit 50% of your monthly budget. A mid-month warning beats a surprise bill every time.

Build the Habit

Check costs weekly for the first month. That’s when you catch misconfiguration and unexpected usage. After that, monthly is fine — you’ll know your patterns.
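
For the weekly check, a sketch of the 50% alert rule plus a simple straight-line projection to month end. The `budget_status` helper and the sample figures are hypothetical; feed it whatever month-to-date number your dashboard or CLI reports.

```python
def budget_status(spent_usd, monthly_budget_usd, day_of_month,
                  days_in_month=30, alert_fraction=0.5):
    """Summarize budget usage and project month-end spend linearly."""
    used_fraction = spent_usd / monthly_budget_usd
    projected_usd = spent_usd / day_of_month * days_in_month
    return {
        "used_fraction": used_fraction,
        "projected_usd": projected_usd,
        "alert": used_fraction >= alert_fraction,
    }

# Hypothetical example: $9.00 spent by day 12 against a $17 budget.
status = budget_status(spent_usd=9.00, monthly_budget_usd=17.00, day_of_month=12)
print(f"used {status['used_fraction']:.0%} of budget, "
      f"projected ${status['projected_usd']:.2f}, alert={status['alert']}")
```
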

Advanced Optimization

Prompt Caching: 90% Off Your System Prompt

Anthropic Caching
  • 90% discount on cached input tokens
  • Requires cache_control markers in prompts
  • 5-minute TTL (default) or 1-hour TTL option
  • Minimum token threshold applies (varies by model)
  • Cache writes: 1.25× base input (5-min TTL) or 2× base input (1-hour TTL)
OpenAI Caching
  • 50% discount on cached input tokens
  • Fully automatic — no markers or config needed
  • Kicks in at 1,024+ token prefix, in 128-token increments
  • Extended caching option available (24-hour duration)
Why It Matters for Agents

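Your system prompt is identical on every heartbeat cycle. That’s a perfect cache target. A 2,000-token system prompt sent on 48 heartbeat cycles a day is 96,000 input tokens; a 90% discount on those saves the equivalent of up to about 86,000 tokens daily, or roughly $0.43/day on Opus pricing before cache-write charges. One caveat: the cache only helps if it is still warm on the next call. A 5-minute TTL expires long before the next 30- or 60-minute heartbeat, so heartbeat caching depends on the 1-hour TTL, and a 60-minute interval sits right at the edge of that window.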

OpenClaw handles cache_control markers automatically when caching is enabled in your config.
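
Whether caching pays off depends on how the heartbeat interval compares with the cache TTL. The sketch below is a simplified model, not a billing calculator: it assumes a cache read costs 0.1× base input, that each hit refreshes the TTL, and that a call arriving after the TTL pays the write price again. The write multipliers used (1.25× for the 5-minute TTL, 2× for the 1-hour TTL) reflect my understanding of Anthropic's published pricing and should be checked against the current docs.

```python
def daily_system_prompt_cost(prompt_tokens, cycles_per_day, input_per_m,
                             ttl_minutes=None, write_multiplier=1.0,
                             read_multiplier=0.1):
    """Estimated daily USD cost of the system prompt across heartbeat calls."""
    base = prompt_tokens * input_per_m / 1_000_000  # one uncached send
    if ttl_minutes is None:
        return cycles_per_day * base  # caching disabled
    interval_minutes = 24 * 60 / cycles_per_day
    if interval_minutes >= ttl_minutes:
        # The cache expires before every call, so each call is a cache write.
        return cycles_per_day * base * write_multiplier
    # One write, then every later call is a read that keeps the cache warm.
    return base * write_multiplier + (cycles_per_day - 1) * base * read_multiplier

# 2,000-token prompt, 48 cycles/day (every 30 minutes), Opus input at $5/M.
uncached = daily_system_prompt_cost(2_000, 48, 5.00)
five_min = daily_system_prompt_cost(2_000, 48, 5.00, ttl_minutes=5, write_multiplier=1.25)
one_hour = daily_system_prompt_cost(2_000, 48, 5.00, ttl_minutes=60, write_multiplier=2.0)

print(f"no caching:  ${uncached:.3f}/day")
print(f"5-min TTL:   ${five_min:.3f}/day")
print(f"1-hour TTL:  ${one_hour:.3f}/day")
```

Under these assumptions the 5-minute TTL actually costs more than no caching at a 30-minute heartbeat, while the 1-hour TTL cuts the system prompt cost by roughly 85%.
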
Action Plan

The 70% Checklist

  • 01

    Add a Secondary Model

    Set agents.defaults.model.secondary to Haiku or GPT-4o Mini. Biggest single lever.

  • 02

    Extend Heartbeat to 60 Minutes

    Change agents.defaults.heartbeat.every from "30m" to "60m". Half the calls, half the cost.

  • 03

    Prune Your SOUL.md

    Cut to essential directives only. Every token you remove saves money on every single call.

  • 04

    Run Memory Stats and Clean Up

    openclaw memory stats — remove duplicates, condense verbose entries, delete outdated info.

  • 05

    Set Spending Limits

    $10–$20 on Anthropic and/or OpenAI dashboards. Set it now. Raise it when you know your patterns.

  • 06

    Check the Dashboard Cost Tab Weekly

    You can’t optimize what you don’t measure. Make it a habit for the first month.

Before Day 5

Day 4 Homework

  • 01

    Configure Dual-Model Setup

    Add secondary to your model config. Set primary to your premium model, secondary to Haiku or GPT-4o Mini. Restart the Gateway and verify both are being used via the Dashboard.

  • 02

    Change Heartbeat to 60 Minutes

    Update heartbeat.every to "60m". Confirm the new cycle timing in the Dashboard heartbeat tab.

  • 03

    Audit SOUL.md and Memory

    Read your SOUL.md and trim anything verbose or redundant. Run openclaw memory stats and prune old entries. Note how many tokens you saved.

  • 04

    Check Spending and Set a Limit

    Run openclaw stats --this-month. Set a spending limit on your API provider if you haven’t. Screenshot your current cost to compare after a week of optimized config.

Coming Up

Day 5: Channels
Your Agent, Everywhere

Your agent is running and optimized. Now it needs to live where you already are. On Day 5 we connect Telegram, WhatsApp, Discord, iMessage, Slack, and the built-in WebChat — and cover security, formatting, and which channel to reach for in every situation.

Telegram · WhatsApp · Discord · iMessage · Slack · WebChat · Security & Access Control