Models & Ollama, Day 3 of the Free Comprehensive OpenClaw Course
The Intelligence Layer
Why this matters
The model you wire OpenClaw to determines the agent's bill, its speed, and its actual personality. Most people default to whichever provider they already have an account with and pay for it later. This lesson walks through every provider OpenClaw supports, the price math for each, and the full Ollama setup for running an agent against a local Llama 3 with zero API cost.
Which AI model should I use with OpenClaw?
OpenClaw with Ollama is the headline of this lesson, but the bigger story is provider choice. The runtime is provider-agnostic: you swap from Anthropic Claude to OpenAI GPT to Google Gemini to local Ollama by changing one line in the .env. The agent code does not change, the prompts do not change; only the wire to the model changes.
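As a sketch of what that one-line swap looks like, here is a .env fragment. The OPENCLAW_PROVIDER and OPENCLAW_MODEL key names are assumptions for illustration; the OLLAMA_* keys are the ones this lesson uses in the setup section below.

```
# Cloud tier: Anthropic (key names illustrative, check your runtime's docs)
OPENCLAW_PROVIDER=anthropic
OPENCLAW_MODEL=claude-sonnet

# Local swap: comment out the two lines above and point the agent at Ollama
# OPENCLAW_PROVIDER=ollama
# OLLAMA_BASE_URL=http://localhost:11434
# OLLAMA_MODEL=llama3:70b
```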
The honest provider matrix as of today:
- Anthropic Claude. Best default for a serious agent. Claude Sonnet for the smart tier, Claude Haiku for the cheap routine tier. Strong tool use, strong instruction following, prompt caching that cuts repeated context cost by 90%. This is what most production OpenClaw deployments run on.
- OpenAI GPT. Strong general performer, deep tool ecosystem, slightly more expensive than Claude per equivalent quality. GPT-5.5 mini is the cheap tier, GPT-5.5 is the smart tier.
- Google Gemini. Cheapest of the cloud providers per token, strong at long-context tasks. Tool use is good but the ecosystem is slightly thinner than Anthropic or OpenAI.
- Ollama (local). Zero ongoing API cost, full data privacy, agent can run with no internet at all. Trade-off: smaller models, slower responses, weaker tool use. Llama 3 70B is the best practical choice if your machine can run it.
For most people the right answer is to start with Claude Sonnet for the first week to see what a good agent feels like, then move to a two-tier setup once you understand the cost shape. Day 4, openclaw cost optimization, covers the two-tier pattern in detail.
Cost per million tokens, all providers side by side
Real numbers, output token pricing, as of the time of writing. These shift every few months, so check the provider's pricing page before you commit, but the relative shape stays the same.
- Claude Sonnet: roughly $15 per million output tokens. Claude Haiku: roughly $1.25 per million.
- GPT-5.5: roughly $20 per million. GPT-5.5 mini: roughly $1.50 per million.
- Gemini 2.5 Pro: roughly $10 per million. Gemini 2.5 Flash: roughly $0.40 per million.
- Llama 3 70B on Ollama: $0 per token. Your only cost is the hardware and electricity.
The 12x to 16x gap between the smart tier and the cheap tier on the same provider is the entire reason two-tier routing works. A typical personal agent's prompt mix is 80 percent routine and 20 percent hard. Run all of it on the smart tier and you pay the full rate on every token. Run 80/20 across the two tiers and, on the Claude numbers above, output cost drops to 0.8 × $1.25 + 0.2 × $15 = $4 per million tokens versus $15, roughly a quarter of the all-Sonnet bill before caching, with no measurable quality loss on the routine work.
Input tokens are always cheaper than output tokens, usually by 4x to 5x. This matters because most agent prompts have a long input context (SOUL.md, MEMORY.md, AGENTS.md all re-sent on every prompt) and a short output. Prompt caching on Anthropic tilts the math further: a cached input token costs about 10 percent of a normal one, so a re-sent context is almost free. At Sonnet's rough $3 per million input tokens, a 10,000-token context re-sent 50 times a day costs about $1.50 a day uncached and about 15 cents cached.
When to use which model
The decision matrix that tracks how I actually pick:
- Claude Sonnet, when the agent needs to hold a sharp voice over a long conversation, when tool use must work the first time, when the agent is doing real reasoning rather than pattern-matching. The default smart tier for serious work.
- Claude Haiku, when the prompt is routine. "Summarize this email", "draft a polite decline", "is this calendar event worth attending". The default cheap tier paired with Sonnet.
- GPT-5.5, when the agent needs deep OpenAI ecosystem integration (Whisper, DALL-E, embeddings) or when the workspace already has GPT-tuned prompts. Otherwise Claude Sonnet wins on price-to-quality.
- Gemini Flash, when cost is the primary constraint and the work is mostly routine. The cheapest cloud option, fast, fine for a personal agent that does not need to hold a complex voice.
- Llama 3 70B on Ollama, when data privacy is the constraint, when the agent must run offline (rare, but it happens: ships, off-grid sites), or when you have the hardware sitting around already and want to amortize it.
- Llama 3 8B on Ollama, when you want to learn the local-model workflow without buying hardware. Runs on a laptop; the agent feels noticeably slower and dumber than the 70B, but it works.
How do I set up OpenClaw with Ollama?
OpenClaw with Ollama is the canonical local-model setup. The full walkthrough:
- Install Ollama from ollama.com. It runs on macOS, Linux, and Windows.
- Pull a model. `ollama pull llama3:8b` for the small model that runs on a laptop, `ollama pull llama3:70b` for the bigger model that needs at least 48 GB of RAM or a serious GPU.
- Confirm Ollama is running. Hit `http://localhost:11434` in a browser; you should see "Ollama is running".
- In your OpenClaw agent's .env, set `OLLAMA_BASE_URL=http://localhost:11434` and `OLLAMA_MODEL=llama3:8b`.
- Start the agent with `openclaw run`. Send a message. The first reply will be slower than cloud; that is normal, the model is loading into memory. Subsequent replies are faster. The whole sequence is condensed in the shell sketch below.
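The same five steps as a copy-paste sketch, assuming a POSIX shell and the .env key names above:

```
ollama pull llama3:8b                      # small model, fits on a laptop
curl http://localhost:11434                # should print "Ollama is running"
echo 'OLLAMA_BASE_URL=http://localhost:11434' >> .env
echo 'OLLAMA_MODEL=llama3:8b' >> .env
openclaw run                               # first reply is slow while the model loads
```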
The catch with local: tool use is weaker on Llama 3 than on Claude or GPT. If your agent needs a lot of tools, go hybrid: run routine chat on Llama and route the hard tool-use prompts to Claude Sonnet. The runtime supports per-prompt provider routing.
Hybrid routing config example
The hybrid pattern looks like this in your AGENTS.md:
provider_routing:
  default: ollama:llama3:70b
  rules:
    - if: prompt_complexity > 0.7
      use: anthropic:claude-sonnet
    - if: requires_tool_use and tool_count > 2
      use: anthropic:claude-sonnet
    - if: heartbeat_decision
      use: ollama:llama3:8b
The runtime evaluates the rules top to bottom on every prompt. The first match wins. The default catches anything no rule matched. With this config, routine chat runs on local Llama for free, hard prompts route to Claude Sonnet on the cloud, heartbeat decisions run on the smallest local model. The bill drops by 70 to 90 percent versus all-Sonnet, with the same actual capability for the hard prompts.
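If it helps to see the semantics spelled out, here is a minimal Python sketch of first-match evaluation. It is illustrative only: the field names and the complexity score are stand-ins, not OpenClaw's actual rule engine.

```python
def route(prompt: dict, rules: list, default: str) -> str:
    """Pick a provider:model string for a prompt, first match wins."""
    for rule in rules:              # evaluated top to bottom
        if rule["when"](prompt):    # first predicate that matches decides
            return rule["use"]      # later rules never run
    return default                  # nothing matched: fall through

# Hypothetical predicates mirroring the AGENTS.md rules above
rules = [
    {"when": lambda p: p["complexity"] > 0.7,
     "use": "anthropic:claude-sonnet"},
    {"when": lambda p: p["requires_tool_use"] and p["tool_count"] > 2,
     "use": "anthropic:claude-sonnet"},
    {"when": lambda p: p["is_heartbeat_decision"],
     "use": "ollama:llama3:8b"},
]

prompt = {"complexity": 0.9, "requires_tool_use": False,
          "tool_count": 0, "is_heartbeat_decision": False}
print(route(prompt, rules, default="ollama:llama3:70b"))
# -> anthropic:claude-sonnet (first rule matched, evaluation stopped)
```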
What is the cheapest way to run an OpenClaw agent?
Cheapest is Ollama on your own hardware: zero per-token cost. The honest number is roughly $0 a month if you ignore electricity, plus the one-time cost of a machine that can run an 8B or 70B model. A used Mac Mini M2 with 16 GB of RAM handles Llama 3 8B. A maxed Mac Studio or a workstation with two consumer GPUs handles Llama 3 70B.
Cheapest cloud is Gemini Flash at the time of writing, around 5 to 10 cents per million input tokens and roughly $0.40 per million output tokens. A lightly-used personal agent on Gemini Flash runs $1 to $3 a month.
The middle path most people land on is two-tier with Claude Haiku for routine work and Claude Sonnet for hard prompts. This typically lands at $5 to $15 a month for a personal agent that gets used several times a day. The math depends entirely on heartbeat frequency, which is the next lesson.
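To see where that band comes from, here is the back-of-envelope math. The usage volumes are invented for the example; only the per-million prices come from this lesson.

```python
# Rough monthly output-token bill for a two-tier personal agent
prompts_per_day = 40                  # assumed usage
tokens_per_reply = 1_500              # assumed average output length
monthly_tokens_m = prompts_per_day * tokens_per_reply * 30 / 1e6  # ~1.8M tokens

haiku, sonnet = 1.25, 15.0            # $ per million output tokens (from above)
bill = monthly_tokens_m * (0.8 * haiku + 0.2 * sonnet)
print(f"${bill:.2f}/month")           # ~$7.20, inside the $5-$15 band
```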
Common local-model pitfalls
Three things that bite people in the first week on Ollama. The first: tool-use drift. Llama 3 will sometimes hallucinate tool calls or miss the schema entirely; Sonnet and GPT solved this years ago, and the open-source models are still catching up. The mitigation: keep the tool count per prompt low (under three is safe), give the agent very clear examples in the prompt, and route hard tool-use prompts to a cloud model with the hybrid pattern above.
The second: context window confusion. Llama 3 8B has an 8k-token context window in the default Ollama config. That is small. A bloated MEMORY.md will silently truncate the prompt and the agent loses the thread. Bump the context with OLLAMA_NUM_CTX=32768 in the .env, and expect RAM use to spike correspondingly. The 70B model handles longer contexts much better.
The third: cold-start latency. Ollama unloads models from RAM after a few minutes of inactivity to free memory. The next request has to reload the model, which can take 10 to 30 seconds depending on model size. For a heartbeat-driven agent that fires every 15 minutes, every tick pays the reload tax. Set OLLAMA_KEEP_ALIVE=24h to pin the model in memory; the cost is RAM held continuously, the benefit is sub-second response on every tick.
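Both fixes are two lines in the .env, using the key names as this lesson does (OLLAMA_KEEP_ALIVE is also a standard Ollama setting; verify OLLAMA_NUM_CTX against your runtime's docs):

```
OLLAMA_NUM_CTX=32768    # bigger context window, avoids silent truncation, costs RAM
OLLAMA_KEEP_ALIVE=24h   # pin the model in memory, avoids the 10-30 s reload
```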
How this connects to your full agent
The model you pick today is rarely the model you run forever. Most people start on Claude Sonnet because the quality is sharp on day one, then move to a two-tier setup once they have read openclaw cost optimization on Day 4 and feel the bill. A few months in, some people move to a hybrid Ollama-plus-Claude-Sonnet setup once they have hardware to amortize.
The provider abstraction is one of the runtime's best decisions. You can change models by editing one line in the .env, restart, and the agent keeps its memory, voice, and channels. Try a model for a week, swap if it does not feel right, swap back. The cost of experimentation is the price of one restart.
The next lesson, openclaw cost optimization, walks the four levers that take a $70 a month deployment to $17. The model choice is one of those four. Heartbeat tuning, prompt caching, and prompt bloat are the other three.
Key takeaways
- OpenClaw is provider-agnostic: swap from Claude to GPT to Ollama by changing one .env line.
- Two-tier model setups (cheap routine tier, smart hard tier) cut bills by 60 to 80 percent with no quality loss on routine work.
- Ollama plus Llama 3 gives you a fully local agent with zero ongoing API spend.
- Use Claude Sonnet for the smart tier and Haiku or local Llama for the routine tier.
About the instructor. Adhiraj Hangal teaches this lesson. Founder of OpenClaw Consult and one of the few consultants whose code has been merged into openclaw/openclaw core. PR #76345 was reviewed and merged by project creator Peter Steinberger. Read the contribution log.
Need help shipping OpenClaw with Ollama in production?
OpenClaw Consult ships production-grade OpenClaw deployments for operators and founders. Founded by Adhiraj Hangal, a merged contributor to openclaw/openclaw core.
Hire an OpenClaw expert →