
Agentic Coding, Day 14 of the Free Comprehensive OpenClaw Course

Your Agent Writes Code

Taught by Adhiraj Hangal, founder of OpenClaw Consult and merged contributor to openclaw/openclaw core
11 min read

Why this matters

OpenClaw agentic coding is what happens when you point the agent at a real codebase, hand it the exec tool, and let it work. Done right, it ships features in the background while you do other things. Done badly, it deletes your repo. This lesson walks through the toolkit, the prompting patterns that make the agent useful instead of dangerous, build logs, planning workflows, and the sub-agent split that keeps a planner separate from an executor.

Can OpenClaw write production code?

Yes, OpenClaw can write production code. Yes, it can ship features in the background. Yes, it can also delete your repo if you let it. The difference is the pattern, not the model.

The core toolkit is four tools: read (file contents), write (overwrite a file), edit (precision change to a file), and exec (run a shell command). Every agentic coding workflow is built on these four. The model decides what to read, what to change, and which commands to run; the runtime executes the calls.
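
To make that surface concrete, here is a minimal sketch of the four calls as typed values. The ToolCall shape is an assumption for illustration, not the actual OpenClaw runtime's types or wire format.

    // Hypothetical tool-call shape; the real runtime's types may differ.
    type ToolCall =
      | { tool: "read"; path: string }
      | { tool: "write"; path: string; content: string }
      | { tool: "edit"; path: string; find: string; replace: string }
      | { tool: "exec"; command: string };

    // The model emits a sequence of calls like this; the runtime executes
    // each one and feeds the result back to the model.
    const calls: ToolCall[] = [
      { tool: "read", path: "package.json" },
      { tool: "edit", path: "src/routes.ts", find: "// register routes", replace: "// register routes\nregister(healthRoute);" },
      { tool: "exec", command: "npm test" },
    ];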

The honest production envelope: agentic coding works well for changes that are well-bounded ("add a route at /api/foo that returns the bar data"), test-covered (the agent can run the tests after each change), and reviewable (you read the diff before merging). It works badly for sprawling refactors, undocumented codebases, and changes that cannot be verified by tests. Match the pattern to the work.

How is OpenClaw different from Cursor or Copilot?

The shape difference is huge. Cursor and Copilot are interactive IDE tools: you stay in the loop on every keystroke. OpenClaw is an agent: it runs in the background and reports back when it is done.

The right comparison is not "which is the better autocomplete". The right comparison is "do I want to be in the loop". For typing code while thinking through it, IDE tools are better. For "ship the small feature I described overnight", an agent is better. For "review the last twenty PRs and flag the ones with bad test coverage", an agent is the only practical answer.

The other difference is environment. OpenClaw runs in your own runtime, with your own SOUL.md, your own MEMORY.md of the codebase's quirks, your own AGENTS.md saying which destructive commands are allowed. The personalization compounds: after a few weeks of using one agent against one codebase, the agent knows the codebase the way you do.

Cursor versus Claude Code versus OpenClaw, the real comparison

The three tools cover overlapping but distinct workflows. Cursor is an IDE: you sit in front of it and code, and the AI is your pair programmer. The friction is low, the loop is tight, the AI sees what you see. Best for "I am writing this myself with help".

Claude Code from Anthropic is a CLI agent: you give it a task in your terminal and it works in your repo. Lighter than OpenClaw, with no persistent runtime and no cross-session memory by default. Best for "I want to delegate this single coding task right now".

OpenClaw is the long-running version. Persistent memory of your codebase, a heartbeat that lets it work in the background while you do other things, sub-agent split for safety, the ability to wire it into your existing channels and tools. Best for "I want a coding agent that knows my codebase the way a senior engineer who has been here for a year would".

The honest answer for most people is to use all three. Cursor for live pair programming, Claude Code for one-off CLI tasks, OpenClaw for the agent that lives with you and your codebase across days and weeks. They are complementary, not competing.

What is the safest pattern for an OpenClaw coding agent?

Here is the pattern that has held up across every client deployment I have done.

Two agents, planner and executor. The planner has read-only access to the codebase: it reads the issue, reads the relevant files, and produces a written plan. The executor takes the plan and the write/exec tools and implements it. The executor cannot read anything outside its scoped working directory. If a plan is wrong, only the executor's bounded change goes wrong.
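
A sketch of the split in code, assuming a hypothetical spawnAgent helper; the real OpenClaw API differs, and the point is only the tool scoping.

    import { writeFileSync } from "node:fs";

    // Hypothetical harness call; illustrative, not the OpenClaw API.
    declare function spawnAgent(opts: {
      role: "planner" | "executor";
      tools: Array<"read" | "write" | "edit" | "exec">;
      workdir: string;
      prompt: string;
    }): Promise<{ output: string }>;

    async function shipIssue(issue: string, repo: string) {
      // Planner: read-only. Its plan comes back as output; the harness
      // writes it to plan.md for the executor to pick up.
      const plan = await spawnAgent({
        role: "planner",
        tools: ["read"],
        workdir: repo,
        prompt: `Read this issue and the relevant files, then produce a numbered plan:\n${issue}`,
      });
      writeFileSync(`${repo}/plan.md`, plan.output);

      // A human approves plan.md here before anything executes.

      // Executor: full toolkit, scoped to the repo working directory.
      return spawnAgent({
        role: "executor",
        tools: ["read", "write", "edit", "exec"],
        workdir: repo,
        prompt: "Implement plan.md step by step. Done means tests pass.",
      });
    }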

Sandbox the executor. Run it inside Docker (openclaw docker) with a writeable mount of just the repo, no access to your home directory, no access to your shell history, no access to your SSH keys. The exec tool inside the sandbox can do whatever it wants to the repo and nothing else.
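
A minimal sketch of that launch from Node, with an illustrative image name and flag set rather than the actual openclaw docker interface.

    import { execFileSync } from "node:child_process";

    // The repo is the only mount: no home directory, no shell history,
    // no SSH keys. The image name is hypothetical.
    execFileSync("docker", [
      "run", "--rm",
      "-v", `${process.cwd()}:/repo`,
      "-w", "/repo",
      "--network", "none", // drop the network unless the task needs it
      "openclaw-executor:latest",
    ], { stdio: "inherit" });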

Build log file. The agent writes everything it tried to a build-log.md inside the sandbox. What it ran, what failed, what it tried next. This is how you debug an autonomous coding pass that went wrong: you read the build log, find the wrong turn, fix the prompt, and try again.
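
A sketch of the log shape, using a hypothetical helper the executor's harness could call after every exec step.

    import { appendFileSync } from "node:fs";

    // Hypothetical helper: append one entry per step to build-log.md.
    function logStep(cmd: string, outcome: string, next?: string): void {
      const lines = [
        `## ${new Date().toISOString()}`,
        `ran: ${cmd}`,
        `outcome: ${outcome}`,
      ];
      if (next) lines.push(`next: ${next}`);
      appendFileSync("build-log.md", lines.join("\n") + "\n\n");
    }

    logStep("npm test", "2 failures in health.test.ts", "re-read the route handler");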

Tests must pass before merge. The agent's done condition is "tests pass". If tests do not pass, the agent loops on the failure; it does not produce a "done" output. You wire this with the exec tool calling your test runner.
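
A sketch of that loop, where askAgentToFix stands in for whatever call feeds the failure output back to the executor; it is hypothetical, not an OpenClaw API.

    import { spawnSync } from "node:child_process";

    // Hypothetical stand-in for sending failure output back to the agent.
    declare function askAgentToFix(failureOutput: string): Promise<void>;

    // Done means tests pass. Never report success on red tests; give up
    // after a bounded number of attempts instead of looping forever.
    async function runUntilGreen(maxAttempts = 5): Promise<boolean> {
      for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        const result = spawnSync("npm", ["test"], { encoding: "utf8" });
        if (result.status === 0) return true;
        await askAgentToFix(result.stdout + result.stderr);
      }
      return false;
    }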

Example workflow, end to end

A real workflow I run on a Next.js client codebase. The user files an issue: "Add a /api/health endpoint that returns 200 and the current commit SHA."

  1. The user pastes the issue into the agent's Slack channel.
  2. The planner agent reads the issue, reads the existing /api/* files for the project's conventions, reads the package.json, reads the test setup. Writes plan.md with a four-step plan: create the file, write the handler matching the project's style, add a test, run the test suite.
  3. The user reads plan.md, approves with "go".
  4. The executor agent reads plan.md, creates app/api/health/route.ts, adds the test in tests/api/health.test.ts, and runs npm test (a plausible version of both files is sketched after this list).
  5. Tests pass. The agent commits with a clear message, pushes a branch, opens a PR, posts the PR link to Slack with a one-line summary.
  6. The user reviews the PR diff, the test output is visible in the PR check, merges.
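
For concreteness, here is a plausible version of the two files the executor produces in step 4. Reading the commit SHA from an environment variable is an assumption; the real project may resolve it differently, and the test assumes a Jest-style runner with globals enabled.

    // app/api/health/route.ts
    import { NextResponse } from "next/server";

    export async function GET() {
      // Assumes COMMIT_SHA is injected at build or deploy time.
      return NextResponse.json(
        { status: "ok", commit: process.env.COMMIT_SHA ?? "unknown" },
        { status: 200 },
      );
    }

    // tests/api/health.test.ts
    import { GET } from "../../app/api/health/route";

    test("health endpoint returns 200 and a commit field", async () => {
      const res = await GET();
      expect(res.status).toBe(200);
      const body = await res.json();
      expect(body).toHaveProperty("commit");
    });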

Total time from "go" to merged PR: usually 5 to 15 minutes for a small change like this. The agent is working in the background while the user is doing other things. The compounding payoff is real: after a few weeks the agent has shipped maybe 30 small PRs the user would otherwise have done by hand.

Common agentic coding pitfalls

Three failure modes show up most often. The first: the agent thrashes on the wrong problem. The issue is ambiguous, the agent picks an interpretation, codes it, tests pass, but it solved a different problem than the user wanted. Mitigation: require the planner to write the plan in plain English first and ask the user "does this match what you meant" before the executor starts.

The second: tests pass but the code is wrong. The agent gamed the tests by mocking too aggressively or by testing the wrong thing. Mitigation: code review the diff; do not trust the green check alone. The agent's PR description should include a one-line "what changed" summary and a "what could go wrong" section.

The third: the agent commits secrets. The agent finds an API key in a comment, decides it is part of the code, and includes it in the commit. Mitigation: a pre-commit hook (gitleaks, trufflehog) that blocks any commit containing high-entropy strings. Run the hook in the executor's sandbox before allowing the push.
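
A sketch of the hook body, assuming the project wires its pre-commit hook through a Node script; gitleaks protect --staged is a real gitleaks mode, but the wiring shown here is an assumption.

    // scripts/check-secrets.ts, run as the pre-commit hook.
    import { spawnSync } from "node:child_process";

    // Scan staged changes; a non-zero exit means a finding, so fail the hook.
    const result = spawnSync("gitleaks", ["protect", "--staged", "-v"], {
      stdio: "inherit",
    });
    process.exit(result.status ?? 1);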

Planning workflow

The single biggest quality lever in agentic coding is whether the agent plans before it codes. A plan-first agent reads the issue, reads the relevant files, produces a step-by-step written plan, and only starts editing once the plan is laid out. A no-plan agent jumps straight to editing and usually thrashes.

The way to enforce planning is in AGENTS.md or in the planner sub-agent's prompt: "Before any write or exec call, you must produce a numbered plan of what you intend to do, written to plan.md. After the plan, ask the user (or the executor sub-agent) to acknowledge before proceeding."

This is also why the two-agent split works: it forces the plan to be written down in a file the executor reads, and the executor cannot start without it.

When to skip the coding agent and just write it yourself

Here is the honest list of when agentic coding is the wrong tool. Hot fixes during an incident: the agent is too slow. When you need the fix in 90 seconds, you write it yourself. The agent's planning loop costs minutes you do not have when production is down.

Anything where you genuinely want to think. The act of writing code is how engineers think through a problem. If the work is novel and you have not internalised the design, do not delegate. The agent will produce something plausible that you do not understand, and you will have to reason backwards from the result to figure out if it is right.

Code that touches money or user data without test coverage. The agent's "tests pass" done condition only works if the tests cover the actual risk. A change to billing logic in a codebase with weak tests should never go through an agent. Write it yourself, review it twice, ship it slowly.

Highly idiomatic code in an unusual style. The agent will produce conventional code in the language. If your codebase has unusual patterns (a custom DSL, an unusual architecture choice), the agent's output will fight the existing style. The cost of teaching the agent the style is higher than just writing it.

How this connects to your full agent

Agentic coding is the highest-stakes capability the agent can have. Get the safety pattern right (planner-executor split, sandbox, tests-must-pass) and the agent ships real PRs in the background. Get it wrong and the agent destroys your repo or commits secrets to the public. The pattern matters more than the model choice.

The right next read is openclaw multi-agent on day 15, which generalises the planner-executor split into the broader pattern of running a team of specialized agents. The dependencies are openclaw agents.md on day 9 (the gates that keep the executor scoped) and openclaw docker on day 12 (the sandbox that contains the executor's writes).

If you have not run a coding agent before, start small. One bug fix. One small endpoint. One test added. Build trust in the pattern before pointing it at a refactor that touches twenty files. The compound win comes from many small reliable PRs, not from one big risky one.

Key takeaways

  • Read, write, edit, and exec are the core toolkit; every agentic coding pattern is built on these four.
  • A build log file lets the agent track what it tried, what failed, and what to try next.
  • Split planning from execution into two sub-agents; the planner gets read-only access to the repo.
  • Always run the agent inside a sandbox or container when it has the write tool.

About the instructor. Adhiraj Hangal teaches this lesson. Founder of OpenClaw Consult and one of the few consultants whose code is merged in openclaw/openclaw core. PR #76345 was reviewed and merged by project creator Peter Steinberger.
