In This Article
Introduction
OpenClaw deployments occasionally hit snags. Here's what we're covering: the most common issues and how to resolve them. When in doubt, check logs first — they usually point to the cause. You'll see agent not responding, API errors, messaging platform issues, resource problems, and Skill failures with specific fix steps.
Whether you're debugging your first deployment or an experienced user hitting a new issue, you'll find systematic troubleshooting steps. We'll also cover the diagnostic checklist that narrows down problems quickly.
Agent Not Responding
Messaging: Verify bot token and allowed_user_ids. Ensure your user ID is in the list. Check firewall — outbound HTTPS to api.telegram.org or WhatsApp APIs must be allowed. Regenerate tokens if unsure.
Step-by-step. 1) Check logs: docker logs openclaw or journalctl -u openclaw. Look for "received message" or "processing." No log = message not reaching agent. 2) Verify token: Test token with curl. Telegram: curl "https://api.telegram.org/bot
Heartbeat: If scheduled tasks aren't running, check heartbeat config. Verify cron expression or interval. Ensure the process is running (Docker, systemd).
Common causes. Typo in token. User ID not in allowlist. Container stopped. Wrong channel ID (Slack). Webhook URL not reachable (firewall, wrong URL).
API Errors
Rate limits: Reduce request frequency. Use a smaller model. Implement backoff. Error: "429 Too Many Requests." Fix: Increase heartbeat interval. Use GPT-4o Mini instead of GPT-4o (fewer tokens). Add exponential backoff in config if supported.
Invalid key: Verify API key, check for typos, ensure key has not been rotated. Check provider dashboard for usage and limits. Error: "401 Unauthorized" or "Invalid API key." Fix: Copy key fresh from OpenAI/Anthropic dashboard. Check for trailing spaces. Verify key has not expired. Rotate if exposed.
Model not found: Model names change. Update config to current model IDs. Check provider documentation. Error: "404 model not found." Fix: OpenAI: gpt-4o, gpt-4o-mini (check current). Anthropic: claude-3-5-sonnet-20241022 (check current). Deprecation notices are sent — update before EOL.
Context length exceeded: Too much context in request. Error: "context_length_exceeded." Fix: Prune memory. Reduce conversation history. Use smaller model with larger context (e.g., gpt-4o has 128K). Split into smaller requests.
Real example. Team hit 429 repeatedly. Heartbeat was every 5 min, 20 tasks. 240 requests/hour. OpenAI limit: 500/min for tier 1. Solution: Batch tasks. Increase interval to 15 min. Switched some to GPT-4o Mini. Resolved.
Telegram/WhatsApp Issues
Telegram: Bot token from @BotFather. User ID from @userinfobot. allowed_user_ids must include your ID. WhatsApp: Business API setup is more complex; verify webhook configuration and phone number.
Telegram specifics. Token format: 123456:ABC-DEF... No spaces. User ID: numeric, e.g., 123456789. Get from @userinfobot. allowed_user_ids: ["123456789"] or [123456789] depending on config format. Group vs DM: For groups, bot needs to be added. Check group ID. Some configs use chat_id instead of user_id.
WhatsApp specifics. Business API requires Meta approval. Webhook URL must be HTTPS, publicly reachable. Verify token. Phone number must be registered. Test with WhatsApp's test number first. Common: webhook not receiving events (check URL, SSL cert), message format wrong (templates for proactive).
Slack/Discord. Similar principles. Token, channel ID, permissions. Check OAuth scopes. Bot must be invited to channel. Verify with "test" message.
High Memory/CPU
Aggressive heartbeat or runaway loops cause this. Increase heartbeat interval. Add circuit breakers. Check for Skills that spawn heavy processes. Consider resource limits in Docker.
Symptoms. Container OOM killed. Host sluggish. Cloud bill spike. Process using 100% CPU.
Diagnosis. docker stats openclaw. top or htop. Check which process. OpenClaw itself or child (Ollama, Skill).
Fixes. 1) Heartbeat: Every 1 min is aggressive. Try 15–30 min. 2) Ollama: Local models use RAM. 7B model ~4GB. 13B ~8GB. Right-size. 3) Skills: Shell Skill running heavy command? HTTP Skill hitting slow API? Add timeouts. 4) Docker: deploy: resources: limits: memory: 2G. Prevents runaway. 5) Loop: Agent in loop? Check logs for repeated similar actions. Add max iterations. Fix prompt.
Real example. Agent used 8GB RAM. Heartbeat every 5 min, each run loaded full memory (50MB). Memory leak in conversation history. Fix: Prune history. Limit to last 20 messages. Dropped to 2GB.
Skill Failures
Check Skill-specific logs. Verify credentials and permissions. Test Skill in isolation. Some Skills require network access — ensure Docker/network config allows it. Update Skills to latest versions.
HTTP Skill. 401: Bad credentials. 403: Forbidden. 404: Wrong URL. 500: Upstream issue. Fix: Verify API key, URL, headers. Test with curl. curl -H "Authorization: Bearer KEY" https://api.example.com/endpoint.
Database Skill. Connection refused: Wrong host, port, or firewall. Auth failed: Wrong credentials. Fix: Test connection from container. docker exec openclaw ping db-host. Verify network.
File Skill. Permission denied: File not readable by process. Not found: Wrong path. Fix: Check file permissions. Mount path correctly in Docker. Use absolute paths.
General. Skill timeout: Increase timeout. Skill error in logs: Read full error. Often credential or network. Update Skill: git pull, rebuild. Community Skills get fixes.
Heartbeat Not Running
Symptoms. Scheduled tasks never run. No daily digest. No pipeline check.
Diagnosis. 1) Is process running? docker ps, systemctl status. 2) Heartbeat config: Check config.yaml. Is heartbeat section present? Correct cron/interval? 3) Logs: Any "running heartbeat" or "heartbeat task" messages? 4) Timezone: Cron "0 9 * * *" = 9am server time. Is server TZ correct?
Fixes. Cron syntax: "0 9 * * *" = 9am daily. "*/15 * * * *" = every 15 min. Test: Use "*/5 * * * *" temporarily. See if it runs. Interval: Some configs use interval: 900 (seconds). Verify format. Process: If container restarts, heartbeat restarts. Check restart policy. Ensure container stays up.
Diagnostic Checklist
- □ Check logs. Last 50 lines. Any errors?
- □ Is process/container running?
- □ Verify API key. Test with curl.
- □ Verify messaging token. Test with provider API.
- □ Check allowed_user_ids. Is user ID correct?
- □ Check firewall. Outbound 443?
- □ Check heartbeat config. Cron/interval correct?
- □ Check Skill credentials. Test in isolation?
- □ Check resource usage. OOM? CPU spike?
- □ Recent config change? Revert and test.
Frequently Asked Questions
Where are OpenClaw logs? Depends on getting it running. Docker: docker logs openclaw. Systemd: journalctl -u openclaw. Default: stdout. Configure log file in config if needed.
How do I enable debug logging? Set LOG_LEVEL=debug or similar in environment. Check OpenClaw docs for exact variable. More verbose. Use temporarily for debugging.
Agent responds slowly. Why? LLM latency (1–5 sec typical). Large context (more tokens = slower). Cold start (first request after idle). Network latency to API. Consider: smaller model, prune context, keep-alive.
Can I test without messaging? Yes. Use OpenClaw's HTTP API or CLI if available. Or: run with mock/test mode. Check docs for testing options.
Agent gives wrong answers. How to fix? Improve system prompt. Add to memory. Check if relevant context is loaded. Try different model. Add examples (few-shot). Iterate.
How do I get help? OpenClaw Discord. GitHub issues. OpenClaw Consult for paid support. Include: logs (redact secrets), config (redact secrets), steps to reproduce.
Wrapping Up
Most issues are config or credential related. Systematic troubleshooting resolves them. Logs first. Then credentials. Then config. Then infrastructure. OpenClaw Consult provides support for complex deployments — we've debugged hundreds of production issues.