Introduction
ClawdTalk lets users call a phone number and speak directly to their OpenClaw agent. It supports hands-free calendar updates, Jira ticket checks while driving, and quick information lookups, all by voice. The agent uses speech-to-text (STT) for input and text-to-speech (TTS) for responses, with the same memory and skills as the messaging interface.
How It Works
Flow:
- The user calls a dedicated phone number (e.g., one provisioned through Telnyx or Twilio)
- The VoIP provider receives the call and streams the audio to OpenClaw
- OpenClaw transcribes the audio with Whisper or a similar STT engine
- The agent processes the transcript as normal (memory, tools, LLM)
- The response is synthesized via TTS and streamed back to the caller
The agent has full context: it knows who is calling (if configured), remembers recent conversations, and can perform any skill (calendar, Jira, email) that the messaging agent can.
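The flow above can be sketched as a single per-turn pipeline. This is a minimal illustration and not OpenClaw's actual implementation: the class name `VoiceTurnPipeline` and the three callables are hypothetical, and the stand-in STT, agent, and TTS functions exist only so the sketch runs without any external service.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class VoiceTurnPipeline:
    """One caller turn: audio in -> transcript -> agent reply -> audio out."""
    transcribe: Callable[[bytes], str]     # STT step, e.g. a Whisper wrapper (assumed)
    run_agent: Callable[[str, str], str]   # (caller_id, text) -> reply text (assumed)
    synthesize: Callable[[str], bytes]     # TTS step (assumed)
    history: List[Tuple[str, str, str]] = field(default_factory=list)

    def handle_turn(self, caller_id: str, audio: bytes) -> bytes:
        text = self.transcribe(audio)               # speech-to-text
        reply = self.run_agent(caller_id, text)     # same agent logic as messaging
        self.history.append((caller_id, text, reply))
        return self.synthesize(reply)               # text-to-speech


# Stand-in components so the sketch is runnable on its own.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_agent(caller_id: str, text: str) -> str:
    return f"Heard from {caller_id}: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoiceTurnPipeline(fake_stt, fake_agent, fake_tts)
```

Because the agent step only sees text plus a caller identifier, the same agent function could in principle serve both the messaging and voice channels, which is the point the article makes about shared memory and skills.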
Use Cases
- Driving: "Add meeting with John at 3 PM tomorrow" — no need to pull over
- Hands-full: Cooking, exercising — voice is the only interface
- Accessibility: Users who prefer or require voice interaction
- Quick checks: "What's my next meeting?" "Any urgent emails?"
Integration
ClawdTalk typically uses Telnyx or Twilio for VoIP. A webhook receives the incoming-call event, and OpenClaw connects to the call via WebRTC or a similar media transport. Audio streams in both directions, and the session ends when the user hangs up.
Security: Authenticate the caller (e.g., with a caller ID whitelist or a PIN). Don't expose unauthenticated voice access; it carries the same risks as an unauthenticated Gateway.
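As a rough sketch of the webhook step, the function below handles a Twilio-style incoming-call request. It assumes the provider posts a form-encoded body with a `From` field and accepts a TwiML response, where `<Connect><Stream>` opens a media stream and `<Reject/>` refuses the call. The whitelist entry and the `wss://` endpoint are hypothetical placeholders, and other providers use different payload formats, so treat this as one possible shape and not the ClawdTalk skill's real handler.

```python
from urllib.parse import parse_qs
from xml.sax.saxutils import quoteattr

ALLOWED_CALLERS = {"+15550100"}                       # hypothetical whitelist
STREAM_URL = "wss://openclaw.example.com/clawdtalk"   # hypothetical media endpoint

XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>'


def handle_incoming_call(form_body: str) -> str:
    """Return a TwiML document for an incoming-call webhook request body."""
    params = parse_qs(form_body)
    caller = params.get("From", [""])[0]

    # Unknown or missing caller ID: refuse the call before any audio flows.
    if caller not in ALLOWED_CALLERS:
        return f"{XML_HEADER}<Response><Reject/></Response>"

    # Known caller: ask the provider to stream call audio to the agent.
    return (
        f"{XML_HEADER}<Response><Connect>"
        f"<Stream url={quoteattr(STREAM_URL)}/>"
        f"</Connect></Response>"
    )
```

For example, `handle_incoming_call("From=%2B15550100&CallSid=CA123")` returns a response containing the `<Stream>` element, while a body from any other number yields `<Reject/>`. A real deployment would also verify the provider's webhook signature before trusting the `From` value.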
Setup
- Create a Telnyx or Twilio account and purchase a phone number
- Configure the webhook to point to the OpenClaw ClawdTalk skill
- Choose STT/TTS: Whisper + ElevenLabs, or provider-native services
- Budget for cost: ~$0.01/min for voice, plus API costs for STT, LLM, and TTS
The setup is straightforward if you're already running OpenClaw. The ClawdTalk skill connects your existing agent to the phone network. You're not building a new agent — you're adding a new channel. The same memory, skills, and personality work over voice. See voice agent for the full architecture.
Security Considerations
Voice access is powerful. Anyone who has your ClawdTalk number can call and potentially interact with your agent. Mitigations:
- Caller ID whitelist: only allow known numbers. Caller ID can be spoofed, so pair the whitelist with a second factor.
- PIN or passphrase: require the caller to say a code before the agent responds.
- Rate limiting: cap how often a number can call to prevent abuse.
- Call logging: record every call for an audit trail.
The same principles as Gateway authentication apply. Don't expose unauthenticated access.
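The four mitigations can be combined into one gate that runs before the agent answers. The sketch below is illustrative only: the whitelist, the PIN, the limits, and the logger name are all hypothetical values, the PIN is assumed to arrive already transcribed to digits, and state is kept in memory where a real deployment would likely use a shared store.

```python
import hmac
import logging
import time
from collections import defaultdict, deque
from typing import Optional

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("clawdtalk.calls")   # hypothetical logger name

ALLOWED_CALLERS = {"+15550100"}   # hypothetical whitelist
EXPECTED_PIN = "4821"             # hypothetical; load from a secret store in practice
MAX_CALLS = 3                     # attempts allowed per caller per window
WINDOW_SECONDS = 600.0

_recent_attempts = defaultdict(deque)   # caller -> timestamps of recent attempts


def authorize_call(caller: str, spoken_pin: str, now: Optional[float] = None) -> bool:
    """Apply whitelist, rate limit, and PIN check; log every decision."""
    now = time.monotonic() if now is None else now

    if caller not in ALLOWED_CALLERS:
        log.warning("call rejected caller=%s reason=not whitelisted", caller)
        return False

    # Sliding window: drop attempts older than the window, then record this one.
    attempts = _recent_attempts[caller]
    while attempts and now - attempts[0] > WINDOW_SECONDS:
        attempts.popleft()
    attempts.append(now)
    if len(attempts) > MAX_CALLS:
        log.warning("call rejected caller=%s reason=rate limit", caller)
        return False

    # Constant-time comparison avoids leaking how much of the PIN matched.
    if not hmac.compare_digest(spoken_pin.encode(), EXPECTED_PIN.encode()):
        log.warning("call rejected caller=%s reason=wrong PIN", caller)
        return False

    log.info("call accepted caller=%s", caller)
    return True
```

Failed PIN attempts count toward the rate limit here, which is a deliberate choice in this sketch so that repeated guessing is throttled along with repeated calling.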
Cost Breakdown
Approximate rates:
- Telnyx/Twilio (voice): ~$0.01/min
- Whisper (STT): ~$0.006/min
- ElevenLabs (TTS): ~$0.02–0.05 per 1K characters
- LLM: depends on model and token count
A 5-minute call might cost $0.50–1.50 total. For occasional use, that is negligible. For heavy users, consider caching common responses or using a cheaper TTS provider. See cost pricing for the full breakdown.
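To make the arithmetic concrete, here is a small estimator built from the approximate rates quoted in this section. The function name and the example inputs (2,000 synthesized characters and $0.40 of LLM usage for a 5-minute call) are assumptions chosen for illustration, not measured figures.

```python
from typing import Tuple

VOICE_PER_MIN = 0.01             # Telnyx/Twilio voice, approximate
STT_PER_MIN = 0.006              # Whisper, approximate
TTS_PER_1K_CHARS = (0.02, 0.05)  # ElevenLabs low/high, approximate


def estimate_call_cost(minutes: float, tts_chars: int, llm_cost: float) -> Tuple[float, float]:
    """Return a (low, high) cost estimate in dollars for one call."""
    # Per-minute charges plus whatever the LLM usage is assumed to cost.
    base = minutes * (VOICE_PER_MIN + STT_PER_MIN) + llm_cost
    # TTS is billed per 1K characters of synthesized reply text.
    tts_low = tts_chars / 1000 * TTS_PER_1K_CHARS[0]
    tts_high = tts_chars / 1000 * TTS_PER_1K_CHARS[1]
    return round(base + tts_low, 4), round(base + tts_high, 4)


# Hypothetical 5-minute call: 2,000 TTS characters and $0.40 of LLM usage.
low, high = estimate_call_cost(minutes=5, tts_chars=2000, llm_cost=0.40)
```

Under these assumptions the estimate comes to roughly $0.52 to $0.58, of which voice, STT, and TTS account for about $0.12 to $0.18. In this example the LLM share dominates, so model choice and token count are the main levers on total cost.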
Wrapping Up
ClawdTalk extends OpenClaw to voice: the same agent with a new interface. It is especially valuable for hands-free and accessibility use cases. See OpenClaw voice agent for technical details and personal assistant for setup patterns.