Voice assistant with ~300ms latency. Record your screen to report bugs or request features. AI sees, hears, and fixes.
🎬 Watch the demo — voice chat + screen recording in action
Every feature is designed for speed, privacy, and actually getting things done.
Native speech-to-speech via OpenAI's Realtime API. No Whisper → GPT → TTS pipeline. Just instant, natural conversation.
Voice AI is the fast front desk. Slack AI is the back office. Tasks get handed off seamlessly — you keep talking while work happens.
Install on any device. Conversations sync via SQLite. Start on your phone, continue on your laptop. Works offline.
Neon cyberpunk, clean light, and dark mode. Full visual customization because your AI should look as good as it sounds.
Paste a URL and Clawd reads it aloud with natural voice. Perfect for articles, docs, and content consumption on the go.
Send images directly in voice chat. Clawd sees and describes them, answers questions about what's in the image — all by voice.
Record your screen, narrate the bug, stop — AI extracts frames, transcribes your voice, and analyzes the issue with GPT-4o Vision. Fix suggestions in seconds.
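How frames get picked is not spelled out here, so treat the following as a minimal sketch rather than the project's actual method. It assumes uniform sampling: split the recording into equal slices and grab one frame from the middle of each. The name `frameTimestamps` and its parameters are hypothetical.

```javascript
// Hypothetical helper: choose evenly spaced timestamps (in seconds) at which
// to grab frames from a recording. Sampling the midpoint of each slice skips
// the very first and very last instants of the clip.
function frameTimestamps(durationSec, frameCount = 10) {
  if (!(durationSec > 0) || !Number.isInteger(frameCount) || frameCount < 1) {
    throw new RangeError("durationSec must be > 0 and frameCount a positive integer");
  }
  const slice = durationSec / frameCount;
  return Array.from({ length: frameCount }, (_, i) =>
    // Round to milliseconds so the values are easy to pass to a video tool.
    Number(((i + 0.5) * slice).toFixed(3))
  );
}
```

For a 60-second recording, `frameTimestamps(60)` yields ten timestamps from 3 s to 57 s, spaced 6 s apart. A real key-frame detector would look at scene changes instead; uniform sampling is just the simplest stand-in.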
Voice AI handles real-time conversation. Slack AI handles real work. They talk to each other so you don't have to wait.
Powered by OpenAI Realtime API. Handles conversation, answers questions instantly, and knows when to hand off tasks that need deeper work.
A full AI team in Slack. Runs code, searches the web, manages files, sends emails — real work that happens in the background while you keep talking.
┌──────────────────────────┐       ┌──────────────────────────┐
│  📱 Your Device          │       │  💬 Slack Workspace      │
│                          │       │                          │
│  PWA / Browser (any)     │       │  AI Agents (Clawd & co)  │
│  WebRTC Audio Stream     │       │  File ops, web, code     │
│  Push-to-talk / VAD      │       │  Email, calendar, etc.   │
└────────────┬─────────────┘       └────────────┬─────────────┘
             │ wss://                           │ Slack API
             ▼                                  ▼
┌─────────────────────────────────────────────────────────────┐
│  🖥️ Express Server                                          │
│                                                             │
│  WebSocket ↔ OpenAI Realtime API    Slack Bot Integration   │
│  Session Manager                    Task Queue & Results    │
│  SQLite (conversations)             Cloudflare Tunnel       │
└─────────────────────────────────────────────────────────────┘
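The handoff between the voice side and the Slack side can be pictured as a small task queue. The sketch below is an in-memory illustration under stated assumptions, not the project's code: the class `HandoffQueue` and its method names are hypothetical, and a real server would likely persist tasks and talk to Slack instead of holding everything in a `Map`.

```javascript
// Hypothetical in-memory handoff queue; all names are illustrative only.
class HandoffQueue {
  constructor() {
    this.nextId = 1;
    this.tasks = new Map(); // id -> { id, description, status, result }
  }

  // Voice side: register a task and return immediately with its id,
  // so the conversation never blocks on background work.
  enqueue(description) {
    const task = { id: this.nextId++, description, status: "pending", result: null };
    this.tasks.set(task.id, task);
    return task.id;
  }

  // Background side: claim the oldest pending task, or null if none.
  // Map iteration follows insertion order, so the oldest comes first.
  claimNext() {
    for (const task of this.tasks.values()) {
      if (task.status === "pending") {
        task.status = "running";
        return task;
      }
    }
    return null;
  }

  // Background side: record the outcome so the voice side can report it.
  complete(id, result) {
    const task = this.tasks.get(id);
    if (!task) throw new Error(`unknown task ${id}`);
    task.status = "done";
    task.result = result;
  }

  // Voice side: collect finished tasks and remove them from the queue.
  drainResults() {
    const done = [...this.tasks.values()].filter((t) => t.status === "done");
    for (const t of done) this.tasks.delete(t.id);
    return done;
  }
}
```

The point of the design is that `enqueue` returns right away: the voice session keeps talking, and finished work surfaces later through `drainResults`.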
From zero to voice-chatting with your AI in under 5 minutes.
Grab the source code from GitHub.
Just plain Node.js — no build tools, no bundlers.
Add your OpenAI API key and Slack bot token. Copy the example and fill in your keys.
Fire up Express and the WebSocket server. That's it.
Open your browser, allow mic access, and say hello. Clawd is listening. 🎙️
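Since the server needs both keys before it can do anything useful, a startup check can catch a half-filled config early. This is a minimal sketch: the variable names `OPENAI_API_KEY` and `SLACK_BOT_TOKEN` are assumptions (check the project's example file for the real ones), and `missingConfig` and `assertConfig` are hypothetical helpers.

```javascript
// Assumed variable names; the project's example env file is the authority.
const REQUIRED_VARS = ["OPENAI_API_KEY", "SLACK_BOT_TOKEN"];

// Return the names of required variables that are missing or blank.
function missingConfig(env = process.env, required = REQUIRED_VARS) {
  return required.filter((name) => !env[name] || env[name].trim() === "");
}

// Throw a readable error before the server starts if anything is missing.
function assertConfig(env = process.env) {
  const missing = missingConfig(env);
  if (missing.length > 0) {
    throw new Error(`Missing configuration: ${missing.join(", ")}`);
  }
}
```

Calling `assertConfig()` at the top of the server entry point would turn a confusing runtime failure into a one-line message naming the missing key.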
Show the bug instead of describing it. Record your screen, talk through the issue, and let GPT-4o Vision figure out what's wrong.
Tap the 🔴 button next to the attachment icon. Your browser will ask to share your screen. Mic audio is captured simultaneously.
Move your mouse, click through the UI, and talk. "See this button? When I click it, the drawer slides up too fast. I want it slower."
Tap ⏹️ to stop. The video uploads automatically, with no extra steps. Recordings are limited to 60 seconds and 50 MB.
The server extracts about 10 key frames and transcribes your narration with Whisper. GPT-4o Vision sees what you see, hears what you said, and speaks the fix back to you.
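The 60-second and 50 MB limits are easy to check before any upload happens. The sketch below shows one way to do it. `checkRecording` is a hypothetical helper, and reading "50MB" as 50 × 1024 × 1024 bytes is an assumption.

```javascript
const MAX_DURATION_SEC = 60;             // limit stated in the steps above
const MAX_SIZE_BYTES = 50 * 1024 * 1024; // assumes "50MB" means 50 MiB

// Hypothetical pre-upload check: returns { ok, problems } so the UI can
// show every reason a recording was rejected, not just the first one.
function checkRecording({ durationSec, sizeBytes }) {
  const problems = [];
  if (!(durationSec > 0)) {
    problems.push("recording is empty");
  }
  if (durationSec > MAX_DURATION_SEC) {
    problems.push(`recording is longer than ${MAX_DURATION_SEC} seconds`);
  }
  if (sizeBytes > MAX_SIZE_BYTES) {
    problems.push(`file is larger than ${MAX_SIZE_BYTES} bytes`);
  }
  return { ok: problems.length === 0, problems };
}
```

Running the same check on the server as well is a sensible precaution, since client-side limits can be bypassed.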
Tailscale creates a private network between your devices. No port forwarding, no public exposure — just install, join, and you're in.
Download Tailscale for your device. Available on macOS, Windows, Linux, iOS, and Android.
Create a free Tailscale account (Google, Microsoft, or GitHub sign-in). Then log in on the app you just installed.
I'll send you a Tailscale invite link. Click it to join the shared network (tailnet). Your device gets a private IP like 100.x.y.z.
Once connected to the tailnet, open the private URL in your browser. That's it — you're in. Works from any device on the network.
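Those 100.x.y.z addresses come from the 100.64.0.0/10 range that Tailscale assigns to devices. If you want the server to double-check that a request really arrived over the tailnet, a small address test is enough. This is a sketch only: `isTailnetIPv4` is a hypothetical helper, and nothing here says the project performs such a check.

```javascript
// True if `ip` is a dotted-quad IPv4 address inside 100.64.0.0/10,
// the address block Tailscale hands out to devices.
function isTailnetIPv4(ip) {
  const parts = String(ip).split(".");
  if (parts.length !== 4) return false;
  // Accept only plain decimal octets; anything else becomes NaN.
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return false;
  // A /10 prefix fixes the first octet at 100 and the top two bits of the
  // second octet at 01, which means the second octet is in 64..127.
  return octets[0] === 100 && octets[1] >= 64 && octets[1] <= 127;
}
```

For example, `isTailnetIPv4("100.101.102.103")` is true, while `isTailnetIPv4("100.63.0.1")` and `isTailnetIPv4("192.168.1.1")` are false. This covers IPv4 only; Tailscale also assigns IPv6 addresses, which the sketch ignores.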
No frameworks, no magic. Just solid, well-understood technology.
Open source. Self-hosted. Your voice, your data, your rules.
⭐ Star on GitHub