A keyboard-less and mouse-less multiplayer whiteboard that uses webcam gestures and voice control for a fully hands-free collaborative drawing experience.
Magic Canvas is a real-time collaborative whiteboard that reimagines how we interact with digital canvases. Instead of using a keyboard and mouse, users control the canvas entirely through:
- Hand Gestures (via webcam): Draw, erase, and select areas using natural hand movements detected by your webcam
- Voice Commands: Change pen colors, adjust brush sizes, and generate AI images using conversational voice control
- AI Image Generation: Select any area on the canvas and generate abstract artwork that seamlessly integrates into your drawing
- Real-time Multiplayer: Multiple users can join a room and see each other's cursors, drawings, and gestures in real-time
- Gesture Controls:
  - ☝️ Pointing Up: Activates pen tool and draws
  - ✋ Open Palm: Activates eraser tool
  - ✌️ Victory: Creates a selection area for AI generation
  - 👍 Thumbs Up: Activates voice assistant
  - 👎 Thumbs Down: Deactivates voice assistant
- Voice Assistant: Natural language control for:
  - Changing pen colors ("make the pen blue")
  - Adjusting brush size ("make the brush thicker")
  - Generating images in selected areas ("create an abstract painting here")
- Smart Cursors: Each user's cursor dynamically shows:
  - Their username and current gesture emoji
  - Triangle cursor (colored by current pen color)
  - Circle cursor (white, when eraser is active)
- Freehand Drawing: Smooth, real-time drawing with adjustable brush sizes and colors
- AI Image Generation: Select any canvas area and generate artwork using Fal.ai's image-to-image model
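The gesture table above can be sketched as a mapping from MediaPipe GestureRecognizer category names (the canned model uses labels such as `Pointing_Up`, `Open_Palm`, `Victory`, `Thumb_Up`, `Thumb_Down`) to whiteboard tools. The `Tool` type and `toolForGesture` helper below are illustrative names, not the app's actual API:

```typescript
// Map MediaPipe GestureRecognizer category names to whiteboard tools.
// The category strings match MediaPipe's canned gesture model; Tool and
// toolForGesture are hypothetical names for illustration only.
type Tool =
  | { kind: "pen" }
  | { kind: "eraser" }
  | { kind: "select" }
  | { kind: "voice"; active: boolean }
  | { kind: "none" };

function toolForGesture(category: string): Tool {
  switch (category) {
    case "Pointing_Up": return { kind: "pen" };                  // ☝️ draw
    case "Open_Palm":   return { kind: "eraser" };               // ✋ erase
    case "Victory":     return { kind: "select" };               // ✌️ AI selection
    case "Thumb_Up":    return { kind: "voice", active: true };  // 👍 voice on
    case "Thumb_Down":  return { kind: "voice", active: false }; // 👎 voice off
    default:            return { kind: "none" };                 // unrecognized
  }
}
```

A discriminated union like this keeps the per-frame gesture handler to a single exhaustive switch instead of scattered boolean flags.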
- Next.js 14 (App Router) - React framework for the web application
- TypeScript - Type-safe development
- Tailwind CSS - Utility-first styling
- Supabase Realtime - WebSocket-based real-time presence and broadcasts
  - Presence API for multi-cursor tracking
  - Broadcasts for drawing strokes, gestures, and tool changes
- Supabase Storage - Image hosting for AI-generated artwork
- Google MediaPipe (`@mediapipe/tasks-vision`) - Hand gesture recognition
  - HandLandmarker for fingertip tracking
  - GestureRecognizer for detecting 9+ hand gestures
- ElevenLabs Conversational AI (`@elevenlabs/client`) - Voice assistant with custom client tools
  - Tool Calling: Custom client tools that execute in the browser
    - `change_pen_color` - Parses natural language color names and updates pen color
    - `change_brush_size` - Adjusts brush thickness based on voice commands
    - `generate_image` - Triggers AI image generation with user-provided prompts
- Fal.ai (`@fal-ai/client`) - Fast AI image generation
  - Model: `fal-ai/nano-banana/edit` for image-to-image generation
- HTML Canvas API - High-performance 2D drawing with:
  - Real-time stroke rendering
  - Compositing operations for eraser functionality
  - DPR-aware scaling for crisp visuals
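The two Canvas techniques above (DPR-aware scaling and compositing-based erasing) can be sketched as follows. `backingSize`, `setupCanvas`, and `erase` are illustrative helper names rather than the app's real API, and the DOM-touching parts assume a browser environment:

```typescript
// Pure helper: compute the backing-store size for a CSS size at a given
// device pixel ratio (DPR).
function backingSize(cssWidth: number, cssHeight: number, dpr: number) {
  return { width: Math.round(cssWidth * dpr), height: Math.round(cssHeight * dpr) };
}

// DPR-aware setup (browser only): oversize the bitmap, then scale the
// context so drawing code can keep working in CSS pixels.
function setupCanvas(canvas: HTMLCanvasElement): CanvasRenderingContext2D {
  const dpr = window.devicePixelRatio || 1;
  const rect = canvas.getBoundingClientRect();
  const { width, height } = backingSize(rect.width, rect.height, dpr);
  canvas.width = width;
  canvas.height = height;
  const ctx = canvas.getContext("2d")!;
  ctx.scale(dpr, dpr); // 1 drawing unit == 1 CSS pixel
  return ctx;
}

// Eraser via compositing: "destination-out" removes existing pixels
// under the stroke instead of painting a background color over them.
function erase(ctx: CanvasRenderingContext2D, x: number, y: number, size: number) {
  ctx.save();
  ctx.globalCompositeOperation = "destination-out";
  ctx.beginPath();
  ctx.arc(x, y, size / 2, 0, Math.PI * 2);
  ctx.fill();
  ctx.restore();
}
```

`destination-out` is what makes the eraser work on a transparent canvas layered over other content, where painting white would not.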
- Create a Room: Users create a shareable room with a unique URL
- Join with Avatar: Select an avatar and name to enter the room
- Enable Magic Mode: Activate hand gesture and voice control
- Draw & Collaborate:
  - Use hand gestures to draw and erase
  - Voice commands to customize colors and brush size
  - Select areas with the Victory gesture to generate AI artwork
- Real-time Sync: All actions are broadcast to other users in the room via Supabase Realtime
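The sync step above can be sketched against supabase-js v2's channel API. This is a minimal sketch: the `stroke` event name and payload shape are illustrative rather than the app's actual protocol, and `RealtimeChannelLike` is a pared-down structural stand-in for supabase-js's `RealtimeChannel` so the sketch stays self-contained:

```typescript
// Pared-down structural stand-in for supabase-js's RealtimeChannel.
// In the app, the real channel would come from supabase.channel(`room:${slug}`).
type RealtimeChannelLike = {
  on(type: string, filter: { event: string }, cb: (msg: any) => void): RealtimeChannelLike;
  subscribe(cb?: (status: string) => void): RealtimeChannelLike;
  send(msg: { type: "broadcast"; event: string; payload: unknown }): void;
};

// Build the broadcast message for one in-progress stroke segment.
// Streaming segments (rather than whole strokes) is what lets peers
// watch drawing happen live.
function strokeMessage(points: [number, number][], color: string, size: number) {
  return {
    type: "broadcast" as const,
    event: "stroke",
    payload: { points, color, size },
  };
}

// Wire up a room channel: render peers' incoming segments, expose a sender.
function joinRoom(channel: RealtimeChannelLike, onStroke: (payload: unknown) => void) {
  channel
    .on("broadcast", { event: "stroke" }, ({ payload }) => onStroke(payload))
    .subscribe();
  return {
    sendSegment: (pts: [number, number][], color: string, size: number) =>
      channel.send(strokeMessage(pts, color, size)),
  };
}
```

Presence (the multi-cursor list) uses the same channel via `channel.track(...)` and the `presence`/`sync` event in supabase-js, omitted here for brevity.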
- Node.js 18+ installed
- A webcam (for gesture control)
- A microphone (for voice control)
```bash
git clone <your-repo-url>
cd Magic-Canvas
npm install
```
- Create a new project at supabase.com
- Go to Project Settings → API and copy:
  - Project URL
  - `anon` public key
  - `service_role` key (keep this secret!)
- In the SQL Editor, run the schema from `supabase.sql`:
```sql
create table public.rooms (
  id uuid primary key default gen_random_uuid(),
  slug text unique not null,
  created_at timestamptz default now()
);

alter table public.rooms enable row level security;

create policy "Allow public read" on public.rooms
  for select using (true);

create policy "Allow service role insert" on public.rooms
  for insert with check (true);
```
- Create a Storage bucket:
  - Go to Storage → Create bucket
  - Name: `whiteboard-images`
  - Make it Public
- Sign up at elevenlabs.io
- Create a Conversational AI Agent:
  - Go to Conversational AI → Create Agent
  - Name your agent (e.g., "Canvas Assistant")
  - Configure the agent with a helpful system prompt
- Add Client Tools to your agent (see `ELEVENLABS_SETUP.md` for detailed JSON schemas):
  - `change_pen_color` - Tool to change pen color
  - `change_brush_size` - Tool to adjust brush size
  - `generate_image` - Tool to generate AI images
- Copy your:
  - API Key (from Settings → API Keys)
  - Agent ID (from the agent's settings)
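The browser-side handlers behind those three tools can be sketched as below. This is a minimal sketch: the parameter names (`color`, `size`, `prompt`) must match whatever schemas you configure on the agent, the `NAMED_COLORS` table and clamp range are invented for illustration, and `penColor`/`brushSize` stand in for the app's real state setters. In the app, an object like `clientTools` would be passed to `Conversation.startSession` from `@elevenlabs/client`:

```typescript
// Hypothetical local state standing in for the app's real setters.
let penColor = "#000000";
let brushSize = 4;

// Illustrative natural-language color table for change_pen_color.
const NAMED_COLORS: Record<string, string> = {
  red: "#ef4444", blue: "#3b82f6", green: "#22c55e", black: "#000000",
};

// Client tools execute in the browser; their string return values are
// spoken back to the user by the agent.
const clientTools = {
  change_pen_color: async ({ color }: { color: string }) => {
    const hex = NAMED_COLORS[color.trim().toLowerCase()];
    if (!hex) return `Unknown color: ${color}`;
    penColor = hex;
    return `Pen color set to ${color}`;
  },
  change_brush_size: async ({ size }: { size: number }) => {
    brushSize = Math.max(1, Math.min(50, size)); // clamp to a sane range
    return `Brush size set to ${brushSize}`;
  },
  generate_image: async ({ prompt }: { prompt: string }) => {
    // In the app this would send the prompt plus the selected canvas
    // region to the Fal.ai generation route.
    return `Generating: ${prompt}`;
  },
};
```

The tool names here must match the names registered on the agent exactly, or the agent will call tools that the client never handles.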
Create a `.env.local` file:

```bash
# Supabase
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=your_anon_key_here
SUPABASE_SERVICE_ROLE_KEY=your_service_role_key_here

# Fal.ai
FAL_KEY=your_fal_api_key_here

# ElevenLabs
ELEVENLABS_API_KEY=your_elevenlabs_api_key_here
ELEVENLABS_AGENT_ID=your_agent_id_here
```

Run the dev server:

```bash
npm run dev
```

Open http://localhost:3000 in your browser.
- Create a Room: Click "Create Room" on the homepage
- Set Up Your Profile: Choose an avatar and enter your name
- Enable Magic Mode: Click the "Magic Mode" button to activate gesture and voice control
- Allow Permissions: Grant webcam and microphone access when prompted
- Start Creating:
  - Point up ☝️ to draw
  - Open palm ✋ to erase
  - Victory sign ✌️ to select an area
  - Thumbs up 👍 to activate voice assistant
  - Say commands like "make the pen red" or "make the brush thicker"
- Gestures not working? Make sure your webcam has good lighting and your hand is clearly visible
- Voice not responding? Check microphone permissions and ensure ElevenLabs agent tools are configured correctly
- Real-time sync issues? Verify Supabase Realtime is enabled for your project
- Image generation failing? Check your Fal.ai API key and ensure you have credits
- Optimized Real-time Performance: Throttled cursor movements and debounced gesture detection for smooth 60fps rendering
- DPR-Aware Canvas: High-resolution rendering that adapts to device pixel ratios
- Streaming Strokes: In-progress drawing strokes are streamed to peers in real-time, not just on completion
- Client-Side MediaPipe: Webcam processing happens entirely in the browser for privacy and low latency
- Voice Tool Integration: ElevenLabs client tools with custom handlers for pen color, brush size, and AI generation
- Mirrored Fingertip Coordinates: Natural left/right movement with front-facing cameras
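The cursor-throttling mentioned above can be sketched as a small leading-plus-trailing throttle: forward at most one update per interval, but always flush the latest position so the cursor never freezes on a stale point. This is illustrative, not the app's actual implementation:

```typescript
// Throttle with a trailing call: invoke fn at most once per intervalMs,
// remembering the newest args so the final position is never dropped.
function throttle<T extends unknown[]>(
  fn: (...args: T) => void,
  intervalMs: number
): (...args: T) => void {
  let last = 0;
  let pending: T | null = null;
  let timer: ReturnType<typeof setTimeout> | null = null;

  return (...args: T) => {
    const now = Date.now();
    if (now - last >= intervalMs) {
      last = now;
      fn(...args); // leading call: fire immediately
    } else {
      pending = args; // keep only the newest args
      if (!timer) {
        timer = setTimeout(() => {
          timer = null;
          last = Date.now();
          if (pending) fn(...pending); // trailing call: flush latest
          pending = null;
        }, intervalMs - (now - last));
      }
    }
  };
}

// e.g. ~30 cursor updates per second:
// const sendCursor = throttle((x: number, y: number) => broadcastCursor(x, y), 33);
```

An interval around 33 ms keeps network traffic bounded while remaining well under what peers perceive as lag at 60 fps rendering.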
Try it live or watch the demo video to see gesture-controlled collaborative drawing in action!
Built with ❤️ using Supabase, ElevenLabs, Fal.ai, and Google Gemini