A low-latency, multimodal gesture-recognition system using MediaPipe, enabling hand and full-body pose gestures to drive real-time sound generation. This project explores music–movement–dance interaction and serves as a technical demonstration of embodied, camera-based interaction.
GestureCap Demo is a real-time interactive system that translates human movement into sound using a webcam—without markers, wearables, or sensors.
The system supports:
- ✋ Hand gestures for fine motor control
- 🧍 Full-body pose gestures for expressive, large-scale interaction
- ⚡ Low-latency audio feedback (<35 ms)
- 📊 Quantitative performance logging for research evaluation
This demo acts as a foundation for artistic, neuroscientific, and therapeutic applications involving embodied interaction.
Hand gestures (rule-based, using MediaPipe Hands):
- Open palm
- Fist
- Finger counting
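The rule-based counting could look roughly like this — a sketch assuming MediaPipe's 21-landmark hand convention (tip/PIP index pairs); the function names are illustrative, not the project's actual code:

```python
# Sketch of rule-based finger counting over MediaPipe's 21 hand landmarks.
# Landmarks are (x, y) pairs in normalized image coordinates (y grows downward).
# Tip/PIP index pairs follow MediaPipe Hands for index..pinky.
FINGER_TIP_PIP = [(8, 6), (12, 10), (16, 14), (20, 18)]

def count_extended_fingers(landmarks):
    """Count index..pinky fingers whose tip lies above its PIP joint.

    `landmarks` is a sequence of 21 (x, y) tuples. The thumb is ignored
    here because it folds sideways and would need an x-axis rule instead.
    """
    count = 0
    for tip, pip in FINGER_TIP_PIP:
        if landmarks[tip][1] < landmarks[pip][1]:  # tip above PIP => extended
            count += 1
    return count

def classify_hand(landmarks):
    """Map the finger count to the demo's two hand gestures."""
    n = count_extended_fingers(landmarks)
    if n == 0:
        return "fist"
    if n == 4:
        return "open_palm"
    return None  # ambiguous pose; let the debouncer ignore it
```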
Pose gestures (detected from MediaPipe Pose landmarks):
- left_arm_up
- right_arm_up
- both_arms_up
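A minimal sketch of the pose-gesture logic, assuming MediaPipe Pose's 33-landmark index map (11/12 = shoulders, 15/16 = wrists) and a simple wrist-above-shoulder rule; the exact thresholds in the demo may differ:

```python
# Pose-gesture detection from MediaPipe Pose landmarks.
# Indices follow MediaPipe Pose: 11/12 = left/right shoulder, 15/16 = left/right wrist.
# y is in normalized image coordinates, so "up" means a smaller y value.
L_SHOULDER, R_SHOULDER, L_WRIST, R_WRIST = 11, 12, 15, 16

def classify_pose(landmarks):
    """Return one of the demo's pose gestures, or None.

    `landmarks` is a sequence of 33 (x, y) tuples indexed per MediaPipe Pose.
    An arm counts as raised when its wrist is above its shoulder.
    """
    left_up = landmarks[L_WRIST][1] < landmarks[L_SHOULDER][1]
    right_up = landmarks[R_WRIST][1] < landmarks[R_SHOULDER][1]
    if left_up and right_up:
        return "both_arms_up"
    if left_up:
        return "left_arm_up"
    if right_up:
        return "right_arm_up"
    return None
```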
- Gesture-triggered sound generation
- Non-blocking, threaded audio playback
- Distinct pitch mapping per gesture
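The non-blocking playback and per-gesture pitch mapping can be sketched with only the standard library. The frequencies below are placeholders (the real mappings live in `config/`), and `backend_write` stands in for whatever audio output the demo actually uses:

```python
import math
import threading
from array import array

SAMPLE_RATE = 44_100

# Hypothetical per-gesture pitch map; the demo's real frequencies live in config/.
GESTURE_PITCH_HZ = {
    "fist": 110.0,           # low tone
    "open_palm": 440.0,      # mid tone
    "left_arm_up": 82.0,     # bass tone
    "right_arm_up": 880.0,   # high tone
    "both_arms_up": 1320.0,  # accent tone
}

def synth_tone(freq_hz, duration_s=0.15, amplitude=0.4):
    """Render a sine tone as 16-bit PCM samples."""
    n = int(SAMPLE_RATE * duration_s)
    return array("h", (
        int(32767 * amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n)
    ))

def play_async(gesture, backend_write):
    """Synthesize a tone and hand samples to an audio backend on a daemon thread.

    `backend_write` stands in for a real output call (e.g. a sounddevice or
    PyAudio stream write); running it off the vision thread is what keeps
    playback non-blocking.
    """
    samples = synth_tone(GESTURE_PITCH_HZ[gesture])
    t = threading.Thread(target=backend_write, args=(samples,), daemon=True)
    t.start()
    return t
```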
- FPS: ~28–35
- End-to-end latency: ~25–35 ms
- Optimized MediaPipe configuration (model_complexity=0)
- Gesture debouncing with cooldown
- Multimodal gesture priority handling
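The debouncing and priority handling above could be sketched as follows; the class name, the 0.5 s default cooldown, and the pose-over-hand ordering are assumptions for illustration, not the project's actual code:

```python
import time

class GestureDebouncer:
    """Suppress repeat triggers of the same gesture within a cooldown window.

    A pose gesture seen in the same frame as a hand gesture wins, matching
    the demo's multimodal priority handling (assumed ordering).
    """
    def __init__(self, cooldown_s=0.5, clock=time.monotonic):
        self.cooldown_s = cooldown_s
        self.clock = clock          # injectable for testing
        self.last_fired = {}        # gesture name -> last trigger time

    def select(self, hand_gesture, pose_gesture):
        """Pick at most one gesture for this frame, honoring the cooldown."""
        gesture = pose_gesture or hand_gesture  # pose has priority
        if gesture is None:
            return None
        now = self.clock()
        last = self.last_fired.get(gesture)
        if last is not None and now - last < self.cooldown_s:
            return None  # still cooling down
        self.last_fired[gesture] = now
        return gesture
```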
- Per-gesture latency measurement
- Automatic CSV logging (latency_log.csv)
- Console-based real-time feedback
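Per-gesture latency logging to CSV could be sketched like this; the column names and row schema are illustrative, not necessarily the exact format of the generated `latency_log.csv`:

```python
import csv
import time

class LatencyLogger:
    """Append per-gesture latency rows to a CSV file (hypothetical schema).

    Each row records when a gesture fired and the frame-to-trigger delay in
    milliseconds, mirroring the kind of data the demo writes to latency_log.csv.
    """
    FIELDS = ("timestamp", "gesture", "latency_ms")

    def __init__(self, path="latency_log.csv"):
        self.path = path
        with open(self.path, "w", newline="") as f:
            csv.writer(f).writerow(self.FIELDS)

    def log(self, gesture, frame_time_s, trigger_time_s=None):
        """Compute and persist latency; also echo it to the console."""
        if trigger_time_s is None:
            trigger_time_s = time.monotonic()
        latency_ms = (trigger_time_s - frame_time_s) * 1000.0
        with open(self.path, "a", newline="") as f:
            csv.writer(f).writerow([time.time(), gesture, f"{latency_ms:.1f}"])
        print(f"{gesture}: {latency_ms:.1f} ms")  # real-time console feedback
        return latency_ms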
```
Webcam
  ↓
MediaPipe (Hands + Pose)
  ↓
Gesture Detection Logic
  ↓
Gesture State / Debounce
  ↓
Audio Engine (Non-blocking)
  ↓
Latency Measurement + Logging
```
The architecture is modular, making it easy to extend with:
- MIDI / OSC output
- Machine-learning gesture classifiers
- Continuous control (pitch / volume modulation)
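As one extension path, OSC output needs no heavy dependencies: a minimal encoder can be written with the standard library alone. This is a sketch following the OSC 1.0 wire format; the `/gesturecap/pitch` address is purely illustrative:

```python
import socket
import struct

def _osc_pad(b):
    """Null-terminate and pad bytes to a 4-byte boundary, per the OSC 1.0 spec."""
    return b + b"\x00" * (4 - len(b) % 4)

def osc_message(address, value):
    """Encode a single-float OSC message, e.g. /gesturecap/pitch 440.0."""
    return (_osc_pad(address.encode("ascii"))
            + _osc_pad(b",f")                 # type tag: one float32 argument
            + struct.pack(">f", value))       # big-endian float payload

def send_gesture_pitch(sock, host, port, pitch_hz):
    """Fire the current gesture's pitch at an OSC-capable synth over UDP."""
    sock.sendto(osc_message("/gesturecap/pitch", pitch_hz), (host, port))
```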
```
gesturecap-demo/
│
├── src/
│   ├── capture/        # Camera & FPS handling
│   ├── vision/         # Hand & pose tracking
│   ├── gestures/       # Gesture logic & state
│   ├── audio/          # Sound & TTS engines
│   ├── evaluation/     # Latency tracking & logging
│   └── main.py         # Application entry point
│
├── config/             # Gesture & audio mappings
├── latency_log.csv     # Generated performance logs
├── requirements.txt
└── README.md
```
```bash
git clone https://github.com/<your-username>/gesturecap-demo.git
cd gesturecap-demo

python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux / macOS
source .venv/bin/activate

pip install -r requirements.txt
python src/main.py
```
- Show hand or raise arms in front of the camera
- Press ESC to exit
| Gesture | Interaction |
|---|---|
| Open Palm | Mid-frequency tone |
| Fist | Low-frequency tone |
| Left Arm Up | Bass tone |
| Right Arm Up | High-frequency tone |
| Both Arms Up | Strong accent tone |
- Latency is measured per gesture event
- Results are logged to latency_log.csv
- Typical results on CPU:
  - Latency: 25–35 ms
  - FPS: 28–35

This makes the system suitable for real-time musical interaction.
- 🎵 Gesture-driven musical instruments
- 💃 Dance-controlled sound generation
- 🧠 Research on embodied cognition & agency
- 🧑‍⚕️ Therapeutic & rehabilitative interaction systems
MIT License
Electronics & Telecommunication Engineering · Open-source contributor & GSoC aspirant