Video-RAG

My implementation of Video-RAG

Introduction:

Video-RAG (Retrieval-Augmented Generation for Video) applies retrieval-augmented generation (RAG) to video content. It combines RAG with video processing: the segments of a video relevant to a query are retrieved and used as context, so that a model can generate accurate, context-aware responses grounded in the video.

Demo

Link

Implementation:

Video is converted to frames.
Frames are passed through YOLOv11 for object detection.
- Frames are saved in the format <video_id>_frame_<timestamp>.jpg
- The transcript is downloaded in WebVTT format as <video_id>_captions.txt
Frames (with bounding boxes) are then passed to the VLM to produce more accurate visual descriptions.
- Frames are saved as result_<video_id>_<timestamp>.jpg
Visual descriptions are embedded with nomic-embed-text-v2-moe (local) and snowflake-arctic-embed2 (hosted). Both are multilingual embedding models, so search works across languages: queries in either Hindi or English return results.
- The payload is made of the description embedding plus metadata: yt_url, timestamp (the YouTube URL with a timestamp, to play the video directly from that segment), and desc (the description).
- The payload is sent to Pinecone.
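
The naming and payload conventions above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the helper names (frame_filename, youtube_url_at, build_payload), the use of whole seconds for the timestamp, the record id, and the exact metadata layout are all assumptions. The outer id / values / metadata shape follows the record format that Pinecone accepts for upserts; the embedding here is a toy placeholder rather than real model output.

```python
from typing import Any, Dict, Sequence


def frame_filename(video_id: str, timestamp_s: int) -> str:
    """Frame name in the <video_id>_frame_<timestamp>.jpg pattern.

    Assumption: the timestamp is expressed in whole seconds.
    """
    return f"{video_id}_frame_{timestamp_s}.jpg"


def youtube_url_at(video_id: str, timestamp_s: int) -> str:
    """YouTube watch URL that starts playback at the given second."""
    return f"https://www.youtube.com/watch?v={video_id}&t={timestamp_s}s"


def build_payload(
    video_id: str,
    timestamp_s: int,
    description: str,
    embedding: Sequence[float],
) -> Dict[str, Any]:
    """Assemble one vector record: embedding plus metadata.

    Assumption: yt_url holds the plain video URL and timestamp holds
    the URL that jumps to the segment; the real layout may differ.
    """
    return {
        "id": f"{video_id}_{timestamp_s}",  # hypothetical id scheme
        "values": [float(x) for x in embedding],
        "metadata": {
            "yt_url": f"https://www.youtube.com/watch?v={video_id}",
            "timestamp": youtube_url_at(video_id, timestamp_s),
            "desc": description,
        },
    }


# Example with a placeholder video id and a toy 3-dimensional embedding.
payload = build_payload(
    "ftDsSB3F5kg", 42, "A person writing on a whiteboard.", [0.1, 0.2, 0.3]
)
```

A list of such records could then be passed to the vector store's upsert call. Keeping the time-stamped URL in the metadata means a search hit can link straight to the matching moment in the video.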

High Level Architecture Diagram

How to Setup

Step 1: Set up a Python environment (3.11+ preferred)

uv venv /app/video_rag
source /app/video_rag/bin/activate

Step 2: Install the dependencies

pip install -r requirements.txt

Step 3: Run the script

python main.py

Contact Me

In case of any issues, please reach out to me at [email protected] or raise a GitHub issue.

