YouTube Transcript Analyzer — Automated highlights from your favorite channels.
YT Digest is a Python script that monitors a YouTube channel, fetches transcripts from long-form videos, and sends them to Google Gemini for top-5 highlights analysis. Everything is stored in a local SQLite database for easy retrieval.
- RSS Feed Monitoring — Polls a YouTube channel's RSS feed for recent videos
- Smart Video Filtering — Skips Shorts, clips under 30 min, and members-only videos; no upper duration limit
- Self-Installing — On first run, copies itself into %APPDATA%\YTDigest so everything lives in one place
- Transcript Retrieval — Fetches auto-generated or manual transcripts via YouTube's API
- AI-Powered Highlights — Sends transcripts to Google Gemini 2.5 Flash for top-5 highlights
- Transcript Retry — Videos without transcripts are retried over 48 hours (9 attempts at 6-hour intervals) before giving up
- Manual Summary — Import a Whisper (or other) transcript via --summarize to generate highlights for videos with no auto-transcript
- SQLite Storage — Stores titles, dates, transcripts, and summaries permanently
- Rate Limiting — 10-minute minimum between transcript fetches, tracked persistently on disk (Gemini calls are not rate-limited)
- 3-Day Lookback — On first run (or after outage), catches up on recent videos chronologically
- Auto-Dependency Check — Verifies and auto-installs pip packages on startup
- External Config — All settings in config.ini (no hardcoded API keys)
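
The 3-day lookback can be pictured as a filter-and-sort over the feed entries. The following is a minimal sketch under stated assumptions, not the script's actual code: `select_recent` and the `(video_id, published)` tuple shape are hypothetical names chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

LOOKBACK = timedelta(days=3)  # lookback window named in the feature list


def select_recent(entries, now):
    """Return entries published within the lookback window, oldest first.

    `entries` is a list of (video_id, published) tuples, where `published`
    is a timezone-aware datetime. This shape is assumed for illustration.
    """
    cutoff = now - LOOKBACK
    recent = [entry for entry in entries if entry[1] >= cutoff]
    # Oldest first, so a catch-up run processes videos chronologically.
    return sorted(recent, key=lambda entry: entry[1])
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the window easy to test.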
OpenClaw (Cron) → yt_digest.py
├── YouTube RSS Feed
├── Video Filter (Shorts / Duration)
├── YouTube Transcript API
├── Google Gemini 2.5 Flash
└── SQLite Database

    git clone https://github.com/Digitalgods2/YTdigest.git
    cd YTdigest
    python yt_digest.py

That's it. The script handles everything automatically:
- Installs dependencies — auto-detects missing pip packages and installs them
- Copies itself to AppData — installs yt_digest.py and requirements_ytdigest.txt into %APPDATA%\YTDigest\
- Prompts for setup — asks for your YouTube Channel ID and Gemini API key, then saves them to config.ini
- Starts processing — fetches videos and transcripts, then generates highlights
After the first run, everything lives in %APPDATA%\YTDigest\ and the clone folder can be deleted.
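
For readers curious how the setup values might be stored, here is a minimal sketch using the standard library's `configparser`. The section and key names (`youtube`, `channel_id`, `gemini`, `api_key`) and both function names are assumptions for illustration; the real config.ini layout may differ.

```python
import configparser


def save_config(path, channel_id, api_key):
    """Write the two setup values to an INI file (hypothetical layout)."""
    # interpolation=None keeps a literal '%' in an API key from being
    # treated as an interpolation marker.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg["youtube"] = {"channel_id": channel_id}
    cfg["gemini"] = {"api_key": api_key}
    with open(path, "w", encoding="utf-8") as fh:
        cfg.write(fh)


def load_config(path):
    """Return (channel_id, api_key) or raise if the file is missing."""
    cfg = configparser.ConfigParser(interpolation=None)
    if not cfg.read(path, encoding="utf-8"):
        raise FileNotFoundError(f"no config file at {path}")
    return cfg["youtube"]["channel_id"], cfg["gemini"]["api_key"]
```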
Finding a Channel ID:

    yt-dlp --print channel_id --playlist-items 1 "https://www.youtube.com/@ChannelHandle"

Getting a Gemini API key: Sign up for free at Google AI Studio.
Set it up as a scheduled task in OpenClaw (Clawdbot) with the cron expression 0 */6 * * * (every 6 hours, on the hour).
Run the command: python "%APPDATA%\YTDigest\yt_digest.py"
%APPDATA%\YTDigest\
├── yt_digest.py # Main script
├── requirements_ytdigest.txt # Dependencies
├── config.ini # API keys & settings (auto-created)
├── ytdigest.db # SQLite database
├── ytdigest.log # Execution log
└── last_api_call.txt # Rate limiter state
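
The rate limiter state file suggests a simple pattern: persist the time of the last transcript fetch and compare against it on the next run. The sketch below assumes the file holds a Unix timestamp as text, which is a guess about the format; `seconds_until_allowed` and `record_call` are hypothetical names.

```python
import time
from pathlib import Path

MIN_INTERVAL = 10 * 60  # seconds: the 10-minute minimum between transcript fetches


def seconds_until_allowed(state_file, now=None):
    """Return how long to wait before the next fetch (0.0 if allowed now)."""
    now = time.time() if now is None else now
    try:
        last = float(Path(state_file).read_text().strip())
    except (FileNotFoundError, ValueError):
        return 0.0  # no usable state: treat as never called
    return max(0.0, MIN_INTERVAL - (now - last))


def record_call(state_file, now=None):
    """Persist the fetch time so the limit survives restarts."""
    now = time.time() if now is None else now
    Path(state_file).write_text(repr(now))
```

Keeping the state on disk rather than in memory matters here because each scheduled run is a fresh process.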
| Column | Type | Description |
|---|---|---|
| video_id | TEXT (PK) | YouTube video ID |
| title | TEXT | Video title |
| published | DATETIME | Original publish date (ISO 8601) |
| processed_at | DATETIME | When YT Digest processed it |
| status | TEXT | done, pending_transcript, or no_transcript |
| summary | TEXT | Gemini's top 5 highlights |
| transcript | TEXT | Full transcript with timestamps |
| retry_count | INTEGER | Number of transcript fetch attempts |
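
The columns above map directly onto a single SQLite table. The following sketch shows one way to create and query it; the table name `videos`, the `DEFAULT 0` on `retry_count`, and the helper names are assumptions, since the schema's actual DDL is not shown here.

```python
import sqlite3

# Hypothetical DDL reconstructed from the column table.
SCHEMA = """
CREATE TABLE IF NOT EXISTS videos (
    video_id     TEXT PRIMARY KEY,
    title        TEXT,
    published    DATETIME,
    processed_at DATETIME,
    status       TEXT,
    summary      TEXT,
    transcript   TEXT,
    retry_count  INTEGER DEFAULT 0
)
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def latest_summaries(conn, limit=5):
    """Return (title, published, summary) for finished videos, newest first."""
    rows = conn.execute(
        "SELECT title, published, summary FROM videos "
        "WHERE status = 'done' ORDER BY published DESC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()
```

Ordering by `published` works as a plain string sort only if every date is stored in the same ISO 8601 format.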
| Status | Meaning |
|---|---|
| done | Transcript fetched and Gemini highlights generated |
| pending_transcript | No transcript yet — will retry automatically every 6 hours |
| no_transcript | Gave up after 9 attempts (~48 hours) — use --summarize for manual import |
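
The three statuses form a small state machine driven by `retry_count`. This sketch is one plausible reading of the table, not the script's code: `next_state` is a hypothetical helper, and it assumes the counter is incremented on every attempt, including a successful one, and that highlight generation succeeds once a transcript is found.

```python
MAX_ATTEMPTS = 9  # attempts before giving up, per the status table


def next_state(retry_count, transcript_found):
    """Return (status, retry_count) after one transcript fetch attempt."""
    attempts = retry_count + 1
    if transcript_found:
        # Assumes summarization succeeds once the transcript is available.
        return "done", attempts
    if attempts >= MAX_ATTEMPTS:
        return "no_transcript", attempts
    return "pending_transcript", attempts
```

With attempts spaced 6 hours apart, the ninth attempt falls 48 hours after the first, which matches the figure in the table.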
If a video ends up as no_transcript, you can provide your own transcript (e.g. from Whisper) and generate highlights:

    python yt_digest.py --summarize VIDEO_ID path/to/transcript.txt

This imports the transcript, sends it to Gemini for analysis, and marks the video as done.
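
A flag that takes two values, as `--summarize` does, can be expressed with `argparse` and `nargs=2`. This is a sketch of that general technique only; whether the script uses `argparse` at all is not stated, and `build_parser` and `parse_summarize` are hypothetical names.

```python
import argparse


def build_parser():
    """Build a parser with a two-value --summarize option."""
    parser = argparse.ArgumentParser(prog="yt_digest.py")
    parser.add_argument(
        "--summarize",
        nargs=2,
        metavar=("VIDEO_ID", "TRANSCRIPT_FILE"),
        help="import a transcript file and generate highlights for VIDEO_ID",
    )
    return parser


def parse_summarize(argv):
    """Return (video_id, transcript_path), or None for a normal run."""
    args = build_parser().parse_args(argv)
    if args.summarize is None:
        return None
    video_id, transcript_path = args.summarize
    return video_id, transcript_path
```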
Videos are classified through a multi-step pipeline:
- Thumbnail check — hq2.jpg pattern indicates a Short
- URL check — /shorts/ in the link
- Duration check — Fetched from video page meta tag (PT1H30M45S format)
| Result | Criteria |
|---|---|
| Skip | Shorts (any detection method) |
| Skip | Under 30 minutes |
| Skip | Members-only (title contains [member access] or [members only]) |
| Process | 30 minutes and up (no upper limit) |
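
The rules in the table can be sketched as a single classification function. This is an illustration under assumptions rather than the script's implementation: the function names, the case-insensitive title match, and the position of the members-only check in the sequence are all guesses, and the duration parser handles only the hours/minutes/seconds form shown above.

```python
import re

MIN_SECONDS = 30 * 60  # 30-minute minimum from the table
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")
_MEMBER_TAGS = ("[member access]", "[members only]")


def parse_duration(iso):
    """Convert an ISO 8601 duration such as 'PT1H30M45S' to seconds."""
    match = _DURATION.match(iso)
    if not match:
        raise ValueError(f"unsupported duration: {iso!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def classify(title, link, thumbnail_url, duration_iso):
    """Return 'process' or a string naming the reason for skipping."""
    if "hq2.jpg" in thumbnail_url or "/shorts/" in link:
        return "skip: short"
    if any(tag in title.lower() for tag in _MEMBER_TAGS):
        return "skip: members-only"
    if parse_duration(duration_iso) < MIN_SECONDS:
        return "skip: under 30 minutes"
    return "process"
```

Running the cheap string checks first means the duration is only needed for videos that have not already been ruled out.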
| Package | Min Version | Purpose |
|---|---|---|
| feedparser | 6.0.0 | RSS feed parsing |
| requests | 2.25.0 | HTTP requests for duration detection |
| youtube-transcript-api | 1.2.0 | Transcript fetching |
| google-genai | 1.0.0 | Gemini API SDK |
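
A startup check against these minimum versions could look like the sketch below, which uses the standard library's `importlib.metadata`. The helper names are hypothetical, and the version comparison is deliberately simplified (leading numeric components only), so it is not a full implementation of Python's version-ordering rules.

```python
import re
from importlib import metadata

# Minimum versions copied from the table above.
MINIMUMS = {
    "feedparser": "6.0.0",
    "requests": "2.25.0",
    "youtube-transcript-api": "1.2.0",
    "google-genai": "1.0.0",
}


def version_tuple(version):
    """Turn '2.25.0' into (2, 25, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def missing_or_outdated(minimums=MINIMUMS, get_version=metadata.version):
    """Return names of packages that are absent or below their minimum."""
    problems = []
    for name, minimum in minimums.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(name)
            continue
        if version_tuple(installed) < version_tuple(minimum):
            problems.append(name)
    return problems
```

The returned names could then be handed to pip for installation; how the script itself performs the install step is not shown here.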
See YT_Digest_Installation_Guide_v3.pdf for the full installation and reference guide, including architecture diagrams, OpenClaw scheduled task setup, and troubleshooting.
MIT