Skip to content

Fix/per video retry backoff#62

Merged
kaya70875 merged 11 commits intomainfrom
fix/per-video-retry-backoff
Apr 19, 2026
Merged

Fix/per video retry backoff#62
kaya70875 merged 11 commits intomainfrom
fix/per-video-retry-backoff

Conversation

@kaya70875
Copy link
Copy Markdown
Owner

No description provided.

@kaya70875 kaya70875 self-assigned this Apr 18, 2026
@kaya70875 kaya70875 added bug Something isn't working enhancement New feature or request labels Apr 18, 2026
@qodo-code-review
Copy link
Copy Markdown

Review Summary by Qodo

Implement per-video retry backoff with recovery pass

✨ Enhancement 🐞 Bug fix

Grey Divider

Walkthroughs

Description
• Implement per-video retry logic with configurable recovery delay
• Add tenacity-based exponential backoff for transient network errors
• Distinguish permanent vs transient failures with is_permanent_exception flag
• Create TimeoutSession to enforce request timeouts and improve error handling
• Add configurable recovery options (with_recovery, recovery_delay) to FetchOptions
Diagram
flowchart LR
  A["Fetch Transcripts"] --> B["_fetch_with_recovery_pass"]
  B --> C["Initial Fetch"]
  C --> D{"Transient Errors?"}
  D -->|Yes| E["Wait recovery_delay"]
  E --> F["Retry with Exponential Backoff"]
  F --> G["Collect Results"]
  D -->|No| G
  G --> H["Return Success & Failed"]
Loading

Grey Divider

File Changes

1. ytfetcher/_core.py ✨ Enhancement +13/-10

Update retry logic to use recovery pass

• Rename _fetch_with_retry to _fetch_with_recovery_pass for clarity
• Replace TRANSIENT_FAILURE_REASONS with RETRYABLE_ERRORS constant
• Use is_permanent_exception flag instead of checking reason strings
• Add configurable recovery delay from FetchOptions
• Add early return when recovery is disabled via with_recovery option

ytfetcher/_core.py


2. ytfetcher/_transcript_fetcher.py ✨ Enhancement +55/-6

Add timeout session and tenacity retry decorator

• Add TimeoutSession class to enforce 10-second request timeout
• Implement @retry decorator with tenacity for exponential backoff on RequestException
• Add is_permanent_exception field to FailedTranscript for permanent failures
• Change tasks from list to dict mapping futures to video_ids for better error tracking
• Add thread-safe network warning mechanism to avoid duplicate warnings
• Improve error handling to capture video_id in exception handlers
• Handle RequestException separately from other exceptions

ytfetcher/_transcript_fetcher.py


3. ytfetcher/config/fetch_config.py ⚙️ Configuration changes +6/-0

Add recovery configuration options

• Add with_recovery boolean flag to enable/disable recovery retry pass
• Add recovery_delay integer field for configurable delay between retries

ytfetcher/config/fetch_config.py


View more (3)
4. ytfetcher/models/channel.py ✨ Enhancement +1/-0

Add permanent exception flag to model

• Add is_permanent_exception boolean field to FailedTranscript model
• Default value is False for backward compatibility

ytfetcher/models/channel.py


5. ytfetcher/utils/constants.py ✨ Enhancement +17/-5

Refactor constants for exception categorization

• Replace TRANSIENT_FAILURE_REASONS with PERMANENTLY_FAILED_EXCEPTIONS tuple
• Import specific exception types from youtube_transcript_api._errors
• Define permanent exceptions: TranscriptsDisabled, VideoUnavailable, InvalidVideoId,
 NoTranscriptFound
• Rename and refactor RETRYABLE_ERRORS to include TransientNetworkError

ytfetcher/utils/constants.py


6. pyproject.toml Dependencies +1/-0

Add tenacity dependency

• Add tenacity (>=9.1.4,<10.0.0) dependency for retry logic

pyproject.toml


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review Bot commented Apr 18, 2026

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)

Grey Divider


Action required

1. Permanent failure not cached🐞 Bug ☼ Reliability
Description
_get_or_fetch_transcripts now caches only failures with is_permanent_exception=True, but
TranscriptFetcher._fetch_single returns FailedTranscript(reason="NoTranscriptFound") without setting
is_permanent_exception when the chosen fetch method returns no transcript. This causes permanently
transcript-less videos to be refetched on every run instead of being cached as failed.
Code

ytfetcher/_core.py[R333-334]

          # Any failures that are transient should not be marked as permanently failed in the cache, allowing for future retries.
-            permanent_failures = [f for f in new_failures if f.reason not in TRANSIENT_FAILURE_REASONS]
+            permanent_failures = [f for f in new_failures if f.is_permanent_exception]
Evidence
The cache write path now filters failures strictly by is_permanent_exception, but the "no transcript
found" return path in _fetch_single does not set that flag (so it remains the model default False).
The test suite also demonstrates a concrete scenario where _fetch_first_available_transcript can
return None, making the "if not transcript" branch reachable.

ytfetcher/_core.py[329-338]
ytfetcher/_transcript_fetcher.py[185-196]
ytfetcher/models/channel.py[39-44]
tests/test_transcript_fetcher.py[244-254]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`ytfetcher/_core.py` now persists only failures where `FailedTranscript.is_permanent_exception == True`. However, `TranscriptFetcher._fetch_single()` returns `FailedTranscript(reason="NoTranscriptFound")` without setting `is_permanent_exception` when `transcript` is falsy (e.g., `None`). This leaves the flag at its default `False`, so these permanent failures are never written to the cache and will be re-fetched every run.
### Issue Context
- This behavior is reachable when `_fetch_first_available_transcript()` returns `None` (see existing unit test).
- `NoTranscriptFound` is intended to be treated as permanent (it’s included in `PERMANENTLY_FAILED_EXCEPTIONS`), but that only applies to the exception path, not the `if not transcript` path.
### Fix Focus Areas
- ytfetcher/_transcript_fetcher.py[185-196]
### Suggested fix
In the `if not transcript:` branch, set `is_permanent_exception=True` (or compute it consistently, e.g., treat `reason == "NoTranscriptFound"` as permanent). This will allow `_core.py` to cache these failures via its `permanent_failures` filter.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

@what-the-diff
Copy link
Copy Markdown

what-the-diff Bot commented Apr 18, 2026

PR Summary

  • Introduction of a new dependency: We have added "tenacity" into our project under the file pyproject.toml to reinforce our system with retry logic, making our system more resilient in case of technical glitches.

  • Renaming to improve clarity: We've renamed the method _fetch_with_retry to _fetch_with_recovery_pass to better represent its functionality. This change more accurately underlines how the system can gracefully handle minor hiccups during operation.

  • Improvements in data retrieval: Our fetch logic now works smarter by identifying errors that may be corrected with retries and using an "exponential backoff strategy". This strategy means that our system will progressively increase the pauses between retries to balance between persistently trying to solve an issue and not overloading the system.

  • Customizable reactions to failures: We've updated FetchOptions and provided new settings like with_recovery and recovery_delay, granting you the power to set how the system reacts to minor glitches.

  • Enhanced error distinction: Our upgraded FailedTranscript now includes a feature is_permanent_exception to tell apart serious, permanent errors from temporary, recoverable ones, based on the types of encountered exceptions.

  • Inclusion of a timeout mechanism: We introduced a TimeoutSession class, setting a default timeout for network requests. This ensures our systems are not kept waiting indefinitely when any delays arise in network communication.

  • Improved communication through logging: Our system now gives users detailed logs, making transparency better by informing about network issues and any retries that are happening.

  • Better division between successful and failed results: We refactored our method _collect_results to be more discerning. Now the differentiation between successful and failed data fetches is clearer and the system handles retries for failed attempts more efficiently.

Comment thread ytfetcher/_core.py
@kaya70875 kaya70875 merged commit 47c4e73 into main Apr 19, 2026
2 checks passed
@kaya70875 kaya70875 deleted the fix/per-video-retry-backoff branch April 19, 2026 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant