-
Notifications
You must be signed in to change notification settings - Fork 2k
Type musicbrainz #6329
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Type musicbrainz #6329
Conversation
This way we only ever handle full Recording objects.
|
Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces comprehensive typing for MusicBrainz API payloads through TypedDict models, refactors parsing logic into smaller helpers, normalizes API responses at the boundary (converting dashes to underscores and grouping relations), and replaces hand-written test dictionaries with factory-based data generation.
Changes:
- Introduced typed domain models (
Release,Recording,Medium, etc.) inbeetsplug/_utils/musicbrainz.py - Normalized MusicBrainz API responses by converting dashed keys to underscores and grouping relations
- Refactored parsing into focused helper methods (
_parse_artist_credits,_parse_release_group, etc.)
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
test/rsrc/mbpseudo/pseudo_release.json |
Updated test fixture to use underscored keys matching normalized payload format |
test/rsrc/mbpseudo/official_release.json |
Updated test fixture to use underscored keys matching normalized payload format |
test/plugins/utils/test_musicbrainz.py |
Renamed test from test_group_relations to test_normalize_data reflecting the renamed internal method |
test/plugins/test_musicbrainz.py |
Replaced manual test data construction with factory-based approach and updated assertions to match factory defaults |
test/plugins/test_mbpseudo.py |
Updated references from dashed keys to underscored keys |
test/plugins/factories/musicbrainz.py |
Added factory classes for generating typed MusicBrainz test data |
setup.cfg |
Enabled strict mypy checking for musicbrainz-related modules |
pyproject.toml |
Added pytest-factoryboy test dependency |
beetsplug/musicbrainz.py |
Refactored parsing logic into typed helper methods and updated to consume normalized payloads |
beetsplug/mbpseudo.py |
Updated to use typed Release structures and underscored keys |
beetsplug/_utils/musicbrainz.py |
Added comprehensive TypedDict models and renamed/enhanced _group_relations to _normalize_data |
Comments suppressed due to low confidence (2)
beetsplug/_utils/musicbrainz.py:1
- The length checks are redundant. If
map(int, date_str.split('-'))produces fewer than three parts, accessingparts[1]orparts[2]will raise anIndexErrorrather than returningNone. The current implementation doesn't actually prevent the error. Consider usingparts[i] if i < len(parts) else Noneor a more concise approach like padding the list.
"""Helpers for communicating with the MusicBrainz webservice.
beetsplug/musicbrainz.py:1
- Corrected spelling of 'attribute_credits' to 'attribute_values' to match the actual field name in ArtistRelation TypedDict.
# This file is part of beets.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "Recording name": "5d564c8f-97de-3572-94bb-7f40ad661499", | ||
| "Release group name": "156e24ca-8746-3cfc-99ae-0a867c765c67", | ||
| "Release name": "df187855-059b-3514-9d5e-d240de0b4228", | ||
| "Search hint": "abc2db8a-7386-354d-82f4-252c0213cbe4", |
Copilot
AI
Feb 1, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected 'Search hint' type_id which appears to be a typo (should match the type_id pattern used elsewhere, likely '829662f2-a781-3ec8-8b46-fbcea6196f81' based on the JSON fixtures).
| "Search hint": "abc2db8a-7386-354d-82f4-252c0213cbe4", | |
| "Search hint": "829662f2-a781-3ec8-8b46-fbcea6196f81", |
| if not date_str: | ||
| return None, None, None | ||
|
|
||
| parts = list(map(int, date_str.split("-"))) |
Copilot
AI
Feb 1, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function _get_date attempts to convert all date parts to integers without error handling. If the date string contains non-numeric values (e.g., malformed dates from the API), this will raise a ValueError. Consider wrapping the conversion in a try-except block or validating the input first.
| parts = list(map(int, date_str.split("-"))) | |
| raw_parts = date_str.split("-") | |
| parts: list[int | None] = [] | |
| for raw in raw_parts[:3]: | |
| try: | |
| parts.append(int(raw)) | |
| except ValueError: | |
| parts.append(None) |
❌ 8 Tests Failed:
View the top 3 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
|
I also did a similar sort of refactoring recently, and this is what I ended up with: https://github.com/prTopi/beets-vocadb/tree/experimental/beetsplug/vocadb/vocadb_api_client Feel free to take inpiration! Here are some suggestions:
|
PR Summary: Typed MusicBrainz payloads + parsing refactor + factory-based tests
Why this change exists
The MusicBrainz integration was historically treated as loosely-typed JSON, which made it easy to:
'release-group','artist-credit') into downstream codeThis PR makes MusicBrainz payloads first-class typed models (via
TypedDict), normalizes API payload shape consistently, and refactors parsing into smaller, reusable units. Tests are updated to usefactory_boyfactories to generate payloads that match the typed contract.High-level architecture changes
1) Introduce a typed domain model for MusicBrainz payloads
beetsplug/_utils/musicbrainz.pynow defines a comprehensive set ofTypedDictmodels:Release,Recording,Medium,Track,ArtistCredit, etc.Impact
MusicBrainzAPI.get_release()/get_recording()return typedRelease/Recordinginstead of untypedJSONDict.beetsplug/musicbrainz.pyandbeetsplug/mbpseudo.py) are updated to accept and operate on these typed payloads.This turns the MB payload into a stable internal contract and lets mypy enforce correctness in parsing code.
2) Normalize MusicBrainz API responses at the boundary
The API wrapper normalizes two key aspects:
text-representation→text_representation)'relations'into typed buckets likeartist_relations,url_relations,work_relationsConceptually:
Impact
mbpseudointerception logic is simplified because it no longer needs to handle dashed keys.3) Parsing refactor: smaller helpers, clearer responsibilities
beetsplug/musicbrainz.pymoves from one largealbum_info()routine with repeated inline logic to a more structured approach:MusicBrainzPlugin._parse_artist_credits(...)centralizes "artist-credit flattening" into a single helper that outputs both:artist,artist_credit,artist_sort)artists,artists_credit,artists_sort, plusartist_id/artists_ids)_parse_release_group(...),_parse_label_infos(...),_parse_external_ids(...),_parse_genre(...)each encapsulate a single extraction responsibility and return dict fragments that are merged intoAlbumInfo.Medium/track parsing is split out into
get_tracks_from_medium(...), and config access is cached (ignored_media,ignore_data_tracks,ignore_video_tracks).Before (simplified)
album_info()handled:info.mediaAfter
album_info()orchestrates:track_infos.extend(get_tracks_from_medium(medium))TrackInfo.indexin a single passmediafrom track infos (single value vs'Media')This is effectively a "pipeline":
Release→Medium→TrackInfo+ album metadata.4) Tests re-architected around factories (stable data-shape contract)
A new test factory module is introduced:
test/plugins/factories/musicbrainz.pyusingfactory_boy.Key changes:
RecordingFactory,TrackFactory,MediumFactory,ReleaseFactory, etc.ReleaseFactoryis introduced and then made the default basis for test releases; it provides realistic defaults (release events, label info, text representation, ids, etc.).@factory.post_generationhook that setstrack['position']sequentially.Impact
Release/Mediumfields, factories provide them.Reviewer guide: what to focus on
Public/semantic behavior
AlbumInfo.mediais now computed from track infos, not directly fromrelease['media'](still yields the same'Media'vs single-format behavior, but viaTrackInfo.media).Boundary normalization correctness
MusicBrainzAPI._normalize_data()is the foundation: if it regresses, everything downstream breaks in subtle ways.Medium/track parsing extraction
get_tracks_from_medium()now owns pregap/data track inclusion and filtering. Ensure the config semantics match previous behavior.Test changes are mostly structural
'Album'vs'ALBUM TITLE', deterministic UUID-like ids).assert albumadditions fix typing concerns whenalbum_for_id()could returnNone.High-level impact
beetsplug.musicbrainz,beetsplug.mbpseudo, andbeetsplug._utils.Medium→TrackInfopipeline.pytest-factoryboy(and transitivefactory_boy,faker,inflection) added for test data generation.