Summary
Expand the register beyond players/teams/coaches to include competitions and matches as first-class entity types.
Motivation
Wikidata has cross-provider IDs for competitions and matches that would be valuable for data joining:
| Property | Provider | Count |
| --- | --- | --- |
| P7455 | Transfermarkt match ID | 26,624 |
| P13664 | FBref competition ID | 148 |
| P12758 | Transfermarkt competition ID | 13 |
| P8735 | Opta competition ID | 53 |
| P7460 | Flashscore match ID | 22 |
| P13665 | FBref match ID | 4 |
The Transfermarkt match ID coverage (26K) is particularly useful -- it would let users resolve match IDs across providers the same way they resolve player IDs today.
What needs to change
Schema
- Add `competition` and `match` to the `type` CHECK constraint on `entities`
- Competition metadata: country, tier/level, format (league/cup)
- Match metadata: date, home_team_qid, away_team_qid, competition_qid
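
A minimal sketch of the widened constraint, assuming the register is stored in SQLite and that `entities` has `qid`, `name`, and `type` columns. The storage engine, the column names, and the placeholder QID are assumptions for illustration, not taken from the existing schema:

```python
import sqlite3

# Existing types plus the two proposed ones.
ENTITY_TYPES = ("player", "team", "coach", "competition", "match")


def create_entities_table(conn):
    """Create a hypothetical entities table whose CHECK allows the new types."""
    type_list = ", ".join(f"'{t}'" for t in ENTITY_TYPES)
    conn.execute(
        f"""
        CREATE TABLE entities (
            qid  TEXT PRIMARY KEY,
            name TEXT NOT NULL,
            type TEXT NOT NULL CHECK (type IN ({type_list}))
        )
        """
    )


conn = sqlite3.connect(":memory:")
create_entities_table(conn)

# 'Q00000' is a placeholder, not a real Wikidata item.
conn.execute(
    "INSERT INTO entities VALUES (?, ?, ?)",
    ("Q00000", "Example League", "competition"),
)

# A type outside the list is still rejected by the constraint.
try:
    conn.execute("INSERT INTO entities VALUES (?, ?, ?)", ("Q00001", "x", "stadium"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM entities").fetchone()[0]
```

If the store is SQLite, note that an existing CHECK constraint cannot be altered in place, so a migration would need to rebuild the table and copy the rows across.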
SPARQL extraction
- New queries for competition entities (Q500834 = association football league, etc.)
- New queries for match entities (if feasible at scale -- 26K is manageable)
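
One possible shape for the match query, sketched as a Python helper that only builds the SPARQL text. It uses the three match ID properties from the table above and returns one row per (item, property, ID). The function name, the page size, and the paging approach are assumptions, and the query has not been run against the Wikidata Query Service here:

```python
# Match ID properties listed in the Motivation table.
MATCH_ID_PROPERTIES = {
    "P7455": "transfermarkt",
    "P7460": "flashscore",
    "P13665": "fbref",
}


def build_match_id_query(limit=1000, offset=0):
    """Build a paged SPARQL query for items carrying any match ID property."""
    props = " ".join(f"wdt:{pid}" for pid in MATCH_ID_PROPERTIES)
    return (
        "SELECT ?match ?prop ?id WHERE {\n"
        f"  VALUES ?prop {{ {props} }}\n"
        "  ?match ?prop ?id .\n"
        "}\n"
        # ORDER BY keeps LIMIT/OFFSET pages stable between requests.
        "ORDER BY ?match ?prop\n"
        f"LIMIT {limit} OFFSET {offset}"
    )


query = build_match_id_query(limit=500, offset=1000)
print(query)
```

Selecting by ID property rather than by entity class sidesteps the question of which class match items use, at the cost of missing matches that have no provider ID at all. At roughly 26K rows, a few dozen pages of 1,000 would cover the full set.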
CSV export
- New `data/competitions.csv` and optionally `data/matches.csv`
- Or add to existing tables with appropriate type column
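
For the separate-file option, a rough sketch of what the competition export could look like. The column list is an assumption derived from the competition metadata proposed under Schema, and the row is placeholder data, not a real Wikidata item:

```python
import csv
import io

# Hypothetical columns, following the proposed competition metadata.
COMPETITION_FIELDS = ["qid", "name", "country", "tier", "format"]


def write_competitions_csv(rows, fh):
    """Write competition rows (dicts keyed by COMPETITION_FIELDS) to a file object."""
    writer = csv.DictWriter(fh, fieldnames=COMPETITION_FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Placeholder row for illustration only.
example_rows = [
    {
        "qid": "Q00000",
        "name": "Example League",
        "country": "Exampleland",
        "tier": 1,
        "format": "league",
    }
]

buf = io.StringIO()
write_competitions_csv(example_rows, buf)
csv_text = buf.getvalue()
```

In the real export, `fh` would be an open handle on `data/competitions.csv` (opened with `newline=""`) rather than an in-memory buffer.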
API
- Existing endpoints should work (search/resolve/lookup already filter by type)
- May want `/stats` to show competition/match counts
Open questions
- Are match entities useful enough to justify the schema complexity?
- Should competitions include historical/defunct leagues?
- How to handle competition seasons (PL 2024/25 vs PL as a concept)?
Priority
Medium -- the current player/team/coach coverage is the core value. Competitions would be a nice-to-have for power users.