Investigate JohnComonitski/SoccerAPI as supplementary data source #2

@rahulkeerthi

Summary

JohnComonitski/SoccerAPI claims 82,000+ players mapped across API-Football, FBref, Transfermarkt, and Understat. Investigate whether it's worth extracting their mapping data.

What it offers

  • 82K players, 4.2K teams, 200 leagues
  • Cross-maps: API-Football ↔ FBref ↔ Transfermarkt ↔ Understat
  • MIT licensed

Current assessment: low priority

Why we haven't included it:

  1. Not a downloadable dataset. It's a Python framework that builds its own Postgres database by scraping providers. There's no CSV/JSON dump to download: you would have to set up Postgres, install the package, and run their scraping pipeline to populate the DB before any mapping data exists.

  2. We already cover these providers. The four providers it maps (API-Football, FBref, Transfermarkt, Understat) are all in our register via:

    • Wikidata: Transfermarkt (209K), FBref (106K)
    • FPL-ID-Map: Understat (2.3K), WhoScored (2.4K)
    • Custom matching: API-Football (101)

  3. Scale overlap. Wikidata gives us 386K players globally. SoccerAPI's 82K would mostly be a subset of what we already have.

  4. Infrastructure cost. Setting up their Postgres pipeline just to extract a mapping table is disproportionate effort.
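
If a mapping table ever becomes available, the "mostly a subset" assumption in point 3 could be checked directly rather than estimated. A minimal sketch, assuming both our register and the candidate mapping can be loaded as lists of dicts that share one provider ID column. The field names (`transfermarkt_id`, `api_football_id`) and the toy rows are hypothetical and are not SoccerAPI's actual schema:

```python
def overlap_report(register_rows, candidate_rows, key="transfermarkt_id"):
    """Compare two ID mappings on a shared key and summarise the overlap."""
    # Ignore rows where the shared key is missing or empty.
    register_keys = {row[key] for row in register_rows if row.get(key)}
    candidate_keys = {row[key] for row in candidate_rows if row.get(key)}

    shared = register_keys & candidate_keys
    new = candidate_keys - register_keys
    return {
        "candidate_total": len(candidate_keys),
        "already_in_register": len(shared),
        "new_to_register": len(new),
        # Fraction of the candidate mapping we already hold.
        "subset_ratio": len(shared) / len(candidate_keys) if candidate_keys else 0.0,
    }


# Toy data for illustration only (hypothetical IDs).
register = [
    {"transfermarkt_id": "1"},
    {"transfermarkt_id": "2"},
    {"transfermarkt_id": "3"},
]
candidate = [
    {"transfermarkt_id": "2", "api_football_id": "a"},
    {"transfermarkt_id": "3", "api_football_id": "b"},
    {"transfermarkt_id": "9", "api_football_id": "c"},
]

report = overlap_report(register, candidate)
print(report)
```

A high `subset_ratio` would support keeping this deferred; a large `new_to_register` count, especially rows carrying API-Football IDs, would be a reason to revisit.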

When it might be worth revisiting

  • If they publish a static dump (CSV/SQLite) of just the ID mapping table
  • If we find significant gaps in our API-Football or Understat coverage for non-PL leagues
  • If someone extracts and shares the mapping table independently
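
The second trigger (coverage gaps outside the PL) can be made measurable. A rough sketch, assuming each register row records a league and an optional ID per provider. The field names (`league`, `api_football_id`), the toy rows, and the 0.8 threshold are illustrative assumptions, not our actual register schema or an agreed cut-off:

```python
from collections import defaultdict


def coverage_by_league(rows, provider_field):
    """Return the fraction of players per league that have an ID for one provider."""
    totals = defaultdict(int)
    mapped = defaultdict(int)
    for row in rows:
        league = row["league"]
        totals[league] += 1
        if row.get(provider_field):  # missing or empty ID counts as unmapped
            mapped[league] += 1
    return {league: mapped[league] / totals[league] for league in totals}


def leagues_below(coverage, threshold):
    """List leagues whose coverage falls under the threshold, alphabetically."""
    return sorted(league for league, ratio in coverage.items() if ratio < threshold)


# Toy data for illustration only (hypothetical rows).
rows = [
    {"league": "Premier League", "api_football_id": "10"},
    {"league": "Premier League", "api_football_id": "11"},
    {"league": "La Liga", "api_football_id": "20"},
    {"league": "La Liga", "api_football_id": None},
    {"league": "Serie A"},
]

coverage = coverage_by_league(rows, "api_football_id")
gaps = leagues_below(coverage, threshold=0.8)
print(coverage)
print(gaps)
```

Running the same check with an Understat ID field would cover the other provider named in that bullet.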

Action

Defer until one of the conditions above is met. Check the SoccerAPI repository periodically for a static data export.
