Add probabilistic evaluation metrics (CRPS, rank histograms) via existing scores dependency #3

@GiGiKoneti

Description

Hi team,

While going through the downstream validation pipeline for the neural-lam
probabilistic forecasting track (issue mllam/neural-lam#62), I noticed
that mllam-verification currently only covers deterministic
statistics like rmse and mae.

pyproject.toml already pulls in scores>=1.2.0, which exposes exactly
what's needed for ensemble evaluation:

  • scores.probability.crps_for_ensemble — for member-indexed ensemble
    outputs, which is the format neural-lam's datastore already uses
    consistently (ensemble_member dimension in weather_dataset.py
    and datastore/base.py)
  • scores.plotdata.rank_histogram — for Talagrand diagram evaluation
    of ensemble calibration
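For context, the ensemble CRPS that crps_for_ensemble computes is just the empirical (ecdf-style) form of the continuous ranked probability score. A minimal standalone numpy sketch of that formula (the function name and array shapes here are illustrative only, not the scores API, which operates on xarray objects with an ensemble_member dimension):

```python
import numpy as np

def crps_ensemble(ensemble, obs):
    """Empirical (ecdf-style) ensemble CRPS, averaged over samples.

    ensemble: (n_samples, n_members) array of member forecasts
    obs:      (n_samples,) array of observations

    CRPS = mean_i |x_i - y| - (1 / (2 M^2)) * sum_ij |x_i - x_j|
    """
    ens = np.asarray(ensemble, dtype=float)
    y = np.asarray(obs, dtype=float)[:, None]
    m = ens.shape[1]
    # Mean absolute error of each member against the observation
    term1 = np.abs(ens - y).mean(axis=1)
    # Halved mean absolute pairwise spread between members
    term2 = np.abs(ens[:, :, None] - ens[:, None, :]).sum(axis=(1, 2)) / (2 * m * m)
    return float((term1 - term2).mean())

# A single-member "ensemble" degenerates to the absolute error
print(crps_ensemble([[3.0]], [1.0]))  # 2.0
```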

Two concrete additions that would follow the existing architecture exactly:

  1. A crps() function in statistics.py wrapping
    scores.probability.crps_for_ensemble via compute_pipeline_statistic,
    following the same pattern rmse uses to wrap scores.continuous.rmse
  2. A plot_rank_histogram() in plot.py wrapping
    scores.plotdata.rank_histogram — following the plot_single_metric_timeseries
    structure
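To illustrate what the rank-histogram side of this computes: each observation is ranked against the sorted ensemble members, giving M + 1 bins whose counts should be flat for a calibrated ensemble. A minimal numpy sketch of that counting step (hypothetical function name; this is not the scores implementation, and a real implementation should also randomize ties):

```python
import numpy as np

def rank_histogram_counts(ensemble, obs):
    """Bin counts for a Talagrand (rank) histogram.

    ensemble: (n_samples, n_members) array of member forecasts
    obs:      (n_samples,) array of observations
    Returns counts over M + 1 rank bins.
    """
    ens = np.asarray(ensemble, dtype=float)
    y = np.asarray(obs, dtype=float)
    # Rank = number of members the observation exceeds (ties ignored here)
    ranks = (ens < y[:, None]).sum(axis=1)
    return np.bincount(ranks, minlength=ens.shape[1] + 1)

# Obs falls between the 2nd and 3rd of three members -> rank 2
print(rank_histogram_counts([[1.0, 2.0, 3.0]], [2.5]).tolist())  # [0, 0, 1, 0]
```

Under- or over-dispersed ensembles show up as U-shaped or dome-shaped histograms, which is what makes this a useful calibration diagnostic alongside CRPS.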

No new dependencies needed. Both functions already exist in the
pinned scores version.

If this makes sense, I'll go ahead and implement it; otherwise just
let me know and I'll close the issue.
