ODD PI 26.2 Objective 5: 🤪 Expand virtualization support for quirky datasets #308

@abarciauskas-bgse

Why us?

We are uniquely positioned to deliver this objective as core developers of Zarr-Python, VirtualiZarr, and titiler.

Why now?

  • Many NASA datasets have non-uniform rectilinear chunk grids, which are not yet supported by virtualization technologies. We built a prototype at the Zarr Summit but need to push it over the finish line.
  • NASA data producers want to produce Icechunk stores using DMR++ files as a starting point, but this is not always possible because many DMR++ files have “inlined” variables. We want to unblock this distribution strategy. As of August 2025, 15% of DMR++ files can be opened by VirtualiZarr.

Milestone

External dataset support

Acceptance Criteria

Rectilinear Chunk Grid Support

  • Help release rectilinear chunk grid support in Zarr-Python by contributing code reviews
  • Implement support of rectilinear chunk grids in VirtualiZarr
  • Demonstrate rectilinear chunk grid virtualization using a NASA dataset
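
For readers new to the term: in a rectilinear chunk grid, chunk lengths can differ from one chunk to the next along an axis, so locating a chunk needs a lookup rather than a single division. The sketch below illustrates the idea in plain Python only. It is not the Zarr-Python or VirtualiZarr API, and the function name `locate` and the month-length example are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(chunk_sizes, position):
    """Return (chunk number, offset inside that chunk) for one axis.

    chunk_sizes: lengths of consecutive chunks along the axis; they may differ.
    position: zero-based index along the axis.
    """
    # Exclusive end position of each chunk, e.g. [31, 28, 31] -> [31, 59, 90].
    edges = list(accumulate(chunk_sizes))
    if not 0 <= position < edges[-1]:
        raise IndexError("position is outside the axis")
    # The first chunk whose end lies strictly beyond the position holds it.
    chunk = bisect_right(edges, position)
    start = edges[chunk - 1] if chunk > 0 else 0
    return chunk, position - start


# Hypothetical time axis: daily values chunked by calendar month.
month_lengths = [31, 28, 31, 30]
print(locate(month_lengths, 30))   # last day of the first chunk
print(locate(month_lengths, 31))   # first day of the second chunk
print(locate(month_lengths, 100))  # a day inside the fourth chunk
```

With a regular grid the same lookup is just `divmod(position, chunk_size)`; the per-chunk edges are what a rectilinear grid adds.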

DMR++ Support

  • Increase the proportion of DMR++ files that can be virtualized to Zarr by implementing support in VirtualiZarr for producing Icechunk stores from DMR++ files and Kerchunk files with “inlined” variables
  • Demonstrate approach using a collection with DMR++ files from GES DISC

LOE: 0.5 FTE for 2 sprints

Upstream issues / PRs