
ODD PI 26.2 Objective 6: 🛰️ Propose unified strategy for virtualization of orbital swath (“L2”) data #309

@abarciauskas-bgse

Why us?

We have developed many of the underlying technologies for virtualization, led numerous workshops and presentations on the concept, and are well-positioned to collaborate across NASA to propose a unified strategy.

Why now?

Cloud-native access to L2 data is particularly difficult. These challenges will be compounded by the influx of data from the NISAR project. We aim to develop a strategy to solve these issues.

Milestone

  • Sprint 1: Identify use cases, design benchmarking plan
  • Sprints 2-3: Execute benchmarking
  • Sprint 4: Complete unified strategy document

Acceptance Criteria

  • Produce a diagram or ADR describing the problems and use cases this work is trying to solve
  • Benchmark at least one approach
  • Stretch: Production of a unified strategy document for virtualization and metadata querying of L2 data

Tasks

  • Groundwork phase (sprint 1)
    • Identify use cases for new L2 solutions, e.g., WorldView, MAAP?, ISMIP6/7, project-focused virtualization
    • Produce a design document for benchmarking the scalability of multiple metadata management schemes, including zarr-datafusion-search, stac-geoparquet, and Icechunk, on metrics such as time to write, time to query, and time to query with different filters
      • Proposed: Generate a demonstration Icechunk store containing a metadata schema with 5 or more "columns" (with one of them being geometry) and 1 million entries for HLS data stored in a bucket in us-west-2.
  • Evaluation phase (sprints 2-3)
    • Implement and run benchmarks for comparison to other data and metadata management options
  • Reporting phase (sprint 4)
    • Collaborate on a unified strategy document for metadata management for L2 data
  • Ongoing
    • Prepare Zarr V3 standard for the ESCO office to enable virtualization of L2 products
  • Stretch:
    • Investigate visualization approaches for virtualized L2 data
    • Incrementally add user-required functionality such as spatial search and schema projections
    • Prepare Icechunk standard for the ESCO office to enable virtualization of L2 products
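
As a possible starting point for the benchmarking design document in the groundwork phase, here is a minimal sketch of a harness that records the three metrics named there: time to write, time to query, and time to query with different filters. Everything in it is an assumption made for illustration: the column names, the synthetic records, the function names, and the filters are hypothetical, and the in-memory SQLite table is only a stand-in backend so the sketch runs with the standard library alone. It is not one of the candidate schemes; a real benchmark would swap in zarr-datafusion-search, stac-geoparquet, and Icechunk behind the same write and query steps.

```python
import random
import sqlite3
import time


def make_records(n, seed=0):
    """Generate n synthetic metadata rows (hypothetical schema, not real HLS metadata).

    Five logical columns: granule_id, product, day_of_year, cloud_cover, and a
    geometry, stored here as a 1x1 degree bounding box in four float fields.
    """
    rng = random.Random(seed)
    records = []
    for i in range(n):
        west = rng.uniform(-180.0, 179.0)
        south = rng.uniform(-90.0, 89.0)
        records.append((
            f"granule-{i:07d}",        # granule_id (made-up identifier)
            rng.choice(["L30", "S30"]),  # product (illustrative labels)
            rng.randint(1, 365),       # day_of_year
            rng.uniform(0.0, 100.0),   # cloud_cover, percent
            west, south, west + 1.0, south + 1.0,  # bounding box
        ))
    return records


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def write_records(conn, records):
    """Stand-in write step: create the table and insert every record."""
    conn.execute(
        "CREATE TABLE granules (granule_id TEXT, product TEXT, "
        "day_of_year INTEGER, cloud_cover REAL, "
        "west REAL, south REAL, east REAL, north REAL)"
    )
    conn.executemany("INSERT INTO granules VALUES (?, ?, ?, ?, ?, ?, ?, ?)", records)
    conn.commit()
    return len(records)


# Hypothetical filters: no filter, an attribute filter, and a bounding-box
# intersection against the query box west=0, east=10, south=40, north=50.
FILTERS = {
    "all": ("1 = 1", ()),
    "attribute": ("cloud_cover < ?", (10.0,)),
    "spatial": (
        "west < ? AND east > ? AND south < ? AND north > ?",
        (10.0, 0.0, 50.0, 40.0),
    ),
}


def run_benchmark(n=10_000):
    """Time one write and one query per filter; return a dict of measurements."""
    records = make_records(n)
    conn = sqlite3.connect(":memory:")
    results = {}
    _, results["write_s"] = timed(write_records, conn, records)
    for name, (where, params) in FILTERS.items():
        sql = f"SELECT COUNT(*) FROM granules WHERE {where}"
        row, elapsed = timed(lambda: conn.execute(sql, params).fetchone())
        results[f"query_{name}_s"] = elapsed
        results[f"query_{name}_rows"] = row[0]
    conn.close()
    return results


if __name__ == "__main__":
    for key, value in run_benchmark().items():
        print(key, value)
```

Keeping record generation, the write step, and the filter definitions separate means each candidate scheme would only need its own write and query functions, while the synthetic data and timing logic stay identical across runs. The default of 10,000 rows is far below the proposed 1 million entries and is chosen only so the sketch finishes quickly.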