Open
Labels
pi-26.2-objective, repo:datafusion-contrib/arrow-zarr, repo:developmentseed/async-tiff, repo:developmentseed/obspec, repo:developmentseed/obspec-utils, repo:developmentseed/obstore, repo:developmentseed/zarr-datafusion-search, repo:geoarrow/geoarrow-rs, repo:virtual-zarr/virtual-tiff, repo:zarr-developers/virtualizarr
Description
Why us?
We have developed many of the underlying technologies for virtualization, led numerous workshops and presentations on the concept, and are well-positioned to collaborate across NASA to propose a unified strategy.
Why now?
Cloud-native access to L2 data is particularly difficult. These challenges will be compounded by the influx of data from the NISAR project. We aim to develop a strategy to solve these issues.
Milestone
- Sprint 1: Identify use cases, design benchmarking plan
- Sprints 2-3: Execute benchmarking
- Sprint 4: Complete the unified strategy document
Acceptance Criteria
- Produce a diagram or ADR describing the problems and use cases this work is meant to solve
- Benchmark at least one approach
- Stretch: Produce a unified strategy document for virtualization and metadata querying of L2 data
Tasks
- Groundwork phase (Sprint 1)
  - Identify use cases for new L2 solutions, e.g., WorldView, MAAP?, ISMIP6/7, project-focused virtualization
  - Produce a design document for benchmarking scalability across multiple metadata management schemes, including zarr-datafusion-search, stac-geoparquet, and icechunk (e.g., time to write, time to query, time to query with different filters)
  - Proposed: Generate a demonstration Icechunk store containing a metadata schema with 5 or more "columns" (one of them being geometry) and 1 million entries for HLS data stored in a bucket in us-west-2.
- Evaluation phase (Sprints 2-3)
  - Implement and run benchmarks for comparison to other data and metadata management options
- Reporting phase (Sprint 4)
  - Collaborate on a unified strategy document for metadata management for L2 data
- Ongoing
  - Prepare the Zarr V3 standard for the ESCO office to enable virtualization of L2 products
- Stretch
  - Investigate visualization approaches for virtualized L2 data
  - Incrementally add user-required functionality, such as spatial search and schema projections
  - Prepare the Icechunk standard for the ESCO office to enable virtualization of L2 products
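
To make the benchmarking task more concrete (time to write, time to query, time to query with different filters), here is a rough sketch of what a harness could look like. It is an illustration only, not the planned design. It uses an in-memory SQLite table from the Python standard library as a stand-in backend, not zarr-datafusion-search, stac-geoparquet, or Icechunk. The column names, the bounding-box encoding of geometry, the synthetic values, and the helper names (`make_records`, `timed`, `run_benchmark`) are all assumptions made for this example.

```python
import random
import sqlite3
import time
from dataclasses import dataclass


@dataclass
class BenchResult:
    name: str
    seconds: float
    rows: int


def make_records(n, seed=0):
    """Generate n synthetic rows: 4 attribute columns plus a bbox geometry."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        west = rng.uniform(-180.0, 179.0)
        south = rng.uniform(-90.0, 89.0)
        rows.append((
            f"granule-{i:07d}",           # id (hypothetical naming)
            rng.randint(0, 364),          # day offset (hypothetical time column)
            rng.uniform(0.0, 100.0),      # cloud cover percent (synthetic)
            rng.choice(["L30", "S30"]),   # product label (placeholder values)
            west, south, west + 1.0, south + 1.0,  # geometry as a 1x1 degree bbox
        ))
    return rows


def timed(name, fn):
    """Run fn once and record wall-clock time and the row count it returns."""
    start = time.perf_counter()
    rows = fn()
    return BenchResult(name, time.perf_counter() - start, rows)


def run_benchmark(n=10_000):
    records = make_records(n)
    con = sqlite3.connect(":memory:")  # stand-in backend, assumption only
    con.execute(
        "CREATE TABLE items (id TEXT, day INTEGER, cloud REAL, product TEXT,"
        " west REAL, south REAL, east REAL, north REAL)"
    )

    def write():
        con.executemany("INSERT INTO items VALUES (?,?,?,?,?,?,?,?)", records)
        con.commit()
        return n

    def count(where, params=()):
        sql = f"SELECT COUNT(*) FROM items WHERE {where}"
        return lambda: con.execute(sql, params).fetchone()[0]

    results = [
        timed("write", write),
        timed("query: no filter", count("1=1")),
        timed("query: attribute filter",
              count("cloud < ? AND product = ?", (20.0, "S30"))),
        # Bbox intersection with the query box west=-110, south=35, east=-100, north=45.
        timed("query: bbox intersects",
              count("west <= ? AND east >= ? AND south <= ? AND north >= ?",
                    (-100.0, -110.0, 45.0, 35.0))),
    ]
    con.close()
    return results


if __name__ == "__main__":
    for r in run_benchmark():
        print(f"{r.name:28s} {r.seconds:.4f}s  rows={r.rows}")
```

Swapping the `write` and `count` closures for calls into each real metadata store would keep the timing and reporting code identical across backends. A real comparison would also need repeated runs, cold versus warm reads, and object-store latency, none of which this sketch covers.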