Virtual Zarr

Cloud-optimize your scientific data without copying it

Virtual Zarr enables performant, cloud-optimized access to archival data formats like netCDF and HDF5 — without duplicating any data.

The Ecosystem

Three powerful tools working together to bring cloud-native workflows to your existing data archives.

VirtualiZarr

Create virtual Zarr stores from archival data formats using a familiar xarray API. Supports netCDF4, HDF5, FITS, and more.

Icechunk

A transactional storage engine for Zarr. Commit virtual references with version control, time travel, and distributed writes.
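
To make "commit" and "time travel" concrete, here is a toy in-memory model of versioned chunk references. It is only a sketch of the idea: the class `ToyRefRepo`, its methods, and the `s3://example-bucket/...` paths are all hypothetical and are not Icechunk's API, and distributed writes are not modeled at all.

```python
class ToyRefRepo:
    """Toy model of committed, versioned chunk references (hypothetical, not Icechunk)."""

    def __init__(self):
        self._snapshots = []  # one (message, refs) pair per commit
        self._staged = {}     # working copy of chunk key -> (path, offset, length)

    def set_ref(self, key, path, offset, length):
        # Stage a reference to a byte range in an existing file.
        self._staged[key] = (path, offset, length)

    def commit(self, message):
        # Freeze the staged references as a new snapshot and return its id.
        self._snapshots.append((message, dict(self._staged)))
        return len(self._snapshots) - 1

    def checkout(self, snapshot_id):
        # "Time travel": return the references exactly as they were at that commit.
        return dict(self._snapshots[snapshot_id][1])


repo = ToyRefRepo()
repo.set_ref("t/0", "s3://example-bucket/a.nc", 0, 100)  # hypothetical path
v0 = repo.commit("add first file")
repo.set_ref("t/1", "s3://example-bucket/b.nc", 0, 100)  # hypothetical path
v1 = repo.commit("append second file")

old = repo.checkout(v0)  # only the first reference
new = repo.checkout(v1)  # both references
```

Every snapshot holds only small reference records, so keeping many versions stays cheap in this model: the referenced files themselves are never rewritten.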

earthaccess

Search, download, or stream NASA Earth science data with just a few lines of code. Seamlessly integrates with virtual datacube workflows.

Figure: Virtual Zarr Pathways, showing how VirtualiZarr, Icechunk, and earthaccess work together.

Why Virtual Zarr?

Unlock cloud-native performance for your legacy scientific data without the hassle of data migration.

Faster Processing

Analyze a year of TEMPO data in 10 minutes instead of hours. Virtual references enable efficient parallel access.
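
Why do references help with parallelism? Each one names an independent byte range, so many can be fetched at once. The sketch below shows that pattern using only the Python standard library and a local stand-in file. The helper names `read_range` and `read_all` are assumptions for illustration. It is not a benchmark and does not reproduce the TEMPO figure above; in a real workflow the ranges would typically be requests against object storage issued by the libraries themselves.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_range(ref):
    """Read one referenced byte range: ref is (path, offset, length)."""
    path, offset, length = ref
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def read_all(refs, max_workers=4):
    """Fetch every referenced range concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(read_range, refs))


# Stand-in for an archival file: 16 chunks of 4 bytes each.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    for i in range(16):
        f.write(bytes([i]) * 4)

# One reference per chunk; each can be read without touching the others.
refs = [(path, i * 4, 4) for i in range(16)]
chunks = read_all(refs)
os.remove(path)
```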

No Data Duplication

Create virtual datacubes that reference existing files. No need to copy or convert terabytes of archival data.
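
As a rough illustration of what "referencing existing files" means, the sketch below builds a small mapping from chunk keys to `[path, offset, length]` entries, the shape used by Kerchunk-style references, and reads chunks straight out of the original file. The file layout, the chunk keys, and the `read_chunk` helper are assumptions made up for this example. Real tools derive the offsets by parsing the file's metadata.

```python
import json
import os
import tempfile


def read_chunk(refs, key):
    """Return the bytes a reference points at, read directly from the source file."""
    path, offset, length = refs[key]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# Stand-in for an archival file: a 4-byte header followed by two 8-byte chunks.
tmpdir = tempfile.mkdtemp()
archive_path = os.path.join(tmpdir, "archive.bin")
with open(archive_path, "wb") as f:
    f.write(b"HDR!" + b"AAAAAAAA" + b"BBBBBBBB")

# The "virtual store" is just this mapping; no array data is copied.
refs = {
    "temperature/0": [archive_path, 4, 8],
    "temperature/1": [archive_path, 12, 8],
}

# The whole manifest serializes to a few lines of JSON.
manifest_json = json.dumps(refs)

chunk0 = read_chunk(refs, "temperature/0")
chunk1 = read_chunk(refs, "temperature/1")
```

The manifest grows with the number of chunks, not with the volume of data, which is why terabytes of archives can be described without being converted.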

Works with Archives

Access netCDF, HDF5, and other legacy formats as if they were cloud-optimized Zarr stores.

Familiar Workflow

Use the xarray and Python tools you already know. Virtual Zarr integrates seamlessly with your existing code.

Standing on the Shoulders of Giants

Virtual Zarr builds on decades of work in scientific data formats, remote data access, and computer science fundamentals.

OPeNDAP

Pioneered remote data access and the DMR++ metadata format

HDF Group

Chunk-level access and the foundations of scientific data storage

Kerchunk

Originated the concept of virtual Zarr references

fsspec

Python filesystem abstraction enabling cloud-native access

Full history coming March 2026

Contributors

Virtual Zarr is made possible by ASDC, ASF, CarbonPlan, Development Seed, Earthmover, GES DISC, LP DAAC, NASA Earthdata, NSIDC, OB.DAAC, Openscapes, ORNL DAAC, PO.DAAC, and the Data Systems Evolution team at NASA Marshall Space Flight Center.