May 6, 2025 | 4 min read
Joe Hamman

Co-founder and CTO

Zarr takes Cloud-Native Geospatial by storm

Our takeaways from the Cloud-Native Geospatial conference on Zarr’s surging adoption and its impact on the future of Earth Observation data.

Our team just returned from an action-packed week at the Cloud-Native Geospatial conference in beautiful Snowbird, Utah, and the key takeaway was unmistakable: Zarr adoption is surging. It’s no longer confined to weather and climate data; organizations of all sizes are increasingly choosing Zarr for a diverse array of geospatial data products. The collective enthusiasm for Zarr was palpable, and frankly, its rapid momentum was surprising even to us.

What We Heard: Zarr’s Expanding Footprint

The conference began with a presentation by our own engineers, Lindsey Nield and Deepak Cherian, titled “Zarr for Cloud-native Geospatial: When and Why?” They initially positioned Zarr as ideal for Level 3+ Earth Observation (EO) products, with Cloud-Optimized GeoTIFFs (COGs) better suited to most Level 1 and 2 data, but the week’s discussions painted a more nuanced picture: Zarr’s utility is broadening, with significant adoption for Level 1 and 2 EO applications as well. Here are some highlights:

  • Sentinel 1/2/3: The new Copernicus Earth Observation Processing Framework (EOPF) will leverage Zarr for generating Level 1 and 2 datasets. This pivotal move will eventually make hundreds of petabytes of Sentinel data available in Zarr format. More details can be found here.
  • Landsat: Similarly, the USGS EROS is actively exploring Zarr for the extensive Landsat archive. USGS shared preliminary findings from a study evaluating Zarr as the next-generation format for Landsat data processing, product archiving, and metadata management at scale. USGS plans to archive over 200 petabytes of Landsat data products in the cloud over the next 15 years.
  • Google Earth Engine: We also learned that Google Earth Engine is developing Zarr support. This enhancement will greatly simplify data import and export for Earth Engine users, fostering better interoperability with the vast amounts of Zarr data already in cloud storage.
  • ESRI ArcGIS: Zarr support has recently landed in ESRI’s ArcGIS Pro product. This product, which has been a mainstay of desktop GIS for decades, now supports adding “Multidimensional raster data” layers in the form of Zarr stores.
  • Hyperspectral Remote Sensing: The “hallway track” was abuzz with how Zarr can benefit hyperspectral remote sensing data, whose many spectral bands are often poorly supported by COGs and other traditional formats. Zarr treats the band dimension as just another array axis, sidestepping a common pain point for COGs.
  • Zarr + STAC: The synergy between Zarr and STAC (SpatioTemporal Asset Catalogs) was another prominent theme. STAC, widely adopted for defining and searching geospatial data, perfectly complements Zarr’s strengths in storing cloud-optimized array data. This combination is proving powerful, especially within the Earth modeling community, championed by Pangeo. Julia Signell’s insightful blog post here offers a deeper dive.
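To make the pairing concrete, here is a minimal sketch of a STAC Item that points at a Zarr datacube as one of its assets. It assumes the pystac library; the bucket path, item ID, and the "application/vnd+zarr" media type are illustrative placeholders rather than a prescribed convention.

```python
from datetime import datetime, timezone

import pystac

# Hypothetical Zarr store location -- substitute a real bucket/prefix.
zarr_href = "s3://example-bucket/example-datacube.zarr"

# A STAC Item describing a global datacube, discoverable via STAC search.
item = pystac.Item(
    id="example-datacube",
    geometry={
        "type": "Polygon",
        "coordinates": [[[-180, -90], [180, -90], [180, 90], [-180, 90], [-180, -90]]],
    },
    bbox=[-180, -90, 180, 90],
    datetime=datetime(2025, 1, 1, tzinfo=timezone.utc),
    properties={},
)

# Attach the Zarr store itself as the Item's data asset.
item.add_asset(
    "data",
    pystac.Asset(
        href=zarr_href,
        media_type="application/vnd+zarr",  # media type often used for Zarr assets
        roles=["data"],
    ),
)

print(item.to_dict())
```

A catalog built this way lets STAC handle search and discovery while Zarr handles chunked, cloud-optimized access to the arrays themselves.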

Zarr vs COG

Several talks directly compared Zarr to COG. In the “Performance and Scale” session, Zachariah Dicus presented USGS benchmarks of Zarr versus COG, and Jeff Albrecht highlighted the challenges of scaling a COG-centric tile-serving architecture. Floodbase offered a feature wish-list in “Why we don’t use Zarr (yet)”, and the birds-of-a-feather session on visualizing Zarr highlighted the work still needed to ensure compatibility with the broader geospatial tooling ecosystem.

Zarr 🤝 COG

Tom Nicholas’ talk on VirtualiZarr (slides here) argued that this divide can be bridged: “virtualizing” binary file formats such as netCDF and COG into Icechunk stores allows pre-existing data to be accessed as if it were Zarr. He again emphasized the theme of treating Level 3 data as a single cloud-optimized datacube, this time as a layer over existing data.
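As a rough illustration of that workflow, the sketch below builds a virtual view of an archival netCDF file and commits the chunk references to an Icechunk repository. It assumes recent virtualizarr and icechunk releases (these APIs are still evolving), and the file path and repository location are hypothetical.

```python
import icechunk
import xarray as xr
from virtualizarr import open_virtual_dataset

# Scan an existing netCDF file and build a "virtual" dataset: lightweight
# references to its chunks rather than a copy of the bytes. (Path is hypothetical.)
vds = open_virtual_dataset("archive/file_0001.nc")

# Persist those references into an Icechunk repository so the archival file
# can be read as if it were a native Zarr store.
storage = icechunk.local_filesystem_storage("/tmp/virtual-datacube")
repo = icechunk.Repository.create(storage)
session = repo.writable_session("main")
vds.virtualize.to_icechunk(session.store)
session.commit("Add virtual references for file_0001.nc")

# Downstream, the virtual store opens like any other Zarr datacube.
ds = xr.open_zarr(repo.readonly_session("main").store, consolidated=False)
```

The same pattern extends to many files: virtual datasets can be concatenated (for example along time) before writing, yielding a single datacube view over an entire archive.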

Earthmover’s Workshop: Hands-On with Zarr

On the second day, our Earthmover team hosted a 3-hour workshop, “Zarr, Icechunk, & Xarray for Cloud-native Geospatial Data-cube Analysis.” We were delighted to welcome nearly 50 attendees to explore:

  • Core concepts of using data cubes for organizing and analyzing Level 3 data.
  • Techniques for performing zonal statistics on global-scale raster data cubes (see the sketch after this list).
  • The power of virtualizing archival file formats using Virtualizarr and Icechunk.
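For the zonal-statistics portion, the minimal sketch below shows the general pattern: open a Zarr datacube with Xarray, label each grid cell with a zone, and reduce. The store paths and variable names are hypothetical placeholders, and it assumes the zone raster shares the datacube’s lat/lon grid.

```python
import xarray as xr

# Hypothetical Zarr stores -- substitute your own datacube and zone raster.
ds = xr.open_zarr("s3://example-bucket/precip-datacube.zarr", chunks={})
zones = xr.open_zarr("s3://example-bucket/country-zones.zarr", chunks={})["zone_id"]

# Zonal mean: group every grid cell by its zone label and average over space,
# keeping the time dimension. With dask-backed arrays, installing flox
# accelerates this groupby-reduce.
zonal_mean = ds["precip"].groupby(zones).mean()
result = zonal_mean.compute()
```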

The workshop materials are now available here. We welcome your feedback.

Conclusion: The Future is Cloud-Optimized and Zarr-Powered

While the CNG conference showcased a wealth of inspiring talks on how our community is leveraging technology to address real-world challenges, it undeniably highlighted Zarr’s maturation from an emerging technology to a rapidly adopted solution within the geospatial domain. From foundational Level 1 data to extensive Level 3+ products, and across diverse applications like hyperspectral imaging, Zarr is demonstrating its versatility, efficiency, and performance. The commitment from major entities like Copernicus, USGS, and Google Earth Engine signals a paradigm shift in how we will store, access, and analyze geospatial data in the cloud. As we navigate a future increasingly reliant on massive volumes of EO data, Zarr is poised to become a cornerstone of the cloud-native geospatial ecosystem, unlocking new frontiers for scientific discovery and operational applications.

However, to fully harness this momentum, crucial work remains. Enhancing compatibility with established tools like GDAL and ensuring seamless integration into existing GIS workflows are vital for broader adoption. Continued development and refinement of the GeoZarr specification are essential for interoperability and standardization across the community. Furthermore, addressing challenges around multiscale datasets and the efficient generation of overviews will be key to unlocking Zarr’s full potential for interactive visualization and analysis. Zarr is fast becoming the backbone of cloud-native, multi-dimensional geospatial workflows, bringing analysis-ready, scalable, and interoperable data within reach even as the community continues to build toward that future.
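Some of that compatibility already exists: GDAL has shipped a Zarr driver since version 3.4, so tools built on GDAL can read Zarr stores today, even if feature coverage is still uneven. Below is a hedged sketch of inspecting a (hypothetical) local store through GDAL’s multidimensional Python API.

```python
from osgeo import gdal

gdal.UseExceptions()

# Open a (hypothetical) local Zarr store through GDAL's multidimensional API.
ds = gdal.OpenEx("/data/example.zarr", gdal.OF_MULTIDIM_RASTER)
root = ds.GetRootGroup()

# List the arrays in the store and inspect the first one.
names = root.GetMDArrayNames()
print(names)
arr = root.OpenMDArray(names[0])
print([dim.GetName() for dim in arr.GetDimensions()], arr.GetDataType())
```

From there, standard utilities such as gdalmdiminfo and gdalmdimtranslate can inspect or convert the same store, which is one piece of the broader GIS-integration work described above.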