October 10, 2025 | 5 min read
Ryan Abernathey

CEO and Cofounder

Building the Future of Scientific Data at the Zarr Summit

Next week we are convening the Zarr community in Rome, Italy for a week of fast-paced collaboration and conversation. Given the recent acceleration of Zarr adoption across major data providers in Weather Forecasting, Earth Observation, and Bioimaging, this in-person event is critical for aligning stakeholders and defining the roadmap for the next phase of Zarr development. The Zarr Summit is jointly organized by Earthmover, Development Seed, and the Cloud Native Geospatial Forum, with foundational support from the Navigation Fund.

Background: What is Zarr

We recently published a whole blog post about this! But let’s recap.

Zarr is a simple but powerful open-source, cloud-native protocol for storing chunked, compressed N-dimensional arrays. Designed for performance, interoperability, and the cloud, Zarr is quietly transforming the way researchers and developers handle large datasets. Think of it as a lightweight, modular alternative to older formats like HDF5, designed to work especially well with cloud storage and large-scale computing.
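
To make "chunked, compressed N-dimensional arrays" concrete, here is a minimal sketch of the idea using only the Python standard library. It is not the Zarr format or the zarr-python API: the function names, the "i.j" key scheme, and the dict standing in for object storage are all assumptions made for illustration, and unlike this sketch, a real Zarr store also carries metadata describing shape, dtype, and codecs.

```python
import struct
import zlib


def write_chunked(rows, chunk_shape):
    """Split a 2-D grid of ints into compressed chunks keyed by chunk index.

    Hypothetical helper: the dict it returns stands in for a key-value
    store such as cloud object storage.
    """
    n_rows, n_cols = len(rows), len(rows[0])
    c_rows, c_cols = chunk_shape
    store = {}
    for i in range(0, n_rows, c_rows):
        for j in range(0, n_cols, c_cols):
            # Flatten this chunk's values in row-major order.
            block = [v for row in rows[i:i + c_rows] for v in row[j:j + c_cols]]
            raw = struct.pack(f"<{len(block)}i", *block)  # little-endian int32
            store[f"{i // c_rows}.{j // c_cols}"] = zlib.compress(raw)
    return store


def read_element(store, shape, chunk_shape, r, c):
    """Read one value by decompressing only the chunk that contains it."""
    c_rows, c_cols = chunk_shape
    ci, cj = r // c_rows, c // c_cols
    raw = zlib.decompress(store[f"{ci}.{cj}"])
    # Edge chunks may be narrower than chunk_shape in this sketch.
    width = min(c_cols, shape[1] - cj * c_cols)
    offset = (r % c_rows) * width + (c % c_cols)
    return struct.unpack_from("<i", raw, offset * 4)[0]


rows = [[r * 10 + c for c in range(6)] for r in range(4)]
store = write_chunked(rows, chunk_shape=(2, 3))
value = read_element(store, (4, 6), (2, 3), 3, 4)
```

The point of the layout is that reading one value touches one small compressed object rather than the whole array, which is what makes this style of storage a good match for object stores and parallel readers.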

Zarr is meant for storing large ND-arrays or tensors, rather than tables. Most popular cloud data platforms, like Snowflake and Databricks, are built for tables. In our blog post Tensors vs Tables, we explain why tables are often a poor fit for scientific data from sensors, satellites, simulations, and microscopes. For these applications, we need arrays.

Zarr is an open-source, community-developed, and community-governed project. No one company or organization owns or controls Zarr, and decisions are made in a decentralized way. Because it’s a file format (rather than just a software package), the decisions made by the Zarr community last a long time! Data written today need to be readable a decade from now. As a growing number of large organizations, from Google and NVIDIA to ESA and NASA, come to depend on Zarr, the challenge of balancing different organizations’ requirements is growing.

As an example of this adoption, ESA recently announced that they are transitioning all of the Sentinel Missions to produce Zarr as the primary data format. This means that petabytes of critical EO data, and all of the applications that rely on this data, will now depend on Zarr.

It’s time to get stakeholders off the internet and into the same room, face to face!

Zarr Summit: Goals and Structure

The summit consists of two parts: a developer summit and an adopter summit.

At the developer summit, we’ll be sprinting to advance some key features of Zarr, including:

  • Irregular chunk grids – more flexible, non-uniform chunking strategies
  • Sharding – packing multiple chunks into single files / shards
  • New codecs – supporting the most advanced compression strategies
  • New dtypes – including low-precision floats used in AI as well as geometry types
  • Cross-language interoperability – verifying compatibility across Python, C++, Rust, Java, and JavaScript
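
Two of the items above, irregular chunk grids and sharding, are easy to picture with a small standard-library sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the chunk edges are given as an explicit list, and the shard index is kept as a Python list of (offset, length) pairs. The actual Zarr specifications define their own metadata and binary layouts, which this does not reproduce.

```python
import bisect
import zlib


def locate(edges, index):
    """Map a 1-D index to (chunk number, offset in chunk) on an irregular grid.

    `edges` holds the start of every chunk plus the total length,
    e.g. [0, 3, 4, 10] describes chunks of length 3, 1, and 6.
    """
    if not 0 <= index < edges[-1]:
        raise IndexError(index)
    chunk = bisect.bisect_right(edges, index) - 1
    return chunk, index - edges[chunk]


def pack_shard(chunks):
    """Concatenate compressed chunks into one blob plus an (offset, length) index."""
    blob = bytearray()
    index = []
    for chunk in chunks:
        payload = zlib.compress(chunk)
        index.append((len(blob), len(payload)))
        blob += payload
    return bytes(blob), index


def read_chunk(blob, index, n):
    """Decompress chunk `n` using only its byte range within the shard."""
    offset, length = index[n]
    return zlib.decompress(blob[offset:offset + length])


data = bytes(range(10))
edges = [0, 3, 4, 10]
chunks = [data[a:b] for a, b in zip(edges, edges[1:])]
blob, index = pack_shard(chunks)

chunk_no, within = locate(edges, 5)
value = read_chunk(blob, index, chunk_no)[within]
```

Packing many chunks into one object keeps the object count manageable on cloud storage, while the per-chunk index means a reader can still fetch a single chunk with one byte-range request.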

We expect to finish the week with major progress on key technical issues, which will help accelerate Zarr into the future.

At the adopter summit, we’ll be featuring:

  • Roundtables – opportunity for adopters to meet with implementers
  • Migration Guidance – walking through the Zarr V3 adoption process and addressing implementation gaps
  • Virtual Zarr Guidance – unlocking cloud-native data access for massive archival datasets
  • Real-world Use Cases – sharing challenges and solutions from production deployments

Huge shout out to Max Jones of Development Seed for doing lots of the heavy lifting on the agenda and logistics!

Earthmover’s Contributions to Zarr and the Zarr Summit

Earthmover plays a leading role in the Zarr ecosystem. Our founders Ryan and Joe built the original integration between Xarray and Zarr, which unlocked Zarr as a format for geospatial data. Today Ryan sits on the Zarr steering council, and our team drives a large fraction of Zarr Python development.

In addition to co-organizing the summit, we’re sending Ryan, Joe, Seba, and Tom to participate. Here’s a rundown of some of the presentations you can expect from our team at the Adopter Summit:

  • Ryan Abernathey – Keynote: What we need from Zarr 4 
  • Joe Hamman – How to make Zarr go zoom zoom!
  • Sebastian Galkin – Icechunk, or how to put your Zarr data in production
  • Tom Nicholas – VirtualiZarr: A bridge from archival file formats to Zarr

We’re proud to be driving this critical project forward, in partnership with other awesome organizations like Development Seed, B-Open, and Scalable Minds.

We’re also sponsoring a happy hour and networking reception on Wednesday, Oct. 15. Sign up here if you’re interested in joining!

We’ll also be live posting updates from the summit. Be sure to follow Earthmover on LinkedIn to stay up to date on the latest Zarr developments.

Zarr and the Earthmover Platform

Earthmover is the cloud-native platform for weather, climate, and geospatial data. Built by the maintainers of the popular open-source Xarray, Zarr, and Icechunk packages, Earthmover’s data model, based on tensors rather than tables, is the natural fit for weather forecasts, climate models, and hyperspectral Earth-observation data. With cloud-optimized tensor storage, advanced data catalog and governance, high-performance data loaders, and geospatial API integrations, Earthmover helps customers in energy, insurance, finance, agriculture, and environmental monitoring develop and gain insights from data and operationalize state-of-the-art AI models for prediction and decision making.

Earthmover’s commitment to open source means that our platform respects your data sovereignty: your data reside in your cloud object storage in an open-source format while our platform layers services on top. Our customers also have access to solutions engineers who are the world’s leading experts in Zarr, Xarray, and cloud-native data systems.

Interested in learning more? Get started with Earthmover today!