Blog

Articles, announcements, and case studies from the Earthmover team.

From GeoTIFF Chaos to Cloud-Native Climate Risk: How Eoliann Built Airis on Earthmover

April 13, 2026

The Challenge: Eoliann builds proprietary climate risk models that estimate the physical impact of extreme weather events (floods, wildfires, and storms) on critical physical infrastructure: electricity transmission lines, substations, and gas pipelines. Their customers include some of Europe's lar…

Margaret Francis

COO

Announcing Icechunk 2: Better Consistency, Performance, and Reliability for Tensor Storage

April 9, 2026

When we released Icechunk 1.0 last July, we declared it production-ready and committed to format stability. Since then, adoption has exceeded our expectations. Teams across weather forecasting, climate science, neuroscience, and AI/ML have pushed Icechunk into scenarios we didn't fully anticipate--r…

Ryan Abernathey

CEO & Co-founder

Sebastian Galkin

Staff Engineer

Expanding the Earthmover Data Marketplace: Sylvera, Spire, Eagle Rock and Carbonplan

April 7, 2026

When we launched the Earthmover Data Marketplace in January, we described it as just the beginning. Today, we're showing that growth. We're excited to announce four new data partners joining the marketplace: Sylvera, Spire, Eagle Rock Analytics, and CarbonPlan. This second cohort adds proprietary da…

Margaret Francis

COO

Announcing IceChunkCoin: The World's First Multi-Dimensional, Chunk-Native Blockchain Asset

April 1, 2026

Today we’re thrilled to announce IceChunkCoin, the world’s first hyper-dimensional, chunk-native blockchain asset.

Ryan Abernathey

CEO & Co-founder

Cursed venvs, Confident Releases: Testing Icechunk Across Major Versions

March 20, 2026

Earthmover built third-wheel, an open-source tool that rewrites Python wheels to install multiple versions of a library in one environment, enabling cross-version compatibility testing for the Icechunk V2 release.

Ian Hunt-Isaak

Xarray Community Developer

Ditch the Data Pipeline: A Snow Alert Bot in an Afternoon

February 18, 2026

How we built a Slack bot that alerts our team when it's snowing at their location, using Earthmover's Marketplace and Flux APIs to skip the data pipeline entirely.

Ian Hunt-Isaak

Xarray Community Developer

Matt Iannucci

Engineering

Earthmover Selected to Power ARIA’s “Forecasting Tipping Points” Simulation Catalogue

February 9, 2026

Earthmover has been selected by ARIA to provide the Simulation Catalogue for the Forecasting Tipping Points programme, enabling 26 research teams to share and analyze petabyte-scale climate data.

Joe Hamman

CTO & Co-founder

Announcing the Earthmover Data Marketplace

January 22, 2026

Earthmover launches the world's first marketplace for AI-ready weather and climate data, offering instant access to analysis-ready cloud-optimized data cubes from leading providers in the open-source Icechunk format.

Ryan Abernathey

CEO & Co-founder

Evolving our Tensor Storage Engine: A Preview of Icechunk 2

December 2, 2025

A preview of Icechunk 2, featuring node rename, chunk reindexing, rectilinear grids, repository-level metadata, and significant performance improvements with a smooth migration path from Icechunk 1.

Sebastian Galkin

Staff Engineer

I/O-Maxing Tensors in the Cloud

November 25, 2025

Zarr Python with Icechunk or Obstore now fully saturates the network between EC2 and S3, achieving the maximum physically possible throughput for reading and writing tensor data in the cloud. Benchmarks compare Zarr, Tensorstore, TileDB, and Parquet stacks across a range of chunk sizes and instance types.

Ryan Abernathey

CEO & Co-founder

Scientific Data Visualization with Xarray and Napari

October 29, 2025

A roadmap for integrating Xarray and napari to deliver named-dimension-aware, metadata-rich scientific data visualization across biology and geosciences.

Ian Hunt-Isaak

Xarray Community Developer

Building the Future of Scientific Data at the Zarr Summit

October 10, 2025

Earthmover co-organizes the Zarr Summit in Rome, bringing together developers and adopters to advance the open-source cloud-native array format as adoption accelerates across major organizations like ESA, NASA, and NVIDIA.

Ryan Abernathey

CEO & Co-founder

From 10 Minutes to 10 Seconds: How Woods Hole Scientists used Icechunk to Optimize Ocean Data Access

October 8, 2025

Woods Hole scientists reduced ocean profile data access from 10 minutes to 10 seconds by converting their OPeNDAP-served NetCDF files to Icechunk repositories on AWS S3.

Iury Simoes-Sousa

Postdoctoral Investigator, WHOI

Wicked smaht dynamic map tile rendering of Icechunk/Zarr data with xpublish-tiles

September 29, 2025

Earthmover is launching a new open-source library, xpublish-tiles, that powers our new Flux Tiles service. The service lets Earthmover Platform users view their data on a slippy map with dynamically rendered tiles at lower zoom levels than was previously possible.

Deepak Cherian

Forward Deployed Engineer

Matt Iannucci

Engineering

Plotting NYC heatwaves during NYC Climate Week

September 26, 2025

A hands-on walkthrough of calculating historical heatwave frequency over NYC using ERA5 reanalysis data on the Earthmover platform with Arraylake, Icechunk, Xarray, and open-source climate tools.

Tom Nicholas

Software Engineer

Earthmover’s $7.2M Seed Round led by Lowercarbon Capital

September 22, 2025

Earthmover announces its $7.2M seed round led by Lowercarbon Capital, with participation from Costanoa Ventures and Preston-Werner Ventures, to build the cloud-native data platform for weather, climate, and scientific data.

Ryan Abernathey

CEO & Co-founder

The 3 Key Optimizations That Cut the Cost of AI Weather Forecasts by 90%

August 27, 2025

GPUs running AI weather forecasts spend over 95% of their time idle, waiting for data. Three optimizations — pre-processing inputs into Icechunk, moving regridding onto the GPU, and writing outputs in parallel — cut inference costs by nearly 90%.

Ryan Abernathey

CEO & Co-founder

Multi-Player Mode: Why Teams That Use Zarr Need Icechunk

August 25, 2025

Zarr lacks built-in support for concurrent readers and writers, leading to inconsistent reads and conflicting writes in team settings. Icechunk solves this by adding atomic updates, consistent snapshots, and Git-like version control on top of Zarr.

Lindsey Nield

Software Engineer

Radar DataTree: Transforming thousands of scans into a single cohesive model

August 19, 2025

Introducing the Radar DataTree, a new data model that organizes thousands of fragmented weather radar scans into a single time-aware, cloud-native, version-controlled dataset using xarray-datatree, Zarr, and Icechunk.

Alfonso Ladino-Rincon

Data Scientist

Earthmover Sponsors Ocean Hack Week: Empowering the Open Science Community

August 18, 2025

Earthmover is sponsoring Ocean Hack Week 2025, providing financial support for participant travel and an Arraylake organization to empower the open ocean science community.

Ryan Abernathey

CEO & Co-founder

From Files to Datasets: FM-301 and the Future of Radar Interoperability

July 30, 2025

An introduction to the WMO FM-301 standard for weather radar data and how open-source tools like Xradar are turning fragmented binary radar files into structured, analysis-ready datasets.

Alfonso Ladino-Rincon

Data Scientist

Icechunk 1.0: Production-Grade Cloud-Native Array Storage Is Here

July 10, 2025

Icechunk 1.0 is now stable and production-ready, bringing transactional safety, efficient versioning, high-performance Rust-based I/O, and virtual references for HDF5 and NetCDF to cloud-native array storage. The release includes manifest splitting, distributed writes, conflict resolution, and a 30 TB ERA5 sample dataset.

Ryan Abernathey

CEO & Co-founder

Meet the Earthmover Team at SciPy 2025 in Tacoma!

July 2, 2025

The Earthmover team is attending SciPy 2025 in Tacoma, Washington, with a tutorial on Xarray DataTree and Zarr, multiple talks and posters on Icechunk and Xarray, and a booth showcasing the Earthmover Platform.

Joe Hamman

CTO & Co-founder

The Untapped Promise of Weather Radar Data

June 30, 2025

Weather radar captures rich four-dimensional atmospheric data, but legacy binary formats and fragmented archives make large-scale analysis painfully difficult. A modern, cloud-native data model could unlock radar's vast scientific potential.

Alfonso Ladino-Rincon

Data Scientist

Ergonomic seasonal grouping and resampling in Xarray

June 18, 2025

Xarray introduces SeasonGrouper and SeasonResampler, two new Grouper objects that enable custom, overlapping, and variable-length seasonal aggregations without workarounds.

Deepak Cherian

Forward Deployed Engineer

Announcing Fine-Grained Access Controls

June 10, 2025

Arraylake now supports fine-grained, repository-level permissions with admin, write, and read privilege levels, giving teams precise control over data access and secure sharing with external collaborators.

Brian Davis

Software Engineer

Cloud-native platforms are the natural evolution of atmosphere-ocean open-data practice

June 9, 2025

Open-data practice in ocean/atmosphere sciences is approximately 170 years old! While it is easy to exclaim, "weather/climate are global, of course we must share data", the actual story is anything but. That story holds valuable inspiration that we can draw from as we face a significant reduction in US climate science research.

Deepak Cherian

Forward Deployed Engineer

Xarray for Biology

June 6, 2025

Xarray's labeled, multidimensional data structures can solve common pain points in biological data analysis, from tracking microscopy metadata to managing complex genomic datasets. Adoption has been limited by awareness, technical rough edges, and lack of tool integration, but the community is actively working to change that.

Ian Hunt-Isaak

Xarray Community Developer

Everything you need to know about Icechunk garbage collection

May 30, 2025

A practical guide to Icechunk's garbage collection and expiration operations, explaining when and how to safely reclaim storage from unused snapshots and dangling objects.

Sebastian Galkin

Staff Engineer

Fundamentals: What Is Zarr? A Cloud-Native Format for Tensor Data

May 20, 2025

Zarr is an open-source, cloud-native protocol for storing chunked, compressed N-dimensional arrays. This guide covers how Zarr works, its ecosystem of tools like Xarray and Icechunk, and when to use it for large-scale scientific and ML data.

Lindsey Nield

Software Engineer

Icechunk: Efficient storage of versioned array data

May 14, 2025

Icechunk stores versioned array data efficiently by never copying or rewriting existing chunks, so each new version only consumes storage for the data that actually changed. Older versions can be expired and garbage-collected when they are no longer needed.

Sebastian Galkin

Staff Engineer

TensorOps: Scientific Data Doesn't Have to Hurt

May 8, 2025

Scientific data pipelines are plagued by data swamps, duplicated code, fragile workflows, and siloed teams. TensorOps is a vision for modern practices that bring collaboration, velocity, and reliability to scientific data engineering.

Brian Davis

Software Engineer

Zarr takes Cloud-Native Geospatial by storm

May 6, 2025

At the 2025 Cloud-Native Geospatial conference, Zarr adoption was surging across the geospatial domain, with Copernicus Sentinel, USGS Landsat, Google Earth Engine, and ESRI ArcGIS all embracing the format for cloud-optimized array data.

Joe Hamman

CTO & Co-founder

Meet the Earthmover Team at the Cloud Native Geospatial Conference 2025!

April 28, 2025

The Earthmover team is attending the Cloud Native Geospatial Conference 2025 in Snowbird, Utah, leading a hands-on Zarr, Icechunk, and Xarray workshop and presenting talks on cloud-native datacube workflows.

Joe Hamman

CTO & Co-founder

Learning about Icechunk consistency with a clichéd but instructive example

April 23, 2025

A practical walkthrough of how Icechunk uses transactions and conflict detection to guarantee data consistency when multiple processes write concurrently. The post demonstrates optimistic concurrency control and the rebase workflow using a bank-account transfer example.

Sebastian Galkin

Staff Engineer

Fundamentals: What is Cloud-Optimized Scientific Data?

April 17, 2025

Why traditional scientific file formats like NetCDF perform poorly on cloud object storage, and how cloud-optimized formats like Zarr and Icechunk solve the problem by separating metadata and chunking data.

Tom Nicholas

Software Engineer

Announcing Flux: The API Layer for Geospatial Data Delivery

April 15, 2025

Earthmover introduces Flux, a managed API layer that serves geospatial data from Arraylake via standard protocols like WMS, EDR, and OPeNDAP, eliminating the need for teams to build and maintain custom data delivery infrastructure.

Ryan Abernathey

CEO & Co-founder

Exploring Icechunk scalability: untangling S3's prefix story

April 10, 2025

Demystifying how S3 prefix sharding actually works and demonstrating that Icechunk can scale to hundreds of thousands of requests per second, far beyond the single-prefix limit.

Sebastian Galkin

Staff Engineer

Fundamentals: Tensors vs. Tables

April 3, 2025

Multidimensional array data about the physical world is fundamentally incompatible with the tabular data model. Benchmarks show that array-native tools like Xarray and Zarr outperform DuckDB and Parquet by up to 10x for common weather data queries.

Ryan Abernathey

CEO & Co-founder

Solving NASA's Cloud Data Dilemma: How Icechunk Revolutionizes Earth Data Access

March 27, 2025

Earthmover and Development Seed partnered with NASA to pilot Icechunk, an open-source tensor storage engine that enables 100x faster cloud-native data access for archival Earth science datasets without costly data migration.

Ryan Abernathey

CEO & Co-founder

How Our Customers Use NOAA Data

March 17, 2025

Earthmover customers share how NOAA climate and weather data powers their businesses, from wildfire risk modeling and energy trading to carbon market ratings and precipitation enhancement.

Ryan Abernathey

CEO & Co-founder

Accelerating Xarray with Zarr-Python 3

February 20, 2025

zarr-python’s performance paradox: Last month, we released Zarr-Python 3.0, a ground-up rewrite of the library (read more about it in this post). Beyond the exciting new features in Zarr V3, we put a lot of work into addressing some long-standing performance issues with Zarr-Python 2. With the improvements described in this blog post, we’ve achieved a 14x speedup in loading the ARCO ERA5 dataset! Zarr-Python 2 had a paradoxical performance quirk: although the library could generate massive petabyte-scale datasets, it struggled to perform well when managing large or highly nested hierarchies. For example, listing the contents of a large Zarr group could be painfully slow, particularly if that Zarr group was stored on a high-latency storage backend. Zarr users would experience this as long…

Davis Bennet

Software Engineer

Zarr-Python 3 is here!

January 9, 2025

Zarr-Python 3.0 is released with full support for the Zarr V3 specification, chunk-sharding for more flexible storage, major performance improvements from a fully asynchronous core, and a modernized extensible codebase.

Joe Hamman

CTO & Co-founder

Announcing Icechunk!

October 15, 2024

Earthmover announces Icechunk, an open-source transactional storage engine for Zarr that brings ACID transactions, time travel, data versioning, and high-performance Rust-based I/O to multidimensional array data in cloud object storage.

Ryan Abernathey

CEO & Co-founder

Vector data cubes in Xarray

July 29, 2024

Vector data cubes extend the familiar raster data cube concept to geospatial vector data, using arrays indexed by geometries instead of gridded coordinates. The Xvec package brings this capability to Xarray, enabling powerful multidimensional analysis of point, line, and polygon data.

Emma Marshall

Software Engineer

Case Study: ALIVE at The University of Wisconsin-Madison

July 1, 2024

The ALIVE research team at UW-Madison uses Arraylake to manage GOES-R satellite data for near real-time carbon and water flux estimation, benefiting from version control, ACID transactions, and seamless remote collaboration.

Ryan Abernathey

CEO & Co-founder

A Serverless Approach to Building Planetary-Scale EO Datacubes in Zarr

June 5, 2024

A practical guide to building planetary-scale Earth observation datacubes in Zarr using serverless computing, comparing frameworks like Coiled, Modal, and Lithops for massively parallel satellite image processing.

Ryan Abernathey

CEO & Co-founder

Toward Zarr-Python 3.0

May 9, 2024

The Zarr-Python project is undergoing a major refactor toward version 3.0, bringing full support for the Zarr V3 specification, new asynchronous APIs for better performance, and a modernized plugin system for codecs and storage backends.

Joe Hamman

CTO & Co-founder

Case Study: Sylvera

April 12, 2024

Carbon market ratings company Sylvera adopted Arraylake to centralize millions of scattered GeoTIFF files into cloud-optimized arrays, enabling incremental data ingestion and version-tracked auditing across their geospatial pipelines.

Ryan Abernathey

CEO & Co-founder

Cloud native data loaders for machine learning using Zarr and Xarray

March 14, 2024

A practical guide to building a high-performance PyTorch dataloader that streams Zarr data directly from cloud storage using Xarray, Xbatcher, and Dask, achieving a 15x speedup over naive approaches.

Joe Hamman

CTO & Co-founder

Earthmover and Pangeo at AGU 2023

December 5, 2023

Earthmover will be at AGU 2023 in booth 1007 alongside Coiled and Pangeo, demoing Arraylake and presenting three talks on cloud-native scientific data workflows.

Ryan Abernathey

CEO & Co-founder

Arraylake Now Available in Private Beta

October 4, 2023

Earthmover launches Arraylake in private beta, a cloud-native data lake platform purpose-built for multidimensional arrays with a built-in data catalog, ACID transactions, version control, and virtual file support.

Ryan Abernathey

CEO & Co-founder

Earthmover is hiring

October 10, 2022

Earthmover is hiring two founding engineers to help build a modern data stack for science, tackling climate and planetary challenges with cloud-native software.

Joe Hamman

CTO & Co-founder

Why we started Earthmover

October 10, 2022

Earthmover was founded to build a modern cloud data stack for scientific data, inspired by the success of the Pangeo open-source community and the urgent need for better tooling around multidimensional array datasets in climate tech and beyond.

Ryan Abernathey

CEO & Co-founder