Member

@jsignell jsignell commented Dec 12, 2025

This PR makes the handling of chunks="auto" consistent between open_zarr and open_dataset(..., engine="zarr").

The default handling of chunks still differs between open_zarr and open_dataset(..., engine="zarr"): open_zarr defaults to chunks={} with a chunk manager (such as dask) when one is available in your environment, whereas open_dataset defaults to chunks=None (no chunking).
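A minimal sketch of the two defaults described above, not the actual xarray implementation. The helper names `resolve_open_zarr_chunks` and `resolve_open_dataset_chunks` and the `chunk_manager_available` parameter are hypothetical, and checking for an importable dask is an assumed stand-in for however xarray detects a chunk manager. Only the `_default` sentinel name is taken from the diff in this PR.

```python
import importlib.util

# Sentinel meaning "the caller did not pass chunks" (name taken from the diff).
_default = object()


def resolve_open_zarr_chunks(chunks=_default, chunk_manager_available=None):
    """Hypothetical helper: pick the effective chunks value for open_zarr."""
    if chunks is not _default:
        # An explicit value ("auto", {}, None, ...) is passed through unchanged.
        return chunks
    if chunk_manager_available is None:
        # Assumption: an importable dask counts as "a chunk manager is available".
        chunk_manager_available = importlib.util.find_spec("dask") is not None
    # Default: chunks={} when a chunk manager is available, else chunks=None.
    return {} if chunk_manager_available else None


def resolve_open_dataset_chunks(chunks=None):
    """Hypothetical helper: open_dataset defaults to chunks=None (no chunking)."""
    return chunks
```

Under these assumptions, the open_zarr default depends on the environment, while an explicit chunks="auto" reaches the same downstream handling in both entry points.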

@github-actions github-actions bot added topic-backends topic-zarr Related to zarr storage library io labels Dec 12, 2025
@jsignell jsignell self-assigned this Dec 12, 2025
@jsignell jsignell marked this pull request as ready for review December 12, 2025 19:10
@jsignell jsignell requested review from dcherian and slevang December 17, 2025 18:57
@jsignell jsignell requested a review from keewis December 22, 2025 19:24
  from_array_kwargs = {}

- if chunks == "auto":
+ if chunks is _default:
Contributor

Let's issue a DeprecationWarning saying the default will switch to chunks=None to match open_dataset. Users who want the current behaviour with dask et al. should pass chunks={}.
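A rough sketch of what such a warning could look like, assuming the `_default` sentinel from the diff. The helper name `warn_on_default_chunks` and the message wording are hypothetical, and returning `{}` for the default is a simplification that ignores whether a chunk manager is installed.

```python
import warnings

# Sentinel meaning "the caller did not pass chunks" (name taken from the diff).
_default = object()


def warn_on_default_chunks(chunks=_default):
    """Hypothetical helper: warn when the caller relies on the current default."""
    if chunks is _default:
        warnings.warn(
            "The default value of chunks in open_zarr will change to chunks=None "
            "to match open_dataset. Pass chunks={} to keep the current behaviour.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumption: keep the current default until the switch actually happens.
        return {}
    # Explicit values, including None, do not trigger the warning.
    return chunks
```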

Member Author

Oh, I was actually thinking that we not do that part. The issue that we are trying to fix with this PR is really that chunks="auto" means different things in open_zarr and open_dataset(..., engine="zarr"). That was the part that felt deeply surprising to me and @norlandrhagen. As long as we fix that, I don't think we need to change the default value.

Contributor

@dcherian dcherian Dec 24, 2025

👍 to merging this. It'd be nice to have the default always be chunks=None, just like open_dataset. Otherwise we get behaviour that depends on the installed packages, which is just frustrating.

Development

Successfully merging this pull request may close these issues.

Inconsistent chunking between xr.open_zarr and xr.open_dataset(..., engine='zarr') with chunks="auto"
