Column-major arrays are returned as row-major #3072
Comments
Edit: I think this may not be a bug, but rather a poorly documented change in behavior between Zarr format 2 and format 3, which makes the order of arrays read into memory a runtime concern.
Zarr format ordering: silently dropping this user input is pretty clearly not the correct behavior. I think the issue is that in Zarr format 3 the
However, it's pretty opaque to me how to actually implement that with zarr-python 3. My best guess would be to add:
These tests have some clues about what the intended behavior is: zarr-python/tests/test_codecs/test_transpose.py Lines 14 to 19 in 5710726
So, seemingly, the order on read is set via the global config rather than by how the array was saved on disk? zarr-python/tests/test_codecs/test_transpose.py Lines 44 to 46 in 5710726
Indeed, doing so passes your assert:

    import zarr
    from zarr import config

    group = zarr.open_group("column-major-example.zarr/", mode="r")
    # Set the in-memory order via the global config while opening and reading
    with config.set({"array.order": "F"}):
        array = group["example"]
        assert array[:].flags["F_CONTIGUOUS"]

At a minimum, I think the docs could be improved to specify what a user should expect. Naively I would have expected that if I wrote with
@ianhi - thanks for surfacing the relevant documentation. A minor snippet in the migration guide might be helpful. Perhaps the following:
I'd like to better understand what's happening with F-contiguous arrays under the hood. Barring compression and other codecs, are bytes serialized in the same order as the in-memory layout? If so, we might expect the size of the compressed chunk to be different depending on C- or F-style arrays. The following demonstrates that a random array with everything equivalent, except the order, results in two compressed chunks written to disk with the same size. Does this imply that regardless of an array's memory layout, it's always re-ordered to C-contiguous during serialization?
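As a NumPy-only illustration of the hypothesis in the question above (this is not the original snippet and does not call zarr at all), the sketch below shows that two arrays with different memory layouts yield identical bytes once they are serialized in C order. If the write path always flattens in C order, which is an assumption here and not something confirmed in this thread, identical chunk sizes would follow.

```python
import numpy as np

rng = np.random.default_rng(0)
c_array = rng.random((4, 3))           # C-contiguous (row-major) by default
f_array = np.asfortranarray(c_array)   # same values, column-major layout

# Bytes in each array's own memory layout ("A" follows the array's layout).
c_mem = c_array.tobytes(order="A")
f_mem = f_array.tobytes(order="A")

# Bytes after flattening both arrays in C (row-major) order.
c_logical = c_array.tobytes(order="C")
f_logical = f_array.tobytes(order="C")

print("same values:", np.array_equal(c_array, f_array))
print("same in-memory bytes:", c_mem == f_mem)
print("same C-order bytes:", c_logical == f_logical)
```

If the in-memory bytes were written as-is, the two payloads would differ; once both are flattened in C order they are byte-for-byte identical, so any compressor would produce the same output for both.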
This should at least be giving you a warning - see #2948.
Zarr version
v3.0.8
Numcodecs version
0.16.0
Python Version
3.12.5
Operating System
macOS
Installation
uv pip into virtual environment
Description
I recall using Fortran-style arrays with Zarr v2 with no issue. Since Zarr v3, I've observed that Fortran-style arrays are read back as C-style arrays. If this is in fact a bug, and not an error on my part, I'd be happy to contribute a fix.
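For readers who want to check what they got back, here is a small NumPy-only sketch of inspecting the layout and converting in memory. The helper name `describe_layout` and the stand-in array `data` are hypothetical; `data` represents whatever a read such as `array[:]` returns, and nothing here depends on zarr.

```python
import numpy as np


def describe_layout(a: np.ndarray) -> str:
    """Report whether an array is C-contiguous, F-contiguous, both, or neither."""
    c = bool(a.flags["C_CONTIGUOUS"])
    f = bool(a.flags["F_CONTIGUOUS"])
    if c and f:
        return "both"
    if c:
        return "C"
    if f:
        return "F"
    return "neither"


# Stand-in for data read back from storage as a C-style (row-major) array.
data = np.arange(12, dtype="float64").reshape(3, 4)
print(describe_layout(data))

# Convert to column-major in memory; values and shape are unchanged.
data_f = np.asfortranarray(data)
print(describe_layout(data_f))
```

The conversion copies the data when the layout actually changes, so for large arrays it may be preferable to control the order at read time where the library allows it.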
Steps to reproduce
Additional output
No response