Refactor CodecPipeline for flexibility #3051

Open
TomAugspurger opened this issue May 9, 2025 · 0 comments
Labels
bug Potential issues with the zarr-python library

Comments

@TomAugspurger
Contributor

Zarr version

v3

Numcodecs version

na

Python Version

na

Operating System

na

Installation

na

Description

Currently, the CodecPipeline interface passes around Iterable[tuple[...]] arguments, with a different tuple shape for each method. For example, decode takes:

chunks_and_specs: Iterable[tuple[CodecOutput | None, ArraySpec]],

  • decode: Iterable[tuple[CodecOutput | None, ArraySpec]]
  • encode: Iterable[tuple[CodecInput | None, ArraySpec]]
  • read: Iterable[tuple[ByteGetter, ArraySpec, SelectorTuple, SelectorTuple, bool]]
  • write: Iterable[tuple[ByteSetter, ArraySpec, SelectorTuple, SelectorTuple, bool]]

At the moment, we have no way to evolve the interface in a backwards-compatible way: adding an element to any of these tuples breaks every implementation that unpacks them. #2845 noted an accidental API break.
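To illustrate why the tuple shapes are hard to evolve, here is a minimal sketch. The function `count_chunks` is a hypothetical stand-in for a third-party pipeline's decode loop, and the byte strings and string specs are placeholders for the real CodecOutput and ArraySpec values; none of this is zarr-python code.

```python
from typing import Any, Iterable


def count_chunks(chunks_and_specs: Iterable[tuple[Any, Any]]) -> int:
    """Hypothetical consumer written against the current 2-tuple shape."""
    count = 0
    for chunk, spec in chunks_and_specs:  # assumes exactly two elements
        count += 1
    return count


# Works with today's shape: (codec_output, array_spec).
current = [(b"abc", "spec-a"), (None, "spec-b")]
n_current = count_chunks(current)

# If a third element were ever appended, the same unpacking raises ValueError.
extended = [(b"abc", "spec-a", {"extra": True})]
try:
    count_chunks(extended)
    broke = False
except ValueError:
    broke = True
```

Any pipeline implementation that unpacks positionally has this failure mode, which is why a new field cannot simply be added to the tuple.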

One option for gracefully evolving the spec here, which I might need for #2904, is to replace the tuples with dataclasses. We can safely add new optional fields to the dataclass without breaking backwards compatibility.

We can define __len__ and __iter__ on the dataclasses and freeze their return values to the current API.

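from dataclasses import dataclass

@dataclass(frozen=True, eq=True)
class DecodeChunksAndSpecs:
    codec_output: CodecOutput | None
    array_spec: ArraySpec

    # Frozen to the current 2-tuple shape, so existing unpacking keeps working.
    def __len__(self) -> int:
        return 2

    def __iter__(self):
        yield self.codec_output
        yield self.array_spec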

We could also warn when the fields are accessed through iteration or position, to encourage pipeline implementations to migrate to attribute access on the dataclasses.
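A rough sketch of how the warning and a later optional field might fit together, under some assumptions: `Any` stands in for the real CodecOutput and ArraySpec types, the `extra` field is purely hypothetical, and the warning text and the choice of DeprecationWarning are illustrative rather than anything zarr-python has settled on. Positional access would additionally need a `__getitem__`, which is left out here.

```python
import warnings
from dataclasses import dataclass
from typing import Any, Iterator


@dataclass(frozen=True, eq=True)
class DecodeChunksAndSpecs:
    codec_output: Any  # stands in for CodecOutput | None
    array_spec: Any  # stands in for ArraySpec
    # Hypothetical later addition; the default keeps existing callers working.
    extra: Any = None

    def __len__(self) -> int:
        return 2  # frozen to the legacy 2-tuple length

    def __iter__(self) -> Iterator[Any]:
        # Emitted when iteration starts, i.e. on tuple-style unpacking.
        warnings.warn(
            "Unpacking DecodeChunksAndSpecs is deprecated; use attribute access.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Only the legacy fields are yielded; `extra` is attribute-only.
        yield self.codec_output
        yield self.array_spec


item = DecodeChunksAndSpecs(b"abc", "spec", extra="hint")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    chunk, spec = item  # legacy unpacking still works, but warns
```

Existing implementations keep unpacking two values, while migrated ones read `item.codec_output`, `item.array_spec`, and any newer field such as `item.extra` by name.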

Steps to reproduce

na

Additional output

No response

@TomAugspurger TomAugspurger added the bug Potential issues with the zarr-python library label May 9, 2025