While investigating possible HDF storage scenarios for scalar values from #523, I discovered that HDF5 also supports a "compact" storage layout, where very small datasets or values (<64KB) are stored inline in the dataset's object header rather than in a separate data block: https://support.hdfgroup.org/documentation/hdf5/latest/_l_b_dset_layout.html. The HDF5 library has no support for inferring the offset and size of datasets stored using the compact layout, so we have no way of creating a ChunkManifest for them and should raise an unsupported exception.
Create a test fixture with a scalar stored in the compact storage layout using the low-level h5py.h5d API.
Update HDFVirtualBackend to check the dataset's storage layout.
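A minimal sketch of both tasks, assuming h5py is available. The helper names `write_compact_scalar` and `is_compact` are hypothetical and not part of VirtualiZarr; the scalar is created through the low-level `h5py.h5d` API because the layout is set on the dataset creation property list.

```python
import h5py
from h5py import h5d, h5p, h5s, h5t


def write_compact_scalar(path, name="scalar", value=42):
    """Write an int32 scalar dataset using the compact storage layout.

    Hypothetical fixture helper; could be wrapped in a pytest fixture.
    """
    with h5py.File(path, "w") as f:
        # The layout is chosen on the dataset creation property list.
        dcpl = h5p.create(h5p.DATASET_CREATE)
        dcpl.set_layout(h5d.COMPACT)
        space = h5s.create(h5s.SCALAR)
        dset_id = h5d.create(
            f.id, name.encode(), h5t.NATIVE_INT32, space, dcpl=dcpl
        )
        # Wrap the low-level id in a high-level Dataset to write the value.
        h5py.Dataset(dset_id)[()] = value


def is_compact(dset):
    """Return True if an h5py.Dataset uses the compact storage layout."""
    return dset.id.get_create_plist().get_layout() == h5d.COMPACT
```

A backend could call something like `is_compact` on each dataset before attempting to build a manifest, and treat a `True` result as unsupported.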
Hmm, this is potentially an issue with the whole "readers as creators of ManifestStores" idea. We can put this inlined data into a virtual dataset and into Icechunk, it just can't be a virtual variable. (Or at least the HDF library won't help us if we want to make that virtual variable.)
If reader implementations had the ability to say "nah actually you're getting this variable in memory" then we could deal with this situation gracefully.
A compromise might be to have the error message suggest explicitly loading that particular variable.
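One way that compromise could look, as a rough sketch only. The function name `raise_if_compact` and the message wording are hypothetical, and the message assumes that passing the variable via `loadable_variables` is the intended way to load it into memory.

```python
import h5py
from h5py import h5d


def raise_if_compact(dset):
    """Raise a helpful error for datasets stored with the compact layout.

    Hypothetical helper: not the actual HDFVirtualBackend implementation.
    """
    layout = dset.id.get_create_plist().get_layout()
    if layout == h5d.COMPACT:
        raise NotImplementedError(
            f"Dataset {dset.name!r} uses the HDF5 compact storage layout, "
            "so no byte range can be determined for a virtual reference. "
            "Consider loading this variable into memory instead, for "
            "example by listing it in `loadable_variables`."
        )
```

Datasets with contiguous or chunked layouts pass through without error, so the check can sit in front of the existing manifest-building path.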
@TomNicholas I think including the suggestion to load the problematic variables is probably the way to go 👍. I'm hopeful that this will be a fairly infrequent case.
BTW a similar thing would happen if you try to load kerchunk references that are inlined into the kerchunk reference file. But in that case it's easier to generate a reference to the data (which lives in the kerchunk json file).