(fix): use `typesize` on `Blosc` codec #2962

ilan-gold · 2025-04-07T14:52:12Z

Fixes #2766 and fixes #2171
TODO:

Add unit tests and/or doctests in docstrings
Add docstrings and API docs for any new/modified user-facing classes and functions
New/modified features documented in docs/user-guide/*.rst
Changes documented as a new file in changes/
GitHub Actions have all passed
Test coverage is 100% (Codecov passes)

d-v-b

this looks good @ilan-gold, we just need a release note

ilan-gold · 2025-04-07T18:59:23Z

Apologies, I did not mean for you guys to do an immediate review; I will keep that in mind next time. This was mostly to remind myself to finish up :)

d-v-b · 2025-04-07T19:00:32Z

no worries, I was trigger-happy here

dstansby

Is it worth adding a test for this that asserts a certain compressed size, so we can catch regressions in the future? The doctest is a nice way to catch this, but I worry that it might get removed or changed whereas a test is more likely to stay around.
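
For context on why a size assertion would catch this kind of regression: a byte-shuffle filter only helps when it knows the element size. The sketch below is not Blosc. It uses a hand-written shuffle and `zlib` from the standard library as stand-ins, and the helper names (`byte_shuffle`, `compressed_size`) are made up for illustration.

```python
import struct
import zlib


def byte_shuffle(data: bytes, typesize: int) -> bytes:
    """Group byte 0 of every element, then byte 1, and so on.

    With typesize=1 this returns the input unchanged, which is roughly
    what a shuffle filter does when it is not told the real element size.
    """
    if len(data) % typesize:
        raise ValueError("data length must be a multiple of typesize")
    return b"".join(data[i::typesize] for i in range(typesize))


def compressed_size(data: bytes, typesize: int) -> int:
    """Size in bytes after shuffling with `typesize` and zlib-compressing."""
    return len(zlib.compress(byte_shuffle(data, typesize)))


# 10,000 consecutive integers stored as little-endian 8-byte values.
n = 10_000
raw = struct.pack(f"<{n}Q", *range(n))

size_wrong = compressed_size(raw, typesize=1)  # element size ignored
size_right = compressed_size(raw, typesize=8)  # element size respected

print(f"typesize=1: {size_wrong} bytes, typesize=8: {size_right} bytes")
```

A regression test in this spirit would assert on the compressed size, so a silently ignored element size shows up as a much larger chunk.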

ilan-gold · 2025-05-08T12:36:01Z

Does anyone have access to a Windows machine? Or should we just xfail this and move on? I am not sure if the issue is numpy or the Python version interacting with numcodecs here, causing the sizes to be off. We can open an issue if someone with access to a Windows machine can create a repro.

dstansby · 2025-05-08T13:31:48Z

tests/test_codecs/test_blosc.py

+
+
+async def test_typesize() -> None:
+    a = np.arange(1000000)


Suggested change

-    a = np.arange(1000000)
+    a = np.arange(2**16, dtype=np.uint16)

As a thought, is it worth explicitly specifying the data type (and making the data smaller)? I don't know if it will fix the Windows issue, but I think it's worth doing anyway so there's a concrete byte size. Perhaps using an integer data type will also help across Linux/Windows, because they may have different floating point implementations (although that's wild speculation on my part...)

ilan-gold

arange defaults to int64 here, so I'll push something with an explicit 64-bit dtype added.

ilan-gold · 2025-05-08T13:46:32Z

The first few dumped bytes in the failing Windows case are:

\x02\x01\x91\x04@

while they should be

\x02\x01\x91\x08

This would indicate to me the typesize is being incorrectly encoded but I (a) don't know why and (b) don't know what the @ means.
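
One possible way to read those bytes, assuming the 16-byte Blosc 1 chunk header layout documented by c-blosc (version, versionlz, flags, typesize, then three little-endian 32-bit sizes): the fourth byte is the typesize, and `@` is simply how byte `0x40` prints, i.e. the first byte of the uncompressed-size field. The sketch below uses only the standard library. `parse_blosc_header` is a made-up helper name, and the three size values are illustrative assumptions, since only the first few bytes were posted.

```python
import struct
from typing import NamedTuple


class BloscHeader(NamedTuple):
    version: int
    versionlz: int
    flags: int
    typesize: int
    nbytes: int      # uncompressed size
    blocksize: int
    cbytes: int      # compressed size


def parse_blosc_header(chunk: bytes) -> BloscHeader:
    """Decode the first 16 bytes of a Blosc 1 chunk."""
    if len(chunk) < 16:
        raise ValueError("need at least 16 bytes")
    return BloscHeader(*struct.unpack("<BBBBIII", chunk[:16]))


# Hypothetical header whose first five bytes match the failing Windows dump.
# The three sizes are made-up values chosen so nbytes starts with 0x40 ("@").
failing = struct.pack("<BBBBIII", 2, 1, 0x91, 4, 40_000, 40_000, 500)
print(failing[:5])                           # b'\x02\x01\x91\x04@'
print(parse_blosc_header(failing).typesize)  # 4
```

Read this way, the failing dump records a typesize of 4 where 8 was expected, which would point at the input elements being 4 bytes wide rather than at a mis-encoded header.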

dstansby · 2025-05-08T13:51:07Z

> This would indicate to me the typesize is being incorrectly encoded but I (a) don't know why and (b) don't know what the @ means.

Weird... I'm guessing that would be an upstream numcodecs issue/fix, so we could probably cut our losses here and just xfail the test on windows for now.

ilan-gold · 2025-05-08T14:02:05Z

OK @dstansby, great call: it looks like the fix was just being explicit about the dtype, so there must be different default behavior on Windows for that version. There is a warning in the documentation https://numpy.org/doc/stable/reference/generated/numpy.arange.html but I figured we hadn't actually hit any of those conditions.
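
A plausible explanation, stated here as an assumption rather than something confirmed in this thread: NumPy releases before 2.0 used the platform's C `long` as the default integer, which is 4 bytes on 64-bit Windows and 8 bytes on most 64-bit Linux and macOS builds, so `np.arange(n)` without a `dtype` could produce 4-byte elements there. The standard-library sketch below only reports the size of C `long` on the current machine; it does not import NumPy.

```python
import struct

# Native size of a C `long` on this platform ("l" in native mode).
c_long_size = struct.calcsize("l")

# Size of a fixed-width 64-bit integer ("<q" uses standard sizes).
int64_size = struct.calcsize("<q")

print(f"C long: {c_long_size} bytes, int64: {int64_size} bytes")
if c_long_size != int64_size:
    print("A dtype that follows C long would not be 8 bytes wide here.")
```

Passing an explicit `dtype` in the test sidesteps the question, because the element size no longer depends on the platform default.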

rabernat · 2025-05-08T14:58:28Z

Thank you @ilan-gold and @dstansby for working on this bug! I really appreciate your efforts. 🙏

(fix): use typesize on Blosc codec

0c45340

github-actions bot added the "needs release notes" label (automatically applied to PRs which haven't added release notes) Apr 7, 2025

d-v-b requested a review from normanrz April 7, 2025 14:56

normanrz approved these changes Apr 7, 2025

d-v-b requested changes Apr 7, 2025

(chore): relnote

3074bf2

github-actions bot removed the "needs release notes" label (automatically applied to PRs which haven't added release notes) Apr 7, 2025

d-v-b approved these changes Apr 7, 2025

ilan-gold added 2 commits April 7, 2025 20:48

(fix): intersphinx

386f09f

(fix): look at that compression ratio!

e7c5b00

dstansby reviewed Apr 9, 2025

ilan-gold added 2 commits April 9, 2025 16:10

(fix): add test

5e1a593

Merge branch 'main' into ig/typesize_for_blosc

eb6f173

ilan-gold requested a review from dstansby April 9, 2025 14:11

ilan-gold and others added 11 commits April 9, 2025 16:23

(fix): min version

9fe74b8

Merge branch 'ig/typesize_for_blosc' of github.com:ilan-gold/zarr-pyt…

927a2bc

…hon into ig/typesize_for_blosc

(fix): parenthesis?

6f0feca

(fix): try assertion error

25ece76

(fix): windows size

7bf71b4

Merge branch 'main' into ig/typesize_for_blosc

4ac85bb

(fix): add bytes print

3c9c6cc

Merge branch 'ig/typesize_for_blosc' of github.com:ilan-gold/zarr-pyt…

d4daa79

…hon into ig/typesize_for_blosc

Merge branch 'main' into ig/typesize_for_blosc

03f298a

(fix): aghh windows latest is correct, error for non latest

45693bf

Merge branch 'ig/typesize_for_blosc' of github.com:ilan-gold/zarr-pyt…

21950a2

…hon into ig/typesize_for_blosc

(fix): conditions for sizes

0fcc9f0

dstansby reviewed May 8, 2025

(fix): try clearer data

fa7092f

(fix): awesome!

5cfa80f

(fix): pre-commit

e5653f4

dstansby approved these changes May 8, 2025

dstansby enabled auto-merge (squash) May 8, 2025 14:31

Merge branch 'main' into ig/typesize_for_blosc

3a6895e

dstansby merged commit 5ff3fbe into zarr-developers:main May 8, 2025
30 checks passed

ilan-gold deleted the ig/typesize_for_blosc branch May 8, 2025 14:59
