Write chunks with negative zero values and a zero fill value #3216
Oh, oops, thanks 😅 |
this test failure seems significant: https://github.com/zarr-developers/zarr-python/actions/runs/16172926021/job/45650861381?pr=3216#step:8:420 |
Yes, it looks like this approach doesn't work for complex number types |
What if we view the array as raw bytes (should be cheap) and compare the raw bytes?
>>> import numpy as np
>>> np.array([0.0]) == np.array([-0.0])
array([ True])
>>> np.array([0.0]).view('V') == np.array([-0.0]).view('V')
array([False]) |
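The raw-bytes idea above can be sketched as a small helper. This is only an illustration under stated assumptions: the name `all_equal_bitwise` is hypothetical and is not the function used in Zarr-Python, and the sketch views elements as `uint8` rather than `'V'` so that the comparison is an ordinary integer comparison.

```python
import numpy as np


def all_equal_bitwise(chunk: np.ndarray, fill_value) -> bool:
    """Hypothetical helper: True if every element has exactly the fill value's bytes."""
    # view() needs contiguous memory; this copies only if the input is not contiguous.
    chunk = np.ascontiguousarray(chunk)
    # Cast the fill value to the chunk's dtype so both sides have the same byte layout.
    fill = np.asarray(fill_value, dtype=chunk.dtype).reshape(1)
    # One row of raw bytes per element.
    chunk_bytes = chunk.reshape(-1).view(np.uint8).reshape(chunk.size, chunk.itemsize)
    fill_bytes = fill.view(np.uint8)
    return bool((chunk_bytes == fill_bytes).all())


print(all_equal_bitwise(np.zeros(4), 0.0))                       # True
print(all_equal_bitwise(np.array([0.0, -0.0]), 0.0))             # False
print(all_equal_bitwise(np.array([complex(0.0, -0.0)]), 0j))     # False
```

Because the comparison never interprets the bytes as numbers, the same code path handles complex dtypes, where the sign bit can sit in either the real or the imaginary part.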
I wonder if that would somehow break with floating point subnormals and the like. Will have to experiment 🤔 |
Co-authored-by: Davis Bennett <davis.v.bennett@gmail.com>
Took me a bit, but I finally got around to it. Subnormals are fine and behave as expected. The only differences between Python's float equality and bitwise float equality are that signed zeros compare as unequal when comparing their bits, and that NaNs can sometimes compare as equal when comparing their bits. The former is exactly what we want, and the latter won't occur since the code path is triggered only for signed zero fill values.
>>> import numpy as np
>>> np.array(1e-323).view('V') == np.array(0.0).view('V'), 1e-323 == 0.0
(array(False), False)
>>> np.array(1e-324).view('V') == np.array(0.0).view('V'), 1e-324 == 0.0
(array(True), True)
>>> np.array(-1e-323).view('V') == np.array(-0.0).view('V'), -1e-323 == -0.0
(array(False), False)
>>> np.array(-1e-324).view('V') == np.array(-0.0).view('V'), -1e-324 == -0.0
(array(True), True)
>>> np.array(-0.0).view('V') == np.array(0.0).view('V'), 0.0 == -0.0
(array(False), True)
>>> np.inf * 0.0
nan
>>> np.array(np.nan).view('V') == np.array(np.nan).view('V'), np.nan == np.nan
(array(True), False)
>>> np.array(np.inf * 0.0).view('V') == np.array(np.nan).view('V'), np.inf * 0.0 == np.nan
(array(False), False) |
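To see why the results above come out this way, it can help to print the underlying bit patterns. The following sketch uses only the standard library; `float_bits` is a hypothetical helper name introduced for illustration.

```python
import struct


def float_bits(x: float) -> str:
    """Return the IEEE 754 binary64 bit pattern of x as 16 hex digits (big-endian)."""
    return struct.pack(">d", x).hex()


print(float_bits(0.0))     # 0000000000000000
print(float_bits(-0.0))    # 8000000000000000
print(float_bits(1e-323))  # 0000000000000002
print(float_bits(1e-324))  # 0000000000000000
```

The literal `1e-324` is below half of the smallest subnormal, so it is rounded to zero when parsed, which is why it compares equal to `0.0` both numerically and bitwise. `1e-323` is a genuine subnormal with a nonzero bit pattern, and the two zeros differ only in the sign bit.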
This is actually potentially super useful, because the Zarr v3 spec distinguishes between different types of NaNs, even though NumPy does not. In order to ensure that arrays round-trip correctly through Zarr-Python, we need to generate exactly the specific NaN defined in the metadata. I did a quick check and NumPy will preserve the underlying byte representation of different NaNs, so this should be possible.
>>> np.array([b'\x00\x00\x00\x00\x00\x00\xFF\xFF'], dtype='|V8').view('float').view('V')
array([b'\x00\x00\x00\x00\x00\x00\xFF\xFF'], dtype='|V8') |
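A slightly larger sketch of the same check, using integer bit patterns instead of `'V'` views. The three patterns are arbitrary quiet NaNs chosen for illustration and are not taken from the Zarr spec or from this PR; the claim that copies preserve the bits is an assumption that holds for plain memory copies in NumPy on common platforms.

```python
import numpy as np

# Three distinct quiet-NaN bit patterns for float64 (chosen for illustration only).
bits = np.array(
    [0x7FF8000000000000, 0x7FF8000000000001, 0xFFF8000000000000],
    dtype=np.uint64,
)

# Reinterpret the integers as floats: every element is a NaN.
nans = bits.view(np.float64)
print(np.isnan(nans).all())

# Copying and serializing move raw bytes, so the payloads survive.
roundtrip = np.frombuffer(nans.copy().tobytes(), dtype=np.uint64)
print((roundtrip == bits).all())

# Numeric equality cannot tell the payloads apart (NaN != NaN for all of them),
# while comparing the integer views can.
print((nans == nans).any())
print(nans.view(np.uint64)[0] == nans.view(np.uint64)[1])
```

Arithmetic on NaNs is a different matter: the result's payload is not something this sketch makes any claim about.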
Oh, that's curious! Probably not something I can quite incorporate in the code here... unless we make all floating point arrays use bitwise comparison for empty chunks.. 🤔 |
That |
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #3216       +/-   ##
===========================================
- Coverage   94.73%   60.74%   -34.00%
===========================================
  Files          78       78
  Lines        8646     9412      +766
===========================================
- Hits         8191     5717     -2474
- Misses        455     3695     +3240
Fixes #3144.
Using np.any(self._data) was inspired by how Zarr v2 checks for equality with a falsey fill value.
TODO:
docs/user-guide/*.rst
changes/
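For context on the np.any check mentioned in the description, here is a minimal sketch of how truthiness treats signed zeros. It illustrates the general NumPy behavior only and does not reproduce the code in this PR; the variable names are made up for the example.

```python
import numpy as np

chunk = np.array([0.0, -0.0, 0.0])

# Negative zero is falsey, so a truthiness check alone sees an "all zero" chunk.
print(np.any(chunk))               # False

# Numeric equality with a zero fill value agrees: -0.0 == 0.0.
print(np.all(chunk == 0.0))        # True

# The sign bit is the only thing that distinguishes the negative zero.
print(np.any(np.signbit(chunk)))   # True
```

A chunk like this would be treated as equal to a zero fill value by either of the first two checks, and the negative zero would be lost if the chunk were then skipped. Note that np.signbit is defined for real floating point dtypes, not complex ones, so a check built on it would need separate handling for complex arrays.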