Write chunks with negative zero values and a zero fill value #3216

Open: bojidar-bg wants to merge 7 commits into main from 3144-negative-zero

Conversation

bojidar-bg
@bojidar-bg bojidar-bg commented Jul 8, 2025

Fixes #3144.

Using np.any(self._data) was inspired by how Zarr v2 checks for equality with a falsey fill value.
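
For context, the underlying problem can be reproduced in plain NumPy. The sketch below is not code from this PR; the names `chunk` and `fill_value` are illustrative only. It shows that neither value equality nor a truthiness check can tell a chunk of negative zeros apart from a positive-zero fill value, while the sign bit can.

```python
import numpy as np

# Hypothetical chunk that holds only negative zeros, with a +0.0 fill value.
chunk = np.array([-0.0, -0.0])
fill_value = 0.0

# Value equality treats -0.0 and +0.0 as equal, so the chunk looks "empty".
assert np.all(chunk == fill_value)

# A truthiness check has the same blind spot: -0.0 is falsy.
assert not np.any(chunk)

# The sign bit is what actually differs between the two zeros.
assert np.signbit(chunk).all()
assert not np.signbit(fill_value)
```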

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/user-guide/*.rst
  • Changes documented as a new file in changes/
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@github-actions github-actions bot added the needs release notes Automatically applied to PRs which haven't added release notes label Jul 8, 2025
@bojidar-bg bojidar-bg force-pushed the 3144-negative-zero branch from d4c1205 to 2745b68 Compare July 8, 2025 11:48
@github-actions github-actions bot removed the needs release notes Automatically applied to PRs which haven't added release notes label Jul 8, 2025
@bojidar-bg bojidar-bg force-pushed the 3144-negative-zero branch from 2745b68 to 7d6d74b Compare July 8, 2025 11:49
@bojidar-bg
Author

Oh, oops, thanks 😅

@bojidar-bg bojidar-bg force-pushed the 3144-negative-zero branch from 7d6d74b to c4904e3 Compare July 8, 2025 11:53
@d-v-b
Contributor

d-v-b commented Jul 11, 2025

this test failure seems significant: https://github.com/zarr-developers/zarr-python/actions/runs/16172926021/job/45650861381?pr=3216#step:8:420

@dcherian
Contributor

> this test failure seems significant

Yes looks like this approach doesn't work for complex number types

@d-v-b
Contributor

d-v-b commented Jul 11, 2025

what if we view the array as raw bytes (should be cheap) and compare the raw bytes?

>>> import numpy as np
>>> np.array([0.0]) == np.array([-0.0])
array([ True])
>>> np.array([0.0]).view('V') == np.array([-0.0]).view('V')
array([False])
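
A minimal sketch of how the raw-bytes idea could be applied to a whole chunk, assuming CPU NumPy arrays. The helper name `is_all_fill_bitwise` is hypothetical and is not part of zarr-python. It materialises the fill value with `np.full` to keep the comparison shapes identical, which is simpler but less cheap than a broadcasted comparison would be.

```python
import numpy as np

def is_all_fill_bitwise(data: np.ndarray, fill_value) -> bool:
    """Hypothetical helper: True if every element of `data` has exactly
    the same bytes as `fill_value` (so -0.0 does not match +0.0)."""
    # Build an array of the fill value with the same shape and dtype.
    fill = np.full(data.shape, fill_value, dtype=data.dtype)
    # View both as opaque bytes; "V" takes the itemsize of the source dtype.
    return bool(np.all(data.view("V") == fill.view("V")))

# Positive zeros match a +0.0 fill value.
assert is_all_fill_bitwise(np.zeros(4), 0.0)
# A single negative zero is enough to make the chunk differ.
assert not is_all_fill_bitwise(np.array([0.0, -0.0]), 0.0)
# Negative zeros match a -0.0 fill value.
assert is_all_fill_bitwise(np.array([-0.0, -0.0]), -0.0)
# The same byte-level view also covers complex values.
assert not is_all_fill_bitwise(np.array([complex(0.0, -0.0)]), 0j)
```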

@bojidar-bg
Author

I wonder if that would somehow break with floating-point subnormals and the like. Will have to experiment 🤔

@dstansby dstansby added this to the 3.1.2 milestone Jul 31, 2025
Co-authored-by: Davis Bennett <davis.v.bennett@gmail.com>
@bojidar-bg
Author

Took me a bit, but I finally got around to it. Subnormals are fine and behave as expected. The only differences between Python's float equality and bitwise float equality are that signed zeros compare as unequal when comparing their bits, and that NaNs can sometimes compare as equal when comparing their bits. The former is exactly what we want, and the latter won't occur since the code path is triggered only for signed-zero fill values.

>>> import numpy as np
>>> np.array(1e-323).view('V') == np.array(0.0).view('V'), 1e-323 == 0.0
(array(False), False)
>>> np.array(1e-324).view('V') == np.array(0.0).view('V'), 1e-324 == 0.0
(array(True), True)
>>> np.array(-1e-323).view('V') == np.array(-0.0).view('V'), -1e-323 == -0.0
(array(False), False)
>>> np.array(-1e-324).view('V') == np.array(-0.0).view('V'), -1e-324 == -0.0
(array(True), True)
>>> np.array(-0.0).view('V') == np.array(0.0).view('V'), 0.0 == -0.0
(array(False), True)
>>> np.inf * 0.0
nan
>>> np.array(np.nan).view('V') == np.array(np.nan).view('V'), np.nan == np.nan
(array(True), False)
>>> np.array(np.inf * 0.0).view('V') == np.array(np.nan).view('V'), np.inf * 0.0 == np.nan
(array(False), False)
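
The bit patterns behind the session above can be inspected directly. This is a small illustrative sketch; `float64_bits` is a hypothetical helper, not part of NumPy or zarr-python. One detail worth noting: the literal `1e-324` is smaller than half of the smallest subnormal, so Python parses it as `0.0`, which is why it compares equal to zero both by value and by bits.

```python
import numpy as np

def float64_bits(x: float) -> int:
    """Return the IEEE 754 binary64 bit pattern of `x` as a Python int."""
    return int(np.array(x, dtype=np.float64).view(np.uint64))

# The two zeros differ only in the sign bit.
assert float64_bits(0.0) == 0x0000_0000_0000_0000
assert float64_bits(-0.0) == 0x8000_0000_0000_0000

# Subnormals have non-zero mantissa bits, so they never match a zero.
assert float64_bits(5e-324) == 1   # smallest positive subnormal
assert float64_bits(1e-323) == 2

# 1e-324 underflows to zero at parse time; negating it gives -0.0.
assert float64_bits(1e-324) == float64_bits(0.0)
assert float64_bits(-1e-324) == float64_bits(-0.0)
```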

@d-v-b
Contributor

d-v-b commented Aug 1, 2025

> nan numbers can sometimes compare as equal when comparing their bits

This is actually potentially super useful, because the zarr v3 spec distinguishes between different types of nans, even though numpy does not. In order to ensure that arrays round-trip correctly through zarr python, we need to generate exactly the specific nan defined in the metadata. I did a quick check and numpy will preserve the underlying byte representation of different nans, so this should be possible.

>>> np.array([b'\x00\x00\x00\x00\x00\x00\xFF\xFF'], dtype='|V8').view('float').view('V')
array([b'\x00\x00\x00\x00\x00\x00\xFF\xFF'], dtype='|V8')
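
A short sketch of the same observation using integer bit patterns instead of byte strings. The helper `nan_from_bits` and the two example payloads are assumptions chosen for illustration; they are not values taken from the Zarr v3 spec. Only quiet NaNs are used here.

```python
import numpy as np

def nan_from_bits(bits: int) -> np.ndarray:
    """Hypothetical helper: one-element float64 array with the given bits."""
    return np.array([bits], dtype=np.uint64).view(np.float64)

plain_nan = nan_from_bits(0x7FF8_0000_0000_0000)    # quiet NaN, empty payload
payload_nan = nan_from_bits(0x7FF8_0000_0000_0001)  # quiet NaN, payload of 1

# Both are NaN by value, and value equality cannot tell them apart from
# anything: NaN != NaN.
assert np.isnan(plain_nan).all() and np.isnan(payload_nan).all()
assert not (payload_nan == payload_nan).any()

# Their bit patterns differ, and a copy keeps the payload intact.
assert plain_nan.view(np.uint64)[0] != payload_nan.view(np.uint64)[0]
assert payload_nan.copy().view(np.uint64)[0] == 0x7FF8_0000_0000_0001
```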

@bojidar-bg
Author

Oh, that's curious! Probably not something I can quite incorporate in the code here... unless we make all floating point arrays use bitwise comparison for empty chunks.. 🤔

@bojidar-bg
Author

bojidar-bg commented Aug 1, 2025

That .view("V") trick fails on GPU with a ZeroDivisionError, presumably in cupy:_core/core.pyx:81 when v_is == dtype.itemsize == 0, as is the case for the "V" dtype...
Using structured np.void dtypes to achieve the same idea doesn't work because np.void isn't hashable, but it works when using a sized "V" dtype, like "V16". Let's see if that's enough to get all tests green 😂 YES IT IS! 🎉 That GPU test was stubborn! 😂😂
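
A minimal sketch of the sized "V" dtype idea, shown with NumPy only; whether the same call behaves identically under CuPy is not verified here. The helper name `sized_void_view` is hypothetical and this is not the code merged in the PR.

```python
import numpy as np

def sized_void_view(arr: np.ndarray) -> np.ndarray:
    """Hypothetical helper: view `arr` as opaque fixed-width bytes,
    e.g. float64 -> "V8" and complex128 -> "V16"."""
    return arr.view(np.dtype(f"V{arr.dtype.itemsize}"))

# The unsized "V" dtype reports an itemsize of 0, which matches the
# divide-by-zero explanation given above.
assert np.dtype("V").itemsize == 0

# A sized void dtype always carries the real element width.
assert sized_void_view(np.array([0.0])).dtype == np.dtype("V8")
assert sized_void_view(np.array([0j])).dtype == np.dtype("V16")

# The comparison still separates the two zeros.
pos = sized_void_view(np.array([0.0]))
neg = sized_void_view(np.array([-0.0]))
assert not (pos == neg).any()
```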

codecov bot commented Aug 1, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.74%. Comparing base (378d5af) to head (dba8b0b).
⚠️ Report is 49 commits behind head on main.

❗ There is a different number of reports uploaded between BASE (378d5af) and HEAD (dba8b0b). Click for more details.

HEAD has 1 upload less than BASE
Flag    BASE (378d5af)    HEAD (dba8b0b)
        11                10
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #3216       +/-   ##
===========================================
- Coverage   94.73%   60.74%   -34.00%     
===========================================
  Files          78       78               
  Lines        8646     9412      +766     
===========================================
- Hits         8191     5717     -2474     
- Misses        455     3695     +3240     
Files with missing lines Coverage Δ
src/zarr/core/buffer/core.py 30.98% <100.00%> (-51.59%) ⬇️

... and 69 files with indirect coverage changes


Development

Successfully merging this pull request may close these issues.

Negative zero not preserved
4 participants