py/formatfloat: Improve accuracy of float formatting code. #17444
Conversation
Code size report:

Codecov Report

Attention: Patch coverage is

```
@@            Coverage Diff             @@
##           master   #17444      +/-   ##
==========================================
- Coverage   98.44%   98.44%   -0.01%
==========================================
  Files         171      171
  Lines       22192    22231      +39
==========================================
+ Hits        21847    21885      +38
- Misses        345      346       +1
==========================================
```
(force-pushed from 6efb905 to ef61c67)
Additional results after a few more experiments in double-precision:
Thanks for this. I will study it in detail. I also found the corresponding CPython issue and long discussion about this topic: python/cpython#45921. There's a lot of useful information there. It looks like this is a hard thing to get right.
I read that thread, it is indeed an interesting input. But I am afraid that we will not be able to use much of it due to resource constraints in MicroPython.
Out of curiosity, I have been investigating a bit more... Half of these remaining errors are caused by

About 20% of the remaining errors are caused by large negative exponents (exp < -255). There is probably something we could improve there as well, but I am not sure it is a big concern for MicroPython if such seldom-used numbers have accuracy problems in

I am also looking at cleaning up the code a bit, to reduce the code footprint. While doing this, one thing I noticed is that the function includes lots of checks on
Yes, that makes a lot of sense.
Related: #6024 has been around for a while and attempts to improve the other side of this, i.e. parsing of floats, using higher-precision ints.
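As a rough sketch of that idea (my own illustration, not the actual code in #6024, and handling only plain decimal literals): accumulate all digits in an arbitrary-precision integer and scale once at the end, so that only the final division rounds.

```python
def parse_float_via_int(s):
    # Hypothetical sketch: build one exact integer mantissa from all the
    # digits, then perform a single correctly-rounded division, instead of
    # doing one lossy float multiply-add per digit.
    int_part, _, frac_part = s.partition('.')
    mantissa = int(int_part + frac_part)
    return mantissa / (10 ** len(frac_part))
```

In CPython, `parse_float_via_int("3.14159")` equals `float("3.14159")` exactly, because true division of exact integers is correctly rounded.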
Thanks, I will look at it as well. I have a big update to submit for this PR, which I believe makes things look much better.
(force-pushed from fb265b8 to 9652e45)
I have just force-pushed my new code.

```python
import array, math, time, binascii

seed = 42

def pseudo_randouble():
    # Generate 8 pseudo-random bytes and reinterpret them as an IEEE double.
    global seed
    ddef = []
    for _ in range(8):
        ddef.append(seed & 0xFF)
        seed = binascii.crc32(b'\0', seed)
    arr = array.array('d', bytes(ddef))
    return ddef, arr[0]

# The largest errors come from seldom-used very small numbers, near the
# limit of the representation. So we keep them out of this test to keep
# the max relative error display useful.
if float('1e-100') == 0:
    # single-precision
    min_expo = -96  # i.e. not smaller than 1.0e-29
    # Expected results:
    # HIGH_QUALITY_REPR=1: 99.71% exact conversions, relative error < 1e-7
    # HIGH_QUALITY_REPR=0: 94.89% exact conversions, relative error < 1e-6
else:
    # double-precision
    min_expo = -845  # i.e. not smaller than 1.0e-254
    # Expected results:
    # HIGH_QUALITY_REPR=1: 99.83% exact conversions, relative error < 2.7e-16
    # HIGH_QUALITY_REPR=0: 64.01% exact conversions, relative error < 1.1e-15

ttime = 0
stats = 0
N = 10000000
max_err = 0
for _ in range(N):
    (ddef, f) = pseudo_randouble()
    while f == 0 or math.isinf(f) or math.isnan(f) or math.frexp(f)[1] <= min_expo:
        (ddef, f) = pseudo_randouble()
    start = time.time_ns()
    str_f = repr(f)
    ttime += time.time_ns() - start
    f2 = float(str_f)
    if f2 == f:
        stats += 1
    else:
        error = abs(f2 - f) / abs(f)
        if max_err < error:
            max_err = error
            print("{:.19e}: repr='{:s}' err={:.4e}".format(f, str_f, error))

print("{:d} values converted in {:d} [ms]".format(N, int(ttime / 1000000)))
print("{:.2%} exact conversions, max relative error={:.2e}".format(stats / N, max_err))
```

It is similar to the one Damien posted before, but this version tests specifically the

This new code brings the error rate to a really low level, which should be acceptable for MicroPython's intended use.
(force-pushed from aba60a3 to adcadc2)
(force-pushed from 779f0e9 to a7c4061)
The unix / nanbox variant originally crashed at the end of the accuracy test, because the randomly generated float numbers can create invalid nanbox objects. I have therefore fixed the nanbox code in
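For background (my own illustrative sketch, not code from this PR): a nanbox object representation encodes non-float objects inside the payload bits of IEEE-754 NaNs, which is why raw random bit patterns can decode as invalid objects rather than floats. The patterns in question are exactly the NaN encodings:

```python
import struct

def is_nan_pattern(raw8):
    # IEEE-754 double: an all-ones exponent with a non-zero fraction is a NaN.
    # A nanbox object model reuses these payloads to encode non-float objects,
    # so a randomly generated 64-bit pattern falling in this range would be
    # (mis)interpreted as an object reference, not a float.
    (u,) = struct.unpack('<Q', raw8)
    return (u >> 52) & 0x7FF == 0x7FF and (u & ((1 << 52) - 1)) != 0
```

A fuzzing loop like the accuracy test above would need to skip or sanitize any generated pattern for which `is_nan_pattern` returns True before treating it as a float object.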
(force-pushed from 1e547ba to 1647452)
py/mpprint.h (outdated)

```diff
-#define PF_FLAG_SEP_POS (9) // must be above all the above PF_FLAGs
+#define PF_FLAG_USE_OPTIMAL_PREC (0x200)
+#define PF_FLAG_ALWAYS_DECIMAL (0x400)
+#define PF_FLAG_SEP_POS (16) // must be above all the above PF_FLAGs
```
Over in the PR that added PF_FLAG_SEP_POS we discussed the position of this flag. There is a port with 16-bit ints. It was possible to position PF_FLAG_SEP_POS at 9 because there were enough bits in an unsigned 16-bit value to squeeze in `_` and `,` as shifted values without widening the type of the flags argument.

It's my mistake that I didn't comment on the requirement here, nor write a check that would trigger during CI. There is an assertion about the situation in objstr.c, but because the pic16bit port is not built during CI (non-free toolchain) the assertion is only checked on platforms with 32-bit ints.
Good point. I saw your (unrelated) PR on mpprint.h yesterday and was wondering about this.

I have just updated the code to move these new flags above the SEP bits. As they are strictly related to the use of floats, which are obviously not enabled in the pic16 port, this should be safe.
(force-pushed from 535a56a to 153cbb5)
While fixing
Heads up: I found a trick to get 100% good conversions, significantly faster than CPython, and the code appears to be smaller than my previous version. Working on it...
(force-pushed from 314df34 to 84a001f)
I have pushed the code with the new REPR_C tolerance. I think this should now work on all platforms. To keep the test duration short, I have left the number of values to test set to 1200. As demonstrated with REPR_C, 1200 steps appear to be sufficient to detect any platform-specific discrepancy.
(force-pushed from 86f9af4 to 11a8e87)
@dpgeorge I have submitted a PR with the trivial fix for

I have intentionally kept the REPR_C crafted-floats issue separate, as more time will be needed to find the most efficient implementation. The test code is already available here: edf5cab
(force-pushed from 11a8e87 to 86fbed2)
I have rebased the PR to master and removed nanbox-related changes, now that they have been integrated in master.
This is looking very good now! My testing on PYBD-SF2 (single prec) and PYBD-SF6 (double prec) shows that everything is good, all tests pass. And around +250 bytes code size is pretty decent for the benefit provided by the approx algorithm. Definitely
(force-pushed from 86fbed2 to 7034fad)
The default settings in
(force-pushed from 9deb316 to b878219)
OK, it looks like we are all set for the next review; the last failing test is a false positive.
One last thing: it would be good to have a build that uses the BASIC mode. I think esp8266 is the ideal candidate for that, because it has little flash available and not much computing resources, and if we have room to spare on that port it's best to use that room for other things. Could you enable BASIC on that port, in

I tested, and with APPROX (the current situation) that port grows by +380 bytes; with BASIC it grows by +132.
Actually, maybe that's easier said than done. Running the tests on esp8266 with BASIC enabled gives 8 failures:

Although, with APPROX there are also 7 failures:

Hmm...
I was able to reproduce the issue with an IMPL_FLOAT 32-bit unix build. The format code properly retrieves the exact

I will try to solve that in the coming days.
Erratum: the parse code works as intended. The problem is caused by REPR_C, which I had activated in my 32-bit test build.
I started a discussion about improving the accuracy of REPR_C (or adding a new non-boxed 32-bit REPR) to avoid this kind of problem. My idea was to allocate more bits to floats at the expense of a lower limit for smallints. However, the feedback was not enthusiastic. See https://github.com/orgs/micropython/discussions/17566
To recap: the new formatting code works as intended. As it looks for the most accurate representation, it actually shows the "lost bits" of REPR_C (which would not have been shown with the old code). However, if the APPROX code took into account the fact that REPR_C has two bits fewer, it would not push the precision that high. So I will update the validation checks in the formatting code to take the lost bits into account when REPR_C is active, and stop refining earlier. That should solve the issue.
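As an illustration of those lost bits (a simplified model I made up; the real REPR_C encoding differs in detail, but likewise sacrifices the two low mantissa bits):

```python
import struct

def drop_low_bits(f, n=2):
    # Simplified model: store f as a 32-bit float whose n lowest mantissa
    # bits are zeroed out (as if reserved for object tagging), then read
    # the float back. The result may differ from f in its last bits.
    bits = struct.unpack('<I', struct.pack('<f', f))[0]
    bits &= ~((1 << n) - 1)
    return struct.unpack('<f', struct.pack('<I', bits))[0]
```

An exact formatter applied to `drop_low_bits(0.1)` prints the tagged value, not 0.1, which is precisely the "lost bits" effect described above.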
This commit extracts from the current float parsing code two functions which could be reused elsewhere in MicroPython. The code used to multiply a float x by a power of 10 is also simplified, by applying the binary exponent separately from the power of 5. This avoids the risk of overflow in the intermediate stage, before multiplying by x.

Signed-off-by: Yoctopuce dev <dev@yoctopuce.com>
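The split described in the commit message can be sketched in Python as follows (`mul_pow10` is a hypothetical name of mine; the real code is C):

```python
import math

def mul_pow10(x, n):
    # Split 10**n into 5**n * 2**n: apply the power of 5 first, then add
    # the binary exponent exactly with ldexp. The intermediate factor 5**n
    # is much smaller than 10**n would be, reducing the overflow risk
    # before x is multiplied in. (ldexp only adjusts the exponent bits,
    # so that step is exact.)
    return math.ldexp(x * 5.0 ** n, n)
```

For example, `mul_pow10(1.5, 3)` computes 1.5 * 125 = 187.5 exactly, then shifts the exponent by 3 to get 1500.0.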
Quick update: still working on it, I am almost done (but it is taking a bit longer as I am out of the office these days...)
(force-pushed from eed35f0 to ef3c624)
Following discussions in PR micropython#16666, this commit updates the float formatting code to improve `repr` reversibility, i.e. the percentage of valid floating point numbers that do parse back to the same number when formatted by `repr`.

This new code offers a choice of 3 float conversion methods, depending on the desired tradeoff between code size and conversion precision:

- The BASIC method has the smallest code footprint.
- The APPROX method uses an iterative method to approximate the exact representation, which is a bit slower but does not have a big impact on code size. It provides `repr` reversibility in >99.8% of the cases in double precision, and in >98.5% in single precision (except with REPR_C, where reversibility is 100% as the last two bits are not taken into account).
- The EXACT method uses higher-precision floats during conversion, which provides perfect results but has a higher impact on code size. It is faster than the APPROX method, and faster than the equivalent CPython implementation. It is however not available on all compilers when using FLOAT_IMPL_DOUBLE.

Here is a table comparing the impact of the three conversion methods on code footprint on PYBV10 (using single-precision floats), and the reversibility rate for both single-precision and double-precision floats. The table includes the current situation as a baseline for the comparison:

              PYBV10   REPR_C    FLOAT   DOUBLE
    current = 364596    12.9%    27.6%    37.9%
    basic   = 364712    85.6%    60.5%    85.7%
    approx  = 364964   100.0%    98.5%    99.8%
    exact   = 366408   100.0%   100.0%   100.0%

Note that when using REPR_C, a few test cases do not pass due to the missing bits in the actual value, which are now properly reflected in the result by the format function.

Signed-off-by: Yoctopuce dev <dev@yoctopuce.com>
(force-pushed from ef3c624 to 85122f3)
I have pushed a new version which fixes most of the issues with

```python
for mant in [34567, 76543, 999999]:
    for exp in range(-16, 16):
        strnum = "%de%d" % (mant, exp)
        print("Next number: %s" % strnum)
        num = float(strnum)
        for mode in ['e', 'f', 'g']:
            maxprec = 16
            # MicroPython has a length limit in objfloat.c
            if mode == 'f' and 6 + exp + maxprec > 31:
                maxprec = 31 - 6 - exp
            for prec in range(0, maxprec):
                fmt = "%." + str(prec) + mode
                print("%5s: " % fmt, fmt % num)
```

With the latest code, even REPR_C outperforms CPython in providing a better representation without "phantom" digits. Here is the code size and format reversibility score, including the current method as a baseline for comparison:
I think there is still some room for improvement in this code, at least for one REPR_C rounding case, and to reduce the code size. But you can already give the new code a try on esp8266 with REPR_BASIC.
Testing the latest code here on esp8266, BASIC mode costs +208 bytes, APPROX mode costs +448 bytes.

With BASIC mode, I get failures on 2 tests:

```
FAILURE micropython/tests/results/float_float_format_ints.py
--- micropython/tests/results/float_float_format_ints.py.exp
+++ micropython/tests/results/float_float_format_ints.py.out
@@ -478,6 +478,6 @@
 23456 x 10^4 with format {:.8e} gives 2.34560000e+08
 23456 x 10^4 with format {:.8f} gives 234560000.00000000
 23456 x 10^4 with format {:.8g} gives 2.3456e+08
-16777215.000000
+16777214.400000
 2147483520.000000
 1.000000e+38

FAILURE micropython/tests/results/float_float_format_accuracy.py
--- micropython/tests/results/float_float_format_accuracy.py.exp
+++ micropython/tests/results/float_float_format_accuracy.py.out
@@ -1,2 +1,2 @@
 1200 values converted
-float format accuracy OK
+FAILED: repr rate=85.583% max_err=5.934e-07
```

With APPROX mode, I get failures on 1 test:

```
FAILURE micropython/tests/results/float_float_format_ints.py
--- micropython/tests/results/float_float_format_ints.py.exp
+++ micropython/tests/results/float_float_format_ints.py.out
@@ -478,6 +478,6 @@
 23456 x 10^4 with format {:.8e} gives 2.34560000e+08
 23456 x 10^4 with format {:.8f} gives 234560000.00000000
 23456 x 10^4 with format {:.8g} gives 2.3456e+08
-16777215.000000
-2147483520.000000
+16777212.000000
+2147483200.000000
 1.000000e+38
```

But everything else passes, which is great!
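A side note on the 16777215 lines in the diffs above (my own illustrative helper, not test code from the PR): 16777215 is 2**24 - 1, the largest odd integer that a single-precision mantissa (24 effective bits) represents exactly, so it is a worst case for a float32 formatter.

```python
import struct

def f32_roundtrip(x):
    # Store x in a single-precision float and read it back, mimicking
    # what a MicroPython build with 32-bit floats actually stores.
    return struct.unpack('<f', struct.pack('<f', x))[0]
```

`f32_roundtrip(16777215.0)` returns the value unchanged, while 16777217.0 (2**24 + 1) comes back rounded to 16777216.0.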
Summary

Following discussions in PR #16666, this pull request updates the float formatting code to reduce the `repr` reversibility error, i.e. the percentage of valid floating point numbers that do not parse back to the same number when formatted by `repr`. The baseline before this commit is an error rate of ~46% when using double-precision floats.
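Reversibility here means the repr-then-parse round trip is exact; as a minimal check (an illustrative helper of mine, not part of the PR):

```python
def repr_roundtrips(f):
    # A float's repr is "reversible" when parsing the string back yields
    # the exact same value; the error rate discussed here is the fraction
    # of floats for which this round trip fails.
    return float(repr(f)) == f
```

Under CPython this holds for every finite float by design (shortest round-trip repr); the PR narrows the gap for MicroPython's much smaller formatting code.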
This new code is available in two flavors, based on a preprocessor definition:
Testing

The new formatting code was tested for reversibility using the code provided by Damien in PR #16666. A variant using formats `{:.7g}`, `{:.8g}` and `{:.9g}` was used for single-precision testing.

Compatibility with CPython on the various float formats was tested by comparing the output using the following code:
The integration tests have also found some corner cases in the new code which have been fixed.
For single-precision floats, some test cases had to be adapted:

- float_format_ints is tapping into an ill-defined partial digit of the mantissa (the 10th), which is not available in single-precision floats with the new code due to integer limitations, so the display range has been updated accordingly.
- float_struct_e uses a 15-digit representation which is meaningless on single-precision floats. A separate version for double-precision has been made instead.
- In float_format_ints, there is one test case specific to single-precision floats which verifies that the largest possible mantissa value, 16777215, can be used to store that exact number and retrieve it as-is. Unfortunately, the rounding in the simplest version of the new algorithm makes it display as a slightly different number. This would cause the CI test to fail on single-precision floats when the improved algorithm is not enabled.

Trade-offs and Alternatives
It is unclear at this point if the simplest version of this improvement is worth the change:

The full version of the enhancement makes much more of a difference in terms of precision, both for double-precision and single-precision floats, but it causes about 20% overhead on conversion time and makes the code a bit bigger.
Looking forward to reading your feedback...
Edit 1: See #17444 (comment) for updates on accuracy results
Edit 2: Updated values in #17444 (comment)