py/formatfloat: Improve accuracy of float formatting code. #17444
Conversation
Code size report:

Codecov Report

Attention: Patch coverage is

```
@@            Coverage Diff             @@
##           master   #17444      +/-   ##
==========================================
- Coverage   98.44%   98.44%   -0.01%
==========================================
  Files         171      171
  Lines       22192    22231      +39
==========================================
+ Hits        21847    21885      +38
- Misses        345      346       +1
==========================================
```
(force-pushed from 6efb905 to ef61c67)
Additional results after a few more experiments in double-precision:
Thanks for this. I will study it in detail. I also found the corresponding CPython issue and long discussion about this topic: python/cpython#45921. There's a lot of useful information there. It looks like this is a hard thing to get right.
I read that thread, it is indeed an interesting input. But I am afraid that we will not be able to use much of it due to resource constraints in MicroPython.
Out of curiosity, I have been investigating a bit more... Half of these remaining errors are caused by

About 20% of the remaining errors are caused by large negative exponents (exp < -255). There is probably something we could improve there as well, but I am not sure it is a big concern for MicroPython if such seldom-used numbers have accuracy problems in

I am also looking at cleaning up the code a bit, to reduce the code footprint. While doing this, one thing I noticed is that the function includes lots of checks on
Yes, that makes a lot of sense.
Related: #6024 has been around for a while and attempts to improve the other side of this, i.e. parsing of floats, using higher-precision ints.
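As a rough sketch of that idea (my own illustration, not the actual code in #6024, and handling only plain decimal literals): accumulate all digits in an arbitrary-precision integer and scale once at the end, so that only the final division rounds.

```python
def parse_float_via_int(s):
    # Hypothetical sketch: build one exact integer mantissa from all the
    # digits, then perform a single correctly-rounded division, instead of
    # doing one lossy float multiply-add per digit.
    int_part, _, frac_part = s.partition('.')
    mantissa = int(int_part + frac_part)
    return mantissa / (10 ** len(frac_part))
```

In CPython, `parse_float_via_int("3.14159")` equals `float("3.14159")` exactly, because true division of exact integers is correctly rounded.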
Thanks, I will look at it as well. I have a big update to submit for this PR, which I believe makes things look much better.
(force-pushed from fb265b8 to 9652e45)
I have just force-pushed my new code.

```python
import array, math, time, binascii

seed = 42

def pseudo_randouble():
    # Generate 8 pseudo-random bytes and reinterpret them as an IEEE double.
    global seed
    ddef = []
    for _ in range(8):
        ddef.append(seed & 0xFF)
        seed = binascii.crc32(b'\0', seed)
    arr = array.array('d', bytes(ddef))
    return ddef, arr[0]

# The largest errors come from seldom-used very small numbers, near the
# limit of the representation. So we keep them out of this test to keep
# the max relative error display useful.
if float('1e-100') == 0:
    # single-precision
    min_expo = -96  # i.e. not smaller than 1.0e-29
    # Expected results:
    # HIGH_QUALITY_REPR=1: 99.71% exact conversions, relative error < 1e-7
    # HIGH_QUALITY_REPR=0: 94.89% exact conversions, relative error < 1e-6
else:
    # double-precision
    min_expo = -845  # i.e. not smaller than 1.0e-254
    # Expected results:
    # HIGH_QUALITY_REPR=1: 99.83% exact conversions, relative error < 2.7e-16
    # HIGH_QUALITY_REPR=0: 64.01% exact conversions, relative error < 1.1e-15

ttime = 0
stats = 0
N = 10000000
max_err = 0
for _ in range(N):
    (ddef, f) = pseudo_randouble()
    while f == 0 or math.isinf(f) or math.isnan(f) or math.frexp(f)[1] <= min_expo:
        (ddef, f) = pseudo_randouble()
    start = time.time_ns()
    str_f = repr(f)
    ttime += time.time_ns() - start
    f2 = float(str_f)
    if f2 == f:
        stats += 1
    else:
        error = abs(f2 - f) / abs(f)
        if max_err < error:
            max_err = error
            print("{:.19e}: repr='{:s}' err={:.4e}".format(f, str_f, error))

print("{:d} values converted in {:d} [ms]".format(N, int(ttime / 1000000)))
print("{:.2%} exact conversions, max relative error={:.2e}".format(stats / N, max_err))
```

It is similar to the one Damien posted before, but this version tests specifically the

This new code brings the error rate to a really low level, which should be acceptable for MicroPython's intended use.
(force-pushed from aba60a3 to adcadc2)
(force-pushed from 779f0e9 to a7c4061)
The unix / nanbox variant originally crashed at the end of the accuracy test, because the randomly generated float numbers can create invalid nanbox objects. I have therefore fixed the nanbox code in
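For background (my own illustrative sketch, not code from this PR): a nanbox object representation encodes non-float objects inside the payload bits of IEEE-754 NaNs, which is why raw random bit patterns can decode as invalid objects rather than floats. The patterns in question are exactly the NaN encodings:

```python
import struct

def is_nan_pattern(raw8):
    # IEEE-754 double: an all-ones exponent with a non-zero fraction is a NaN.
    # A nanbox object model reuses these payloads to encode non-float objects,
    # so a randomly generated 64-bit pattern falling in this range would be
    # (mis)interpreted as an object reference, not a float.
    (u,) = struct.unpack('<Q', raw8)
    return (u >> 52) & 0x7FF == 0x7FF and (u & ((1 << 52) - 1)) != 0
```

A fuzzing loop like the accuracy test above would need to skip or sanitize any generated pattern for which `is_nan_pattern` returns True before treating it as a float object.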
(force-pushed from 1e547ba to 1647452)
py/mpprint.h (outdated)

```diff
-#define PF_FLAG_SEP_POS (9) // must be above all the above PF_FLAGs
+#define PF_FLAG_USE_OPTIMAL_PREC (0x200)
+#define PF_FLAG_ALWAYS_DECIMAL (0x400)
+#define PF_FLAG_SEP_POS (16) // must be above all the above PF_FLAGs
```
Over in the PR that added PF_FLAG_SEP_POS we discussed the position of this flag. There is a port with 16-bit ints. It was possible to position PF_FLAG_SEP_POS at 9 because there were enough bits in an unsigned 16-bit value to squeeze in `_` and `,` as shifted values without widening the type of the flags argument.

It's my mistake that I didn't comment on the requirement here, nor write a check that would trigger during CI. There is an assertion about the situation in objstr.c, but because the pic16bit port is not built during CI (non-free toolchain) the assertion is only checked on platforms with 32-bit ints.
Good point. I saw your (unrelated) PR on mpprint.h yesterday and was wondering about this.

I have just updated the code to move these new flags above the SEP bits. As they are strictly related to the use of floats, which are obviously not enabled in the pic16 port, this should be safe.
(force-pushed from 535a56a to 153cbb5)
While fixing
Heads up: I found a trick to get 100% good conversions, significantly faster than CPython, and the code appears to be smaller than my previous version. Working on it...
(force-pushed from 314df34 to 84a001f)
I have pushed the code with the new REPR_C tolerance. I think this should now work on all platforms. To keep the test duration short, I have left the number of values to test set to 1200. As demonstrated with REPR_C, 1200 steps appear to be sufficient to detect any platform-specific discrepancy.
(force-pushed from 86f9af4 to 11a8e87)
@dpgeorge I have submitted a PR with the trivial fix for

I have intentionally kept the REPR_C crafted-floats issue separate, as more time will be needed to find the most efficient implementation. The test code is already available here: edf5cab
(force-pushed from 11a8e87 to 86fbed2)
I have rebased the PR to master and removed nanbox-related changes, now that they have been integrated in master.
This is looking very good now! My testing on PYBD-SF2 (single prec) and PYBD-SF6 (double prec) shows that everything is good, all tests pass. And around +250 bytes code size is pretty decent for the benefit provided by the approx algorithm. Definitely
(force-pushed from 86fbed2 to 7034fad)
The default settings in
(force-pushed from 9deb316 to b878219)
OK, it looks like we are all set for the next review; the last failing test is a false positive.
One last thing: it would be good to have a build that uses the BASIC mode. I think esp8266 is the ideal candidate for that, because it has little flash available and not much computing resources, and if we have room to spare on that port it's best to use that room for other things. Could you enable BASIC on that port, in

I tested, and with APPROX (the current situation) that port grows by +380 bytes; with BASIC it grows by +132.
Actually, maybe that's easier said than done. Running the tests on esp8266 with BASIC enabled gives 8 failures:

Although, with APPROX there are also 7 failures:

Hmm...
I was able to reproduce the issue with an IMPL_FLOAT 32-bit unix build. The format code properly retrieves the exact

I will try to solve that in the coming days.
Erratum: the parse code works as intended. The problem is caused by REPR_C, which I had activated in my 32-bit test build.
I started a discussion about improving the accuracy of REPR_C (or adding a new non-boxed 32-bit REPR) to avoid this kind of problem. My idea was to allocate more bits to floats at the expense of a lower limit for smallints. However, the feedback was not enthusiastic. See https://github.com/orgs/micropython/discussions/17566
To recap: the new formatting code works as intended. As it looks for the most accurate representation, it actually shows the "lost bits" of REPR_C (which would not have been shown with the old code). However, if the APPROX code took into account the fact that REPR_C has two bits fewer, it would not push the precision that high. So I will update the validation checks in the formatting code to take the lost bits into account when REPR_C is active, and stop refining earlier. That should solve the issue.
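As an illustration of those lost bits (a simplified model I made up; the real REPR_C encoding differs in detail, but likewise sacrifices the two low mantissa bits):

```python
import struct

def drop_low_bits(f, n=2):
    # Simplified model: store f as a 32-bit float whose n lowest mantissa
    # bits are zeroed out (as if reserved for object tagging), then read
    # the float back. The result may differ from f in its last bits.
    bits = struct.unpack('<I', struct.pack('<f', f))[0]
    bits &= ~((1 << n) - 1)
    return struct.unpack('<f', struct.pack('<I', bits))[0]
```

An exact formatter applied to `drop_low_bits(0.1)` prints the tagged value, not 0.1, which is precisely the "lost bits" effect described above.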
This commit extracts from the current float parsing code two functions which could be reused elsewhere in MicroPython. The code used to multiply a float x by a power of 10 is also simplified, by applying the binary exponent separately from the power of 5. This avoids the risk of overflow in the intermediate stage, before multiplying by x.

Signed-off-by: Yoctopuce dev <dev@yoctopuce.com>
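The split described in the commit message can be sketched in Python as follows (`mul_pow10` is a hypothetical name of mine; the real code is C):

```python
import math

def mul_pow10(x, n):
    # Split 10**n into 5**n * 2**n: apply the power of 5 first, then add
    # the binary exponent exactly with ldexp. The intermediate factor 5**n
    # is much smaller than 10**n would be, reducing the overflow risk
    # before x is multiplied in. (ldexp only adjusts the exponent bits,
    # so that step is exact.)
    return math.ldexp(x * 5.0 ** n, n)
```

For example, `mul_pow10(1.5, 3)` computes 1.5 * 125 = 187.5 exactly, then shifts the exponent by 3 to get 1500.0.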
Quick update: still working on it, I am almost done (but it is taking a bit longer as I am out of the office these days...)
(force-pushed from eed35f0 to ef3c624)
Following discussions in PR micropython#16666, this commit updates the float formatting code to improve `repr` reversibility, i.e. the percentage of valid floating point numbers that do parse back to the same number when formatted by `repr`.

This new code offers a choice of 3 float conversion methods, depending on the desired tradeoff between code size and conversion precision:

- The BASIC method has the smallest code footprint.
- The APPROX method uses an iterative method to approximate the exact representation, which is a bit slower but does not have a big impact on code size. It provides `repr` reversibility in >99.8% of the cases in double precision, and in >98.5% in single precision (except with REPR_C, where reversibility is 100% as the last two bits are not taken into account).
- The EXACT method uses higher-precision floats during conversion, which provides perfect results but has a higher impact on code size. It is faster than the APPROX method, and faster than the equivalent CPython implementation. It is however not available on all compilers when using FLOAT_IMPL_DOUBLE.

Here is a table comparing the impact of the three conversion methods on code footprint on PYBV10 (using single-precision floats), and the reversibility rate for both single-precision and double-precision floats. The table includes the current situation as a baseline for the comparison:

              PYBV10   REPR_C    FLOAT   DOUBLE
    current = 364596    12.9%    27.6%    37.9%
    basic   = 364712    85.6%    60.5%    85.7%
    approx  = 364964   100.0%    98.5%    99.8%
    exact   = 366408   100.0%   100.0%   100.0%

Note that when using REPR_C, a few test cases do not pass due to the missing bits in the actual value, which are now properly reflected in the result by the format function.

Signed-off-by: Yoctopuce dev <dev@yoctopuce.com>
(force-pushed from ef3c624 to 85122f3)
I have pushed a new version which fixes most of the issues with

```python
for mant in [34567, 76543, 999999]:
    for exp in range(-16, 16):
        strnum = "%de%d" % (mant, exp)
        print("Next number: %s" % strnum)
        num = float(strnum)
        for mode in ['e', 'f', 'g']:
            maxprec = 16
            # MicroPython has a length limit in objfloat.c
            if mode == 'f' and 6 + exp + maxprec > 31:
                maxprec = 31 - 6 - exp
            for prec in range(0, maxprec):
                fmt = "%." + str(prec) + mode
                print("%5s: " % fmt, fmt % num)
```

With the latest code, even REPR_C outperforms CPython in providing a better representation without "phantom" digits. Here is the code size and format reversibility score, including the current method as a baseline for comparison:
I think there is still some room for improvement in this code, at least for one REPR_C rounding case, and to reduce the code size. But you can already give the new code a try on esp8266 with REPR_BASIC.
Testing the latest code here on esp8266, BASIC mode costs +208 bytes, APPROX mode costs +448 bytes.

With BASIC mode, I get failures on 2 tests:

```
FAILURE micropython/tests/results/float_float_format_ints.py
--- micropython/tests/results/float_float_format_ints.py.exp
+++ micropython/tests/results/float_float_format_ints.py.out
@@ -478,6 +478,6 @@
 23456 x 10^4 with format {:.8e} gives 2.34560000e+08
 23456 x 10^4 with format {:.8f} gives 234560000.00000000
 23456 x 10^4 with format {:.8g} gives 2.3456e+08
-16777215.000000
+16777214.400000
 2147483520.000000
 1.000000e+38

FAILURE micropython/tests/results/float_float_format_accuracy.py
--- micropython/tests/results/float_float_format_accuracy.py.exp
+++ micropython/tests/results/float_float_format_accuracy.py.out
@@ -1,2 +1,2 @@
 1200 values converted
-float format accuracy OK
+FAILED: repr rate=85.583% max_err=5.934e-07
```

With APPROX mode, I get failures on 1 test:

```
FAILURE micropython/tests/results/float_float_format_ints.py
--- micropython/tests/results/float_float_format_ints.py.exp
+++ micropython/tests/results/float_float_format_ints.py.out
@@ -478,6 +478,6 @@
 23456 x 10^4 with format {:.8e} gives 2.34560000e+08
 23456 x 10^4 with format {:.8f} gives 234560000.00000000
 23456 x 10^4 with format {:.8g} gives 2.3456e+08
-16777215.000000
-2147483520.000000
+16777212.000000
+2147483200.000000
 1.000000e+38
```

But everything else passes, which is great!
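A side note on the 16777215 lines in the diffs above (my own illustrative helper, not test code from the PR): 16777215 is 2**24 - 1, the largest odd integer that a single-precision mantissa (24 effective bits) represents exactly, so it is a worst case for a float32 formatter.

```python
import struct

def f32_roundtrip(x):
    # Store x in a single-precision float and read it back, mimicking
    # what a MicroPython build with 32-bit floats actually stores.
    return struct.unpack('<f', struct.pack('<f', x))[0]
```

`f32_roundtrip(16777215.0)` returns the value unchanged, while 16777217.0 (2**24 + 1) comes back rounded to 16777216.0.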
Summary

Following discussions in PR #16666, this pull request updates the float formatting code to reduce the `repr` reversibility error, i.e. the percentage of valid floating point numbers that do not parse back to the same number when formatted by `repr`. The baseline before this commit is an error rate of ~46% when using double-precision floats.
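Reversibility here means the repr-then-parse round trip is exact; as a minimal check (an illustrative helper of mine, not part of the PR):

```python
def repr_roundtrips(f):
    # A float's repr is "reversible" when parsing the string back yields
    # the exact same value; the error rate discussed here is the fraction
    # of floats for which this round trip fails.
    return float(repr(f)) == f
```

Under CPython this holds for every finite float by design (shortest round-trip repr); the PR narrows the gap for MicroPython's much smaller formatting code.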
This new code is available in two flavors, based on a preprocessor definition:
Testing

The new formatting code was tested for reversibility using the code provided by Damien in PR #16666. A variant using formats `{:.7g}`, `{:.8g}` and `{:.9g}` was used for single-precision testing.

Compatibility with CPython on the various float formats was tested by comparing the output using the following code:
The integration tests have also found some corner cases in the new code which have been fixed.
For single-precision floats, some test cases had to be adapted:

- float_format_ints is tapping into an ill-defined partial digit of the mantissa (the 10th), which is not available in single-precision floats with the new code due to integer limitations, so the display range has been updated accordingly.
- float_struct_e uses a 15-digit representation which is meaningless on single-precision floats. A separate version for double-precision has been made instead.
- In float_format_ints, there is one test case specific to single-precision floats which verifies that the largest possible mantissa value, 16777215, can be used to store that exact number and retrieve it as-is. Unfortunately, the rounding in the simplest version of the new algorithm makes it display as a slightly different number. This would cause the CI test to fail on single-precision floats when the improved algorithm is not enabled.

Trade-offs and Alternatives
It is unclear at this point if the simplest version of this improvement is worth the change:

The full version of the enhancement makes much more of a difference in terms of precision, both for double-precision and single-precision floats, but it causes about 20% overhead on conversion time and makes the code a bit bigger.
Looking forward to reading your feedback...
Edit 1: See #17444 (comment) for updates on accuracy results
Edit 2: Updated values in #17444 (comment)