[RFC] Add compile-time checking of mp_printf format strings #17556

jepler · 2025-06-24T08:34:10Z

Summary

It's always nice when errors can be detected at compile time. In traditional C programs, gcc can check that the printf argument types match the printf format string. This has not been possible up to now with mp_printf, because it has both extensions to standard printf (e.g., the %q format type) and is missing things in standard printf (e.g., %zd is not supported).
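For comparison, gcc's built-in check for standard printf-style functions is requested with the `format` function attribute. The sketch below is illustrative only: `demo_snprintf` and `format_long` are hypothetical names, not MicroPython functions, and the wrapper forwards to the standard `vsnprintf`. The built-in check only knows the standard conversions, which is why it cannot describe mp_printf's extensions such as %q.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper, not part of MicroPython. The attribute tells gcc
 * that parameter 3 is a printf-style format string and that the variadic
 * arguments to check start at parameter 4. */
__attribute__((format(printf, 3, 4)))
static int demo_snprintf(char *buf, size_t size, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    assert(n >= 0); /* vsnprintf returns a negative value on error */
    return n;
}

/* Passes a long to "%ld", so the format and the argument type agree.
 * Calling demo_snprintf(buf, size, "%ld", 123) instead would be flagged
 * by -Wformat, because the literal 123 has type int. */
static int format_long(char *buf, size_t size, long value) {
    return demo_snprintf(buf, size, "%ld", value);
}
```

With `-Wall` (or `-Wformat`), gcc checks every call to `demo_snprintf` against its format string at compile time, without changing the generated code.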

To that end, I have developed a GCC plugin that does this checking at compile time. I've also made the necessary changes for the unix coverage build to complete with the plugin enabled, and enabled it during the coverage build process.

Creating in draft mode to get feedback and also because this is cumulative with some other outstanding PRs that are needed to get the CI board to green.

Testing

I built the unix port coverage variant & ran the tests locally. The plugin itself should cause no code changes. There is a small code growth reported, so one of the added casts must not actually be a no-op. I have not determined which one.

Trade-offs and Alternatives

As a gcc plugin this can only support gcc-based toolchains. clang and proprietary compilers would not work. This does not seem important, as this feature only produces diagnostics.

The plugin is GPL licensed. I started with a GPL-licensed plugin template, and plugins need to be GPL-or-compatible in license in order to be loaded in gcc anyway. The plugin code IMO does not affect the license situation of the output object code, as you'd get the exact same code with or without the plugin.

Missing support for:

Knowing whether %ll is runtime-supported
Handling enum arguments properly (just one enum was printed with %d in the coverage test, and I added a cast instead of fixing the checker)
Whether to add and use defines similar to the standard PRId32 for printing mp_{u,}int_t values: some ports need %d, others %ld, and some may even need %lld (I think 32-bit nanbox builds would require this, for example). E.g., #define PRIdPY "lld" next to typedef long long mp_int_t. This would replace the (int) casts, which were the easiest way to get local builds to finish.

CI may need a new package installed -- Debian needed gcc-12-plugin-dev, and there's no gcc-plugin-dev package (without a version number) to install the 'usual' one. I tried to code this; we'll see if it works.

Whether to enable it on more ports. This could catch problems in port-specific files, or for different fundamental object sizes.

Adding support for cmake-based builds and any other oddball build configurations

jepler · 2025-06-24T08:35:12Z

Example diagnostic:

coverage.c: In function ‘extra_coverage’:
coverage.c:206:9: error: argument 3: expected ‘long int’ or ‘long unsigned int’, not ‘int’ [-Werror=format=]
  206 |         mp_printf(&mp_plat_print, "%ld\n", 123); // long
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors

codecov · 2025-06-24T08:56:44Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.44%. Comparing base (17fbc5a) to head (4c2d376).

Additional details and impacted files

@@           Coverage Diff           @@
##           master   #17556   +/-   ##
=======================================
  Coverage   98.44%   98.44%           
=======================================
  Files         171      171           
  Lines       22208    22209    +1     
=======================================
+ Hits        21863    21864    +1     
  Misses        345      345

github-actions · 2025-06-24T08:58:27Z

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +0 +0.000% PYBV10
     mimxrt:    +8 +0.002% TEENSY40
        rp2:    +0 +0.000% RPI_PICO_W
       samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:    +2 +0.000% VIRT_RV32

jepler · 2025-06-24T18:48:56Z

The plugin can also run during the gcc Windows builds. I tested it locally using the cross-building steps, but I assume it'd work on Windows too. I won't complicate this PR by adding it, but I plan to add it in a subsequent PR. Some of the format string "findings" came out of me doing that locally.

jepler · 2025-06-25T07:33:50Z

@dpgeorge Please let me know if you think this is worth pursuing.

jepler · 2025-06-25T13:56:47Z

I did some checks in a sibling branch. On macOS, gcc is clang (!), which has a different, incompatible plugin API. gcc-14 is also installed via brew in the runner, but it is missing an indirect dependency required for plugin building (/opt/homebrew/Cellar/gcc@14/14.3.0/bin/../lib/gcc/14/gcc/aarch64-apple-darwin23/14/plugin/include/system.h:700:10: fatal error: 'gmp.h' file not found). This may be solvable with brew, but as it would not cover any additional source files or integer size models, it's probably not worth the time to figure out how to fix it.

jepler · 2025-06-25T15:40:09Z

Plugin support on Windows/MinGW has a number of limitations and additional requirements, so adding support for it is better postponed to a subsequent PR.

dpgeorge · 2025-07-04T07:10:35Z

Please let me know if you think this is worth pursuing.

I'm mildly in favour of this.

The three main things to consider would be:

The code being GPL licensed: no GPL code in this repo can be part of compiled firmware/executables. But the code added here is only part of the toolchain, which is OK. After all, gcc itself is GPL, and having this new plugin under GPL is the same as that, it has the same scope.
Increased complexity of the build process: before building any object files the plugin needs to be built. I guess this is fine, although it does add yet more rules to Make (and eventually CMake).
Being enabled by default: this will probably annoy some developers who would either need to install gcc-plugin-dev or disable it each time with DISABLE_PLUGIN=1. Luckily Arch Linux includes gcc-plugin-dev by default (for both gcc and arm-none-eabi-gcc) so that lowers the bar somewhat.

I did test out this PR locally with the unix coverage build and it works well.

It looks like this did find some cases of mp_printf that need to be fixed, so I guess that's a big reason to have it.

jepler · 2025-07-04T09:12:04Z

Thanks for the feedback.

I did consider trying to automatically enable the plugin if possible, and disable it if not; but this looked fragile.

One option would be to switch it to requiring the plugin to be explicitly enabled (and enabling it in as many CI jobs as possible). A developer who runs into a diagnostic during CI would then have the option of installing the plugin and using ENABLE locally, or of taking a stab at fixing it and submitting another job to CI.

Would it make more sense to work on fixing the diagnosed format problems in a separate PR (some of them do seem to be "real bugs" that will bite at runtime) and then bring the format checker in later? Or are you content to let the fixes and the checker land at the same time, which would delay the fixes?

Final question: C99 "solves" some format string problems by using macros like PRId64 that expand to the correct format string for the underlying type (e.g., it might expand to "lld" on an LLP64 system or "ld" on an LP64 system). What do you think of introducing such macros in MicroPython for mp_int_t/mp_uint_t? It looks like there's also a problem with printing pointer-width types as integers, which leads to the current Windows build failures (and which I'll correct).

(Some issues are getting fixed in #17538 because the problems DID turn up during CI; this branch will need to be rebased when that one goes in)

dpgeorge · 2025-07-04T13:24:39Z

One option would be to switch it to requiring the plugin to be enabled

Yes, I also thought about that.

At the very least, I think the option should be in the positive tense, eg GCC_ENABLE_MP_PRINTF_PLUGIN, and that could either be disabled by default, or enabled by default on selected builds (eg just unix coverage).

Would it make more sense to work on fixing the diagnosed format problems in a separate PR

Yes, please. That PR would be a much easier thing to review and merge.

Final question: C99 "solves" some format string problems by using macros like PRId64
...
What do you think of introducing such macros in micropython for mp_int_t/mp_uint_t?

Yes, I think that's a good idea.

I have many times considered using the PRIxxx macros for all printf strings, but it's a fair bit of work to do that. It's a good idea to start with them for mp_int_t/mp_uint_t, though.
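The PRId64-style approach discussed here can be sketched roughly as follows. This is only an illustration under stated assumptions: `demo_int_t` and `DEMO_PRId` are hypothetical stand-ins for `mp_int_t` and whatever macro name is eventually chosen, and the standard `snprintf` is used in place of mp_printf.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins: a port whose integer type is long long would
 * define the type and its matching format macro side by side. */
typedef long long demo_int_t;
#define DEMO_PRId "lld"

/* The format string is assembled by string-literal concatenation, so the
 * call site stays the same when a port picks a different underlying type. */
static int format_demo_int(char *buf, size_t size, demo_int_t value) {
    int n = snprintf(buf, size, "value=%" DEMO_PRId, value);
    assert(n >= 0); /* snprintf returns a negative value on error */
    return n;
}
```

A port with a plain `int` integer type would define the macro as `"d"` instead, and no cast would be needed at any call site.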

jepler · 2025-07-04T13:40:40Z

My thoughts on sequencing the work:

Land Coverage test sys.settrace & improve coverage #17538 because it has some initial format-string fixes
Cherry pick just the fixes from this PR into a new branch and get it to pass CI.
Introduce PRI-macros if it helps. Internally I'll use the format checker during this step.
Finally, once that new format string fix PR is in, rebase & return to this PR.

jepler · 2025-07-05T16:48:59Z

huh. two wrongs make a "fascinating". I didn't expect to see code size savings. Is it coming from 3f9b9c4, which I had actually intended to rebase out? Let's find out over at #17618

jepler · 2025-07-07T09:19:05Z

More sequencing: Once #17618 goes in, I'll rebase this, add XINT_FMT (the format string to print mp_int_t as hex), and use it for cell printing (@jepler objcell: Fix printing cell ID.). Once that's done and all green with the checker in this PR, I'll create a separate "fixes only" PR.

jepler · 2025-07-17T17:17:30Z

Rebased. I'll get it back to green, then submit a 2nd PR with "just the bug fixes".

The name field of type objects is of type uint16_t for efficiency, but when the type is passed to mp_printf it must be cast explicitly to type qstr. These locations were found using an experimental gcc plugin for mp_printf error checking, cross-building for x64 windows on Linux. Signed-off-by: Jeff Epler <jepler@gmail.com>

The type of the argument must match the format string. Add casts to ensure that they do. It's possible that casting from `size_t` to `unsigned` loses the correct values by masking off upper bits, but it seems likely that the quantities involved in practice are small enough that the %u formatter (32 bits on most platforms, 16 on pic16bit) will in fact hold the correct value. The alternative, casting to a wider type, adds code size. These locations were found using an experimental gcc plugin for mp_printf error checking, cross-building for x64 windows on Linux. In one case there was already a cast, but it was written incorrectly and did not have the intended effect. Signed-off-by: Jeff Epler <jepler@gmail.com>
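The truncation risk described in this commit message can be illustrated with fixed-width types. This is a sketch, not code from the PR: `uint64_t` stands in for a 64-bit `size_t` and `uint32_t` for a 32-bit `unsigned`, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned narrowing is well defined in C: the value is reduced modulo
 * 2^32, which is the same as masking off the upper 32 bits. */
static uint32_t narrow_to_u32(uint64_t n) {
    return (uint32_t)n;
}

/* Returns 1 when the narrowing cast preserves the value, 0 otherwise. */
static int survives_narrowing(uint64_t n) {
    return narrow_to_u32(n) == n;
}

/* Small values pass through unchanged; values of 2^32 or more do not. */
static void check_narrowing(void) {
    assert(narrow_to_u32(1000u) == 1000u);
    assert(narrow_to_u32(0x100000005ull) == 5u);
    assert(!survives_narrowing(0x100000005ull));
}
```

As long as the printed quantities stay below 2^32, the cast changes nothing, which is the trade-off the commit message accepts in exchange for smaller code.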

This fixes the following diagnostic produced by the plugin:
```
error: argument 3: Format ‘%x’ requires a ‘int’ or ‘unsigned int’ (32 bits), not ‘long unsigned int’ [size 64] [-Werror=format=]
```
Signed-off-by: Jeff Epler <jepler@gmail.com>

Before, the compiler plugin produced an error in the PYBD_SF6 build, which is a nanboxing build with 64-bit ints. I made the decision here to cast the value even though some significant bits might be lost after 49.7 days. However, the format used is "% 8d", which produces a consistent width output for small ticks values (up to about 1.1 days). I judged that it was more valuable to preserve the fixed width display than to accurately represent long time periods. Signed-off-by: Jeff Epler <jepler@gmail.com>

On the nanbox build, `o->obj` is a 64-bit type but `%p` formats a 32-bit type, leading to undefined behavior. Print the cell's ID as a hex integer instead. This location was found using an experimental gcc plugin for mp_printf error checking. Signed-off-by: Jeff Epler <jepler@gmail.com>

All these arguments are of type `mp_{u,}int_t`, but the actual value is always a small integer. Cast it so that it can format with the %d/%u formatter. Before, the compiler plugin produced an error in the PYBD_SF6 build, which is a nanboxing build with 64-bit ints. Signed-off-by: Jeff Epler <jepler@gmail.com>

On a build like nanbox, mp_uint_t is wider than u/intptr_t. Using a signed type for fetching pointer values resulted in erroneous results: like `<function f at 0xfffffffff7a60bc0>` instead of `<function f at 0xf7a60bc0>`. Signed-off-by: Jeff Epler <jepler@gmail.com>
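The erroneous value quoted in this commit message is what sign extension produces. The following sketch reproduces it with fixed-width types; the function names are hypothetical, and the conversion of an out-of-range value to `int32_t` is implementation-defined in C (gcc wraps modulo 2^32, which is assumed here).

```c
#include <assert.h>
#include <stdint.h>

/* Widen a 32-bit address through a signed type: when the top bit is set,
 * it is copied into the upper 32 bits of the result (sign extension). */
static uint64_t widen_signed(uint32_t addr) {
    return (uint64_t)(int64_t)(int32_t)addr;
}

/* Widen through an unsigned type: the upper 32 bits are zero. */
static uint64_t widen_unsigned(uint32_t addr) {
    return (uint64_t)addr;
}

/* The address from the commit message, widened both ways. */
static void check_widening(void) {
    assert(widen_signed(0xf7a60bc0u) == 0xfffffffff7a60bc0ull);
    assert(widen_unsigned(0xf7a60bc0u) == 0xf7a60bc0ull);
}
```

Addresses below 0x80000000 widen identically either way, so the problem only shows up for pointers in the upper half of the 32-bit address space.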

jepler force-pushed the compile-time-format-checker branch 2 times, most recently from df0062d to c39e2bd Compare June 24, 2025 08:49

jepler force-pushed the compile-time-format-checker branch 3 times, most recently from 0153502 to 536f6fb Compare June 24, 2025 16:48

jepler mentioned this pull request Jun 30, 2025

mpprint: Add %R (print object) & %K (print exception). #17583

Closed

dpgeorge added the py-core Relates to py/ directory in source label Jul 4, 2025

jepler force-pushed the compile-time-format-checker branch 3 times, most recently from 990966c to f0a7d80 Compare July 5, 2025 15:32

jepler mentioned this pull request Jul 5, 2025

mpprint: Rework integer vararg handling. #17618

Closed

jepler force-pushed the compile-time-format-checker branch 3 times, most recently from 1ecc7a8 to a171072 Compare July 6, 2025 07:56

jepler force-pushed the compile-time-format-checker branch 3 times, most recently from 02ff49b to 7424d62 Compare July 17, 2025 17:16

jepler force-pushed the compile-time-format-checker branch 2 times, most recently from 2ba0e92 to 7927022 Compare July 17, 2025 18:14

jepler marked this pull request as ready for review July 17, 2025 19:43

jepler mentioned this pull request Jul 17, 2025

Fix mpprintf argument type errors #17704

Open

jepler force-pushed the compile-time-format-checker branch from 7927022 to 83f62ef Compare July 18, 2025 11:12

jepler added 20 commits July 18, 2025 06:33

ports: Eliminate {U,}INT_FMT where redundant.

d4c95c7

The default definition in py/mpconfig.h is %u/%d, so these can be removed. Signed-off-by: Jeff Epler <jepler@gmail.com>

various: Define HEX_FMT.

2065abb

Signed-off-by: Jeff Epler <jepler@gmail.com>

coverage: Avoid type checking an invalid string.

6a10fc0

we still want this not to crash a runtime but the new static checker wouldn't like it. Signed-off-by: Jeff Epler <jepler@gmail.com>

coverage: Cast type names to qstr explicitly.

a2d47c2

Signed-off-by: Jeff Epler <jepler@gmail.com>

coverage: Cast values to int for printing.

c5aa849

During the coverage test, all the values encountered are within the range of %d. These locations were found using an experimental gcc plugin for mp_printf error checking. Signed-off-by: Jeff Epler <jepler@gmail.com>

coverage: Provide argmuents of expected types.

bc18cd7

Signed-off-by: Jeff Epler <jepler@gmail.com>

coverage: Remove unused printf args.

be6b5f6

Signed-off-by: Jeff Epler <jepler@gmail.com>

examplemodule: Cast arguments to printf.

bed74b7

These locations were found using an experimental gcc plugin for mp_printf error checking. Signed-off-by: Jeff Epler <jepler@gmail.com>

modlwip: Print timeout with correct format string.

6f7299e

As timeout is of type `mp_int_t`, it must be printed with INT_FMT. Before, the compiler plugin produced an error in the PYBD_SF6 build, which is a nanboxing build with 64-bit ints. Signed-off-by: Jeff Epler <jepler@gmail.com>

micropython_checks: Add compiler plugin.

a6b1593

Signed-off-by: Jeff Epler <jepler@gmail.com>

nrf: Can't list format plugin on linker commandline.

9f8b7f0

.. so filter it out, similar to stm32. Signed-off-by: Jeff Epler <jepler@gmail.com>

stm32: Don't list format plugin on linker commandline.

73b7d8b

It causes an error, so filter it out, similar to other compile-only flags. Signed-off-by: Jeff Epler <jepler@gmail.com>

ci: Enable format checking in many builds.

4c2d376

Signed-off-by: Jeff Epler <jepler@gmail.com>

jepler force-pushed the compile-time-format-checker branch from 83f62ef to 4c2d376 Compare July 18, 2025 11:38
