gh-101525: Use only safe identical code folding with BOLT #134642

Open
wants to merge 1 commit into
base: main
Contributor

@geofft geofft commented May 25, 2025

"Identical code folding" (ICF) is an optimization that finds functions with the same code and deduplicates them in the binary. While this is usually safe, it can cause observable behavior differences if the program relies on the two functions having different addresses.

CPython relies on this in (at least) Objects/typeobject.c, which defines two functions wrap_binaryfunc() and wrap_binaryfunc_l() with the same implementation, and stores their addresses in the slotdefs array. If these two functions have the same address, update_one_slot() in that file will fill in slots it shouldn't, causing, for instance, classes defined in Python that inherit from some built-in types to misbehave.
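To make the dependence on distinct addresses concrete, here is a toy Python model. It is only a sketch: the table layout, the pairing of slots with wrappers, and the helper names slots_for_wrapper and build_slotdefs are illustrative assumptions, not the real update_one_slot() logic. Object identity (`is`) stands in for address comparison, and aliasing one wrapper to the other stands in for ICF.

```python
# Toy model of why wrapper identity matters. All names here are
# illustrative; this is not CPython's update_one_slot() logic.

def wrap_binaryfunc(func, self, other):
    return func(self, other)


def wrap_binaryfunc_l(func, self, other):
    # Same body as wrap_binaryfunc, but a distinct function object.
    return func(self, other)


def slots_for_wrapper(slotdefs, wrapper):
    """Return the slot names whose table entry uses exactly this wrapper."""
    return [slot for slot, w in slotdefs if w is wrapper]


def build_slotdefs(folded):
    # With folded=True both entries share one object, as after ICF.
    left = wrap_binaryfunc if folded else wrap_binaryfunc_l
    return [("sq_concat", wrap_binaryfunc), ("nb_add", left)]


distinct = slots_for_wrapper(build_slotdefs(folded=False), wrap_binaryfunc)
merged = slots_for_wrapper(build_slotdefs(folded=True), wrap_binaryfunc)
print(distinct)  # ['sq_concat']
print(merged)    # ['sq_concat', 'nb_add']
```

Once the two wrappers are the same object, a lookup keyed on the wrapper can no longer tell the two table entries apart, so it matches a slot it should not.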

As of LLVM 20 (llvm/llvm-project#116275), BOLT has a "safe ICF" mode, where it looks to see if there are any uses of a function symbol outside function calls (e.g., relocations in data sections) and skips ICF on such functions. The intent is that this avoids observable behavior differences but still saves storage as much as possible.
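The decision described above can be sketched as a small Python model. This is an assumption-laden illustration, not BOLT's algorithm: the function fold_identical, the byte-string "bodies", and the address_taken set are all hypothetical stand-ins for what BOLT derives from the binary and its relocations.

```python
from collections import defaultdict


def fold_identical(functions, address_taken, safe=True):
    """Map each function name to the name whose code it will share.

    functions: dict of name -> code bytes (hypothetical stand-in).
    address_taken: names referenced other than by a direct call.
    """
    groups = defaultdict(list)
    for name in sorted(functions):
        groups[functions[name]].append(name)

    result = {}
    for names in groups.values():
        # In safe mode, address-taken functions keep their own copy.
        foldable = [n for n in names if not (safe and n in address_taken)]
        keeper = foldable[0] if foldable else None
        for n in names:
            result[n] = keeper if n in foldable else n
    return result


functions = {
    "wrap_binaryfunc": b"body-1",
    "wrap_binaryfunc_l": b"body-1",
    "helper_a": b"body-2",
    "helper_b": b"body-2",
}
address_taken = {"wrap_binaryfunc", "wrap_binaryfunc_l"}

safe_result = fold_identical(functions, address_taken, safe=True)
unsafe_result = fold_identical(functions, address_taken, safe=False)
```

In this model the safe mode still merges helper_b into helper_a, but leaves both wrappers alone because their addresses are stored in data; the unsafe mode merges the wrappers too.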

LLVM 20 is about two months old at the time of writing. For older LLVM versions, which lack this mode, we have to turn off ICF entirely.

This problem was previously noticed for Windows/MSVC in #53093 (and again in #24098), where the default behavior of PGO is to enable ICF (which they expand to "identical COMDAT folding") and we had to turn it off.

Contributor Author

geofft commented May 25, 2025

cc @zanieb
fyi @kmod who originally added -icf=1 to the BOLT flags

@corona10 corona10 self-assigned this May 25, 2025
@erlend-aasland erlend-aasland removed their request for review May 25, 2025 12:24
@corona10 corona10 changed the title Use only safe identical code folding with BOLT gh-101525: Use only safe identical code folding with BOLT May 26, 2025
[py_cv_bolt_icf_safe=no
${LLVM_BOLT} -icf=safe -o conftest.bolt conftest$EXEEXT >&AS_MESSAGE_LOG_FD 2>&1 dnl
&& py_cv_bolt_icf_safe=yes],
[AC_MSG_FAILURE([could not compile empty test program])])
Member

Suggested change
- [AC_MSG_FAILURE([could not compile empty test program])])
+ [AC_MSG_FAILURE([could not link empty test program with -icf=safe])])

It'd be good to be a bit more descriptive than "compilation failed".

Contributor Author

The autoconf syntax here is a little confusing. This message isn't from the attempt to optimize with -icf=safe, it's from actually compiling the test program. BOLT is a post-link optimizer, so we give it a compiled and linked executable as an argument. (There is also ICF support in many linkers, to make things more confusing.) I'm using autoconf's function for running test-compiles for the side effect of creating a binary to pass to BOLT, but we need to handle the case where the compile, for whatever reason, fails. I don't expect anyone to hit this error because if you couldn't compile programs you probably couldn't get this far in ./configure anyway, but it's good to have something for the "can't happen" case.

The pseudo-Python equivalent of this check is

py_bolt_icf_flag = "-icf=safe"

@autoconf.cache_check(
    "py_cv_bolt_icf_safe",
    f"whether {LLVM_BOLT} supports safe identical code folding",
)
def check_bolt_icf_safe():
    saved_cflags, saved_ldflags = CFLAGS, LDFLAGS
    CFLAGS, LDFLAGS = CFLAGS_NODIST, LDFLAGS_NODIST
    if autoconf.test_link(autoconf.make_program("", "")):
        py_cv_bolt_icf_safe = False
        p = subprocess.run(
            [LLVM_BOLT, "-icf=safe", "-o", "conftest.bolt", f"conftest{EXEEXT}"],
            stdout=autoconf.message_log, stderr=autoconf.message_log,
        )
        if p.returncode == 0:
            py_cv_bolt_icf_safe = True
    else:
        raise RuntimeError("could not compile empty test program")
    CFLAGS, LDFLAGS = saved_cflags, saved_ldflags
    return py_cv_bolt_icf_safe

if not check_bolt_icf_safe():
    py_bolt_icf_flag = ""

It does occur to me, writing this out, that you could theoretically hit this failure case if your compiler and linker don't support the flags needed to produce a BOLT-able binary (-Wl,--emit-relocs -fno-pie -no-pie), because this is the first time we're asking autoconf to use those flags in a test-compile. I have no idea why you would ask for BOLT if you don't have a vaguely compatible compiler, though, and you'd previously get a random compile error at some later point in the process, so I don't think this is making anything worse... but we could change the message to "could not compile and link test program with flags needed for BOLT" or something.

I would also accept the argument that I need wider indentation or some comments or something to make the autoconf readable.
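For readers who want something they can actually run, the probe-and-fall-back step can be sketched in plain Python. This is a minimal sketch under assumptions: the function name pick_bolt_icf_flag and the injectable run parameter are hypothetical, and it leaves out configure's caching, message log, and the test-link step.

```python
import subprocess


def pick_bolt_icf_flag(bolt, binary, run=subprocess.run):
    """Return "-icf=safe" if this BOLT accepts it, else "" (no ICF)."""
    cmd = [bolt, "-icf=safe", "-o", "conftest.bolt", binary]
    try:
        proc = run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except OSError:
        # BOLT could not be executed at all.
        return ""
    return "-icf=safe" if proc.returncode == 0 else ""
```

Passing run as a parameter is only there so the decision can be exercised without a real llvm-bolt; with the default it shells out like the configure check does.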

geofft added a commit to astral-sh/python-build-standalone that referenced this pull request May 30, 2025
astral-sh/uv#13610 reported a misbehavior that is the result of a
subclass of str incorrectly having its ->tp_as_number->nb_add slot
filled in with the value of PyUnicode_Type->tp_as_sequence->sq_concat.
There are some times when this is an appropriate thing to do when
subclassing, but this is not one of them. The logic to prevent it in
this case relies on two helper functions in the file, wrap_binaryfunc
and wrap_binaryfunc_l, having different addresses, even though they
contain identical code.

For some reason BOLT does not do this optimization in the shared library
(even though those are static functions and not exported), so we only
started seeing this in the static build.

BOLT in LLVM 20+ supports "safe" code folding, which uses heuristics about
relocations to determine whether a function's address is used in any way other
than a call. This seems to be enough to fix the issue. Add a patch to switch
to -icf=safe, submitted upstream as python/cpython#134642
@corona10 corona10 added needs backport to 3.13 bugs and security fixes needs backport to 3.14 bugs and security fixes labels Jun 1, 2025
Member

@corona10 corona10 left a comment

Thanks for digging into this issue. Now the BOLT build will be more stable :)
LGTM.

I like this change and it's a real issue: astral-sh/python-build-standalone#622
But I am not sure this is the best autoconf code we could add, so I will also wait for @erlend-aasland's review.
Erlend, the change looks good. Can you review, focusing on the autoconf code itself?

Comment on lines +2138 to +2149
saved_cflags="$CFLAGS"
saved_ldflags="$LDFLAGS"
CFLAGS="$CFLAGS_NODIST"
LDFLAGS="$LDFLAGS_NODIST"
AC_LINK_IFELSE(
[AC_LANG_PROGRAM([[]], [[]])],
[py_cv_bolt_icf_safe=no
${LLVM_BOLT} -icf=safe -o conftest.bolt conftest$EXEEXT >&AS_MESSAGE_LOG_FD 2>&1 dnl
&& py_cv_bolt_icf_safe=yes],
[AC_MSG_FAILURE([could not compile empty test program])])
CFLAGS="$saved_cflags"
LDFLAGS="$saved_ldflags"
Contributor

Consider using our WITH_SAVE_ENV macro; it automatically saves and restores CFLAGS, CPPFLAGS, LDFLAGS, and LIBS:

Suggested change
- saved_cflags="$CFLAGS"
- saved_ldflags="$LDFLAGS"
- CFLAGS="$CFLAGS_NODIST"
- LDFLAGS="$LDFLAGS_NODIST"
- AC_LINK_IFELSE(
- [AC_LANG_PROGRAM([[]], [[]])],
- [py_cv_bolt_icf_safe=no
- ${LLVM_BOLT} -icf=safe -o conftest.bolt conftest$EXEEXT >&AS_MESSAGE_LOG_FD 2>&1 dnl
- && py_cv_bolt_icf_safe=yes],
- [AC_MSG_FAILURE([could not compile empty test program])])
- CFLAGS="$saved_cflags"
- LDFLAGS="$saved_ldflags"
+ WITH_SAVE_ENV([
+ CFLAGS="$CFLAGS_NODIST"
+ LDFLAGS="$LDFLAGS_NODIST"
+ AC_LINK_IFELSE(
+ [AC_LANG_PROGRAM([], [])],
+ [py_cv_bolt_icf_safe=no
+ ${LLVM_BOLT} -icf=safe -o conftest.bolt conftest$EXEEXT >&AS_MESSAGE_LOG_FD 2>&1 dnl
+ && py_cv_bolt_icf_safe=yes],
+ [AC_MSG_FAILURE([could not compile empty test program])])
+ ])

Moreover, the double brackets ([[]]) are not needed; a single pair ([]) is sufficient.
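In the same pseudo-Python spirit as the earlier comment, the save-and-restore behavior described for WITH_SAVE_ENV can be pictured as a context manager. This is only an analogy under assumptions: the name with_save_env is hypothetical, and it models the four variables as entries in a mapping rather than as the shell variables configure actually uses.

```python
import os
from contextlib import contextmanager

# The four variables the review comment says the macro saves and restores.
SAVED_VARS = ("CFLAGS", "CPPFLAGS", "LDFLAGS", "LIBS")


@contextmanager
def with_save_env(env=os.environ):
    """Restore the saved variables on exit, even if the body raises."""
    saved = {name: env.get(name) for name in SAVED_VARS}
    try:
        yield env
    finally:
        for name, value in saved.items():
            if value is None:
                env.pop(name, None)  # was unset before; unset it again
            else:
                env[name] = value


env = {"CFLAGS": "-O2", "LDFLAGS": ""}
with with_save_env(env):
    env["CFLAGS"] = "-fno-pie"
    env["LIBS"] = "-lm"
```

After the block exits, env is back to its original contents, so the check body can overwrite CFLAGS and LDFLAGS without hand-written restore lines.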
