Add ASAN support to the zend allocator #18858

Open
wants to merge 48 commits into
base: master

Conversation

danog
Contributor

@danog danog commented Jun 16, 2025

This pull request adds ASAN support to the Zend allocator by automatically poisoning all unused pages, chunks, and heap management structures before the alloc, free, and other ZEND_API functions return.

Internally, the implementation uses the following rules:

  • Always poison memory (re)allocated by private (non-ZEND_API) allocation functions before returning it
  • Unpoison memory (re)allocated by public (ZEND_API) allocation functions before returning within the ZEND_API function
  • Always poison freed memory
  • Always poison unused memory during reallocation (where new_size < old_size)
  • When accessing private heap structures and fields, always unpoison before accessing and repoison immediately after
    • An exception to the above (for simplicity) is the main heap data structure, which is unpoisoned only when entering a ZEND_API function and repoisoned before exiting
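
The rules above can be sketched with a toy fixed-slot allocator. This is a minimal illustration, not the PR's code: the `DEMO_*` macros, `demo_malloc`, `demo_shrink`, `demo_free` and the slot arena are hypothetical names, and the wrappers assume the standard `ASAN_POISON_MEMORY_REGION` / `ASAN_UNPOISON_MEMORY_REGION` macros from `<sanitizer/asan_interface.h>`. Without `-fsanitize=address` the macros compile to no-ops.

```c
#include <assert.h>
#include <stddef.h>

/* Detect ASAN under both Clang (__has_feature) and GCC (__SANITIZE_ADDRESS__). */
#if defined(__has_feature)
# if __has_feature(address_sanitizer)
#  define DEMO_HAVE_ASAN 1
# endif
#endif
#if defined(__SANITIZE_ADDRESS__) && !defined(DEMO_HAVE_ASAN)
# define DEMO_HAVE_ASAN 1
#endif

#ifdef DEMO_HAVE_ASAN
# include <sanitizer/asan_interface.h>
# define DEMO_POISON(p, n)   ASAN_POISON_MEMORY_REGION((p), (n))
# define DEMO_UNPOISON(p, n) ASAN_UNPOISON_MEMORY_REGION((p), (n))
#else
# define DEMO_POISON(p, n)   ((void)(p), (void)(n))
# define DEMO_UNPOISON(p, n) ((void)(p), (void)(n))
#endif

#define SLOT_SIZE  32   /* hypothetical bin size, a multiple of ASAN's 8-byte granule */
#define SLOT_COUNT 4

static _Alignas(8) unsigned char arena[SLOT_SIZE * SLOT_COUNT];
static int in_use[SLOT_COUNT];

/* Private helper: the slot it returns is still poisoned. */
static void *demo_alloc_slot(void)
{
    for (int i = 0; i < SLOT_COUNT; i++) {
        if (!in_use[i]) {
            void *p = arena + (size_t)i * SLOT_SIZE;
            in_use[i] = 1;
            DEMO_POISON(p, SLOT_SIZE);
            return p;
        }
    }
    return NULL;
}

/* Public entry point: unpoisons the requested bytes before returning. */
void *demo_malloc(size_t size)
{
    if (size == 0 || size > SLOT_SIZE) {
        return NULL;
    }
    void *p = demo_alloc_slot();
    if (p != NULL) {
        DEMO_UNPOISON(p, size);
    }
    return p;
}

/* Shrinking in place: poison the tail that is no longer used. */
void *demo_shrink(void *ptr, size_t old_size, size_t new_size)
{
    assert(new_size > 0 && new_size <= old_size && old_size <= SLOT_SIZE);
    DEMO_POISON((unsigned char *)ptr + new_size, old_size - new_size);
    return ptr;
}

/* Freeing: poison the whole slot, so a later access is reported by ASAN. */
void demo_free(void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    size_t off = (size_t)((unsigned char *)ptr - arena);
    assert(off < sizeof(arena) && off % SLOT_SIZE == 0);
    in_use[off / SLOT_SIZE] = 0;
    DEMO_POISON(ptr, SLOT_SIZE);
}
```

When built with `-fsanitize=address`, reading through a pointer after `demo_free` should be reported as a use-after-poison, the same error class as in the trace quoted later in this thread.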

This is what allowed me to find #18833, before I found the fast shutdown workaround.

@danog danog force-pushed the asan_zend_alloc branch from 197f4a3 to 2f84241 Compare June 16, 2025 12:14
@danog danog force-pushed the asan_zend_alloc branch from 488d33d to e88ec6d Compare June 17, 2025 12:25
@danog danog marked this pull request as ready for review June 17, 2025 13:06
@danog danog requested review from dstogov and bukka as code owners June 17, 2025 13:06
@danog
Contributor Author

danog commented Jun 17, 2025

@dstogov / @nielsdos / @iluuu1994 This is now ready for review!

The implementation is finalized, and it has revealed a shutdown memory leak in php-fpm when terminated via SIGTERM, caused by alloc_globals_dtor not being invoked. Surprisingly, the leak is revealed only by the use of the poison functions: turning the ZEND_MM_POISON/ZEND_MM_UNPOISON macros into no-ops hides the issue again, so it is not caused by the other code changes or reorganization.

This PR also includes the changes from #18834, as the bug is detectable by ASAN now.

Side note, tracked_malloc also poisons/unpoisons the heap to avoid issues with the observe proxies.

Member

@nielsdos nielsdos left a comment

When trying to run ./run-tests.php with this I get an immediate ASAN crash that should be fixed:

==104448==ERROR: AddressSanitizer: use-after-poison on address 0x7b96b5c01080 at pc 0x560e145d67d4 bp 0x7ffd7eb60800 sp 0x7ffd7eb607f8
WRITE of size 8 at 0x7b96b5c01080 thread T0
    #0 0x560e145d67d3 in zend_mm_alloc_small_slow /run/media/niels/MoreData/php-src/Zend/zend_alloc.c:1507:14
    #1 0x560e145d4f8e in zend_mm_alloc_small /run/media/niels/MoreData/php-src/Zend/zend_alloc.c:1546:10
    #2 0x560e145c629b in zend_mm_alloc_heap /run/media/niels/MoreData/php-src/Zend/zend_alloc.c:1625:9
    #3 0x560e145cb2a4 in _emalloc /run/media/niels/MoreData/php-src/Zend/zend_alloc.c:3034:14
    #4 0x560e14c45e82 in cwd_globals_ctor /run/media/niels/MoreData/php-src/Zend/zend_virtual_cwd.c:145:2
    #5 0x560e14c45dce in virtual_cwd_startup /run/media/niels/MoreData/php-src/Zend/zend_virtual_cwd.c:211:2
    #6 0x560e14c5ec09 in zend_startup /run/media/niels/MoreData/php-src/Zend/zend.c:942:2
    #7 0x560e14320b8d in php_module_startup /run/media/niels/MoreData/php-src/main/main.c:2200:2
    #8 0x560e14c733b4 in php_cli_startup /run/media/niels/MoreData/php-src/sapi/cli/php_cli.c:398:9
    #9 0x560e14c6e243 in main /run/media/niels/MoreData/php-src/sapi/cli/php_cli.c:1330:6
    #10 0x7f96bb8906b4  (/usr/lib/libc.so.6+0x276b4) (BuildId: 468e3585c794491a48ea75fceb9e4d6b1464fc35)
    #11 0x7f96bb890768 in __libc_start_main (/usr/lib/libc.so.6+0x27768) (BuildId: 468e3585c794491a48ea75fceb9e4d6b1464fc35)
    #12 0x560e13003584 in _start (/run/media/niels/MoreData/php-src/sapi/cli/php+0xa03584) (BuildId: 04a8f40ab178fb1528246ccec118a5d2e939fd1b)

@danog
Contributor Author

danog commented Jul 7, 2025

@nielsdos what compiler, platform, configure flags and memory manager envvars are you using?

@nielsdos
Member

nielsdos commented Jul 7, 2025

what compiler, platform, configure flags and memory manager envvars are you using?

It seems I was testing on an older checkout. I tested it again after resetting to your branch and it works, sorry for the confusion...

Member

@nielsdos nielsdos left a comment

Besides the remaining cs nit I don't see anything obviously wrong. Thanks for the clarifications. I think this patch is worth it. I'll leave this open for some time to see if others want to review this too.

@danog
Contributor Author

danog commented Jul 16, 2025

Ping? :)

@nielsdos
Member

cc @iluuu1994 ?

@iluuu1994
Member

@nielsdos I'll have a look in a few hours.

Member

@iluuu1994 iluuu1994 left a comment

Only approx. halfway through (line 1900 old diff). Overall this seems quite complex and easy to get wrong. It can't detect buffer overflows between two adjacent allocations within the same bin. I'm also not sure whether this can catch more errors in the allocator itself, as pointers are explicitly unpoisoned before usage, so even if a pointer were wrong, it would just be unpoisoned.

But if others think this should be merged, we should have at least one ASAN build in nightly that doesn't set the USE_ZEND_ALLOC=0 env var. It would also be good to run a test COMMUNITY build to reveal any remaining errors.
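
The adjacent-allocation limitation mentioned above can be shown with a toy model. This sketch does not use ASAN itself: a plain byte array stands in for shadow memory, and every name in it (`shadow`, `bin_init`, `slot_alloc`, `slot_free`, `access_is_reported`) is hypothetical. It also assumes both allocations fill their slots completely, so no poisoned padding sits between them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE  16
#define SLOT_COUNT 4
#define ARENA_SIZE (SLOT_SIZE * SLOT_COUNT)

/* One flag per arena byte: 1 means "poisoned" (toy stand-in for ASAN shadow). */
static unsigned char shadow[ARENA_SIZE];

static void shadow_set(size_t off, size_t len, unsigned char value)
{
    assert(off <= ARENA_SIZE && len <= ARENA_SIZE - off);
    memset(shadow + off, value, len);
}

/* True if any byte in [off, off + len) is poisoned, i.e. a checker would report. */
static bool access_is_reported(size_t off, size_t len)
{
    assert(off <= ARENA_SIZE && len <= ARENA_SIZE - off);
    for (size_t i = 0; i < len; i++) {
        if (shadow[off + i]) {
            return true;
        }
    }
    return false;
}

/* A fresh bin starts fully poisoned. */
static void bin_init(void)
{
    shadow_set(0, ARENA_SIZE, 1);
}

/* Handing out a slot unpoisons it; returns the slot's offset in the arena. */
static size_t slot_alloc(size_t slot)
{
    assert(slot < SLOT_COUNT);
    shadow_set(slot * SLOT_SIZE, SLOT_SIZE, 0);
    return slot * SLOT_SIZE;
}

/* Freeing a slot poisons it again. */
static void slot_free(size_t slot)
{
    assert(slot < SLOT_COUNT);
    shadow_set(slot * SLOT_SIZE, SLOT_SIZE, 1);
}
```

In this model, a write of `SLOT_SIZE + 4` bytes starting at slot 0 goes unreported while slot 1 is live, because the neighbouring slot is addressable and there is no redzone between the two. The same write is reported once slot 1 has been freed, which matches the use-after-free case.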

} while (0);
#endif

p = (zend_mm_free_slot*)ptr;
zend_mm_set_next_free_slot(heap, bin_num, p, heap->free_slot[bin_num]);
heap->free_slot[bin_num] = p;

ZEND_MM_POISON(p, bin_data_size[bin_num]);
Member

p should already be poisoned.

Contributor Author

@danog danog Jul 23, 2025

Only the header and shadow are poisoned in zend_mm_set_next_free_slot, not the body.

@@ -1586,6 +1812,8 @@ static zend_never_inline void *zend_mm_realloc_slow(zend_mm_heap *heap, void *pt
size_t orig_peak = heap->peak;
#endif
ret = zend_mm_alloc_heap(heap, size ZEND_FILE_LINE_RELAY_CC ZEND_FILE_LINE_ORIG_RELAY_CC);
ZEND_MM_UNPOISON(ret, size);
ZEND_MM_UNPOISON(ptr, copy_size);
Member

ptr should be unpoisoned already.

Contributor Author

copy_size is set to the size of the bin when size > old_size, size > the bin size, and old_size < the bin size, so the unpoison is needed.

@@ -1627,6 +1855,7 @@ static zend_never_inline void *zend_mm_realloc_huge(zend_mm_heap *heap, void *pt
#else
zend_mm_change_huge_block_size(heap, ptr, new_size ZEND_FILE_LINE_RELAY_CC ZEND_FILE_LINE_ORIG_RELAY_CC);
#endif
ZEND_MM_POISON(ptr, new_size);
Member

Why do we poison here? Edit: I see, it's unpoisoned right away in zend_mm_realloc_heap. Not sure if that's useful.

Contributor Author

It just follows the implementation rules

memcpy(ret, ptr, copy_size);
zend_mm_free_small(heap, ptr, old_bin_num);
} else {
/* reallocation in-place */
ret = ptr;
ZEND_MM_UNPOISON(ret, size);
Member

Should already be unpoisoned. If anything, the upper bytes should be poisoned.

Contributor Author

Done

@iluuu1994
Member

I'm also not sure whether this can catch more errors in the allocator itself, as pointers are explicitly unpoisoned before usage, so even if a pointer were wrong, it would just be unpoisoned.

Regarding this part: My fear is that, given most accesses unpoison/poison memory very locally, this reverts a lot of the benefits of poisoning. If an access is incorrect, it's very likely for the unpoisoning itself to also be incorrect, whether by depending on an incorrect parameter or applying incorrect logic to both unpoisoning and access. I will discuss this with Arnaud once he's back from vacation, so at some point next week.

Thanks!
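
The concern about very local unpoison/poison pairs can be illustrated with a toy model. It is only a sketch under stated assumptions: a byte array stands in for ASAN shadow memory, and `load_local`, `load_coarse`, `checked_load` and the `reports` counter are hypothetical names that do not appear in the PR.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HEAP_SIZE 64

static unsigned char heap_bytes[HEAP_SIZE];
static unsigned char shadow[HEAP_SIZE]; /* 1 = poisoned; toy stand-in for ASAN shadow */
static int reports;                     /* accesses a checker would have flagged */

static void set_shadow(size_t off, size_t len, unsigned char value)
{
    assert(off <= HEAP_SIZE && len <= HEAP_SIZE - off);
    memset(shadow + off, value, len);
}

/* Checked one-byte load: counts a report when the byte is poisoned. */
static unsigned char checked_load(size_t off)
{
    assert(off < HEAP_SIZE);
    if (shadow[off]) {
        reports++;
    }
    return heap_bytes[off];
}

/* Style A: unpoison right around the access, using the same offset as the access.
 * A wrong offset is unpoisoned too, so the bad access is never flagged. */
static unsigned char load_local(size_t off)
{
    set_shadow(off, 1, 0);
    unsigned char v = checked_load(off);
    set_shadow(off, 1, 1);
    return v;
}

/* Style B: the caller unpoisoned the known-good metadata range once, up front.
 * A wrong offset outside that range stays poisoned and is flagged. */
static unsigned char load_coarse(size_t off)
{
    return checked_load(off);
}
```

With everything poisoned, `load_local` produces no report for either a correct or an incorrect offset. After unpoisoning only a known-good range once, `load_coarse` reports the incorrect offset. Style B resembles the exception the PR description makes for the main heap structure.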

@danog
Contributor Author

danog commented Jul 23, 2025

Regarding this part: My fear is that, given most accesses unpoison/poison memory very locally, this reverts a lot of the benefits of poisoning. If an access is incorrect, it's very likely for the unpoisoning itself to also be incorrect, whether by depending on an incorrect parameter or applying incorrect logic to both unpoisoning and access. I will discuss this with Arnaud once he's back from vacation, so at some point next week.

The main class of errors this PR detects is use-after-free; buffer overflows are also detected to some extent, but vanilla ASAN should be used for better overflow detection.

However, in my experience use-after-free bugs are the most common type of memory corruption in PHP, mainly due to refcounting bugs, and this PR helps detect them without using the slower, more expensive vanilla ASAN.

5 participants