Add ASAN support to the zend allocator #18858
base: master
@dstogov / @nielsdos / @iluuu1994 This is now ready for review! The implementation is finalized, and it has already revealed a shutdown memory leak in php-fpm when it is terminated via SIGTERM, because alloc_globals_dtor is not invoked. Surprisingly, the leak is revealed only by the use of the poison functions: turning the ZEND_MM_POISON/ZEND_MM_UNPOISON macros into no-ops hides the issue again, so it is not caused by the other code changes or reorganization. This PR also includes the changes from #18834, as that bug is now detectable by ASAN. Side note: tracked_malloc also poisons/unpoisons the heap to avoid issues with the observe proxies.
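For readers unfamiliar with manual poisoning, here is a minimal sketch of how such macros could be built on ASAN's public interface. The real ZEND_MM_POISON/ZEND_MM_UNPOISON definitions are not shown in this thread, so the `DEMO_*` names, the single-slot pool, and the no-op fallback are all assumptions made for illustration. Only `ASAN_POISON_MEMORY_REGION` and `ASAN_UNPOISON_MEMORY_REGION` from `<sanitizer/asan_interface.h>` are real.

```c
#include <stddef.h>

/* Detect ASAN under GCC (__SANITIZE_ADDRESS__) and Clang (__has_feature). */
#if defined(__SANITIZE_ADDRESS__)
# define DEMO_HAVE_ASAN 1
#elif defined(__has_feature)
# if __has_feature(address_sanitizer)
#  define DEMO_HAVE_ASAN 1
# endif
#endif

#ifdef DEMO_HAVE_ASAN
# include <sanitizer/asan_interface.h>
# define DEMO_MM_POISON(ptr, size)   ASAN_POISON_MEMORY_REGION((ptr), (size))
# define DEMO_MM_UNPOISON(ptr, size) ASAN_UNPOISON_MEMORY_REGION((ptr), (size))
#else
/* Without ASAN both macros compile to nothing (hypothetical fallback). */
# define DEMO_MM_POISON(ptr, size)   ((void)(ptr), (void)(size))
# define DEMO_MM_UNPOISON(ptr, size) ((void)(ptr), (void)(size))
#endif

#define DEMO_SLOT_SIZE 64

/* Hypothetical one-slot "allocator" standing in for a real heap. */
static unsigned char demo_slot[DEMO_SLOT_SIZE];
static int demo_slot_in_use = 0;

/* Hand the slot to the caller: make it addressable first. */
static void *demo_acquire(void)
{
    if (demo_slot_in_use) {
        return NULL;
    }
    demo_slot_in_use = 1;
    DEMO_MM_UNPOISON(demo_slot, DEMO_SLOT_SIZE);
    return demo_slot;
}

/* Take the slot back: poison it so stale pointers are reported under ASAN. */
static void demo_release(void *ptr)
{
    if (ptr != demo_slot) {
        return;
    }
    demo_slot_in_use = 0;
    DEMO_MM_POISON(demo_slot, DEMO_SLOT_SIZE);
}
```

Built with `-fsanitize=address`, a read through a pointer kept after `demo_release()` should be reported. Built without it, the macros vanish and the same access goes unnoticed.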
@nielsdos what compiler, platform, configure flags and memory manager env vars are you using?
It seems I was testing on an older checkout. I tested again after resetting to your branch and it works; sorry for the confusion...
Besides the remaining cs nit I don't see anything obviously wrong. Thanks for the clarifications. I think this patch is worth it. I'll leave this open for some time to see if others want to review this too.
Ping? :)
cc @iluuu1994 ?
@nielsdos I'll have a look in a few hours.
Only approx. halfway through (line 1900 old diff). Overall this seems quite complex and easy to get wrong. This can't detect buffer overflows for two adjacent allocations within the same bin. I'm also not sure whether this can catch more errors in the allocator itself, as pointers are explicitly unpoisoned before usage, so even if a pointer were wrong, it would just be unpoisoned.
But if others think this should be merged, we should have at least one ASAN build in nightly that doesn't set the USE_ZEND_ALLOC=0 env var. It would also be good to run a test COMMUNITY build to reveal any remaining errors.
} while (0);
#endif

p = (zend_mm_free_slot*)ptr;
zend_mm_set_next_free_slot(heap, bin_num, p, heap->free_slot[bin_num]);
heap->free_slot[bin_num] = p;

ZEND_MM_POISON(p, bin_data_size[bin_num]);
p should already be poisoned.
Only the header and shadow are poisoned in zend_mm_set_next_free_slot, not the body.
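To illustrate the distinction between the link header and the slot body, here is a toy free list. It is a sketch only: a byte-per-byte `toy_shadow` array stands in for ASAN's shadow memory so the state can be inspected without a sanitizer build, and every `toy_*` name is hypothetical rather than taken from the patch.

```c
#include <stddef.h>
#include <string.h>

#define TOY_SLOT_SIZE  32
#define TOY_SLOT_COUNT 4

typedef struct toy_free_slot {
    struct toy_free_slot *next;
} toy_free_slot;

/* Union keeps every slot suitably aligned for the link pointer. */
static union {
    toy_free_slot link;
    unsigned char bytes[TOY_SLOT_SIZE];
} toy_pool[TOY_SLOT_COUNT];

/* Toy stand-in for ASAN shadow memory: 1 = poisoned, 0 = addressable. */
static unsigned char toy_shadow[TOY_SLOT_COUNT * TOY_SLOT_SIZE];
static toy_free_slot *toy_free_list = NULL;

static size_t toy_offset(const void *p)
{
    return (size_t)((const unsigned char *)p - (const unsigned char *)toy_pool);
}

static void toy_poison(void *p, size_t n)   { memset(toy_shadow + toy_offset(p), 1, n); }
static void toy_unpoison(void *p, size_t n) { memset(toy_shadow + toy_offset(p), 0, n); }
static int  toy_is_poisoned(const void *p)  { return toy_shadow[toy_offset(p)]; }

/* Free: write the link, then poison the whole slot (header and body). */
static void toy_free_small(void *ptr)
{
    toy_free_slot *p = (toy_free_slot *)ptr;
    p->next = toy_free_list;
    toy_free_list = p;
    toy_poison(p, TOY_SLOT_SIZE);
}

/* Alloc: unpoison just the header to read the link, then the full slot. */
static void *toy_alloc_small(void)
{
    toy_free_slot *p = toy_free_list;
    if (p == NULL) {
        return NULL;
    }
    toy_unpoison(p, sizeof(*p));
    toy_free_list = p->next;
    toy_unpoison(p, TOY_SLOT_SIZE);
    return p;
}
```

If only the header were poisoned on free, a stale write into the body of a freed slot would still look valid, which is why the final full-slot poison matters in this sketch.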
@@ -1586,6 +1812,8 @@ static zend_never_inline void *zend_mm_realloc_slow(zend_mm_heap *heap, void *pt
size_t orig_peak = heap->peak;
#endif
ret = zend_mm_alloc_heap(heap, size ZEND_FILE_LINE_RELAY_CC ZEND_FILE_LINE_ORIG_RELAY_CC);
ZEND_MM_UNPOISON(ret, size);
ZEND_MM_UNPOISON(ptr, copy_size);
ptr should be unpoisoned already.
copy_size is set to the size of the bin when size > old_size, size > the size of the bin, and old_size < the size of the bin, so the unpoison is needed.
@@ -1627,6 +1855,7 @@ static zend_never_inline void *zend_mm_realloc_huge(zend_mm_heap *heap, void *pt
#else
zend_mm_change_huge_block_size(heap, ptr, new_size ZEND_FILE_LINE_RELAY_CC ZEND_FILE_LINE_ORIG_RELAY_CC);
#endif
ZEND_MM_POISON(ptr, new_size);
Why do we poison here? Edit: I see, it's unpoisoned right away in zend_mm_realloc_heap. Not sure if that's useful.
It just follows the implementation rules.
memcpy(ret, ptr, copy_size);
zend_mm_free_small(heap, ptr, old_bin_num);
} else {
/* reallocation in-place */
ret = ptr;
ZEND_MM_UNPOISON(ret, size);
Should already be unpoisoned. If anything, the upper bytes should be poisoned.
Done
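The suggestion above (poison the upper bytes on an in-place reallocation) can be sketched as follows. This is an illustration under stated assumptions, not the code that was committed: the `inplace_*` names and the single 64-byte slot are hypothetical, and the byte-exact shadow array is a simplification, since real ASAN tracks memory at 8-byte shadow granularity.

```c
#include <stddef.h>
#include <string.h>

#define INPLACE_BIN_SIZE 64

/* One slot of a hypothetical 64-byte bin plus a toy shadow (1 = poisoned). */
static unsigned char inplace_slot[INPLACE_BIN_SIZE];
static unsigned char inplace_shadow[INPLACE_BIN_SIZE];

static int inplace_is_poisoned(size_t offset)
{
    return inplace_shadow[offset];
}

/*
 * In-place reallocation within the same bin: the pointer does not move,
 * only the boundary between usable bytes and poisoned tail changes.
 * Returns NULL when the request no longer fits the bin.
 */
static void *inplace_realloc(void *ptr, size_t new_size)
{
    if (new_size > INPLACE_BIN_SIZE) {
        return NULL; /* a real allocator would move to a larger bin */
    }
    /* Usable prefix [0, new_size) becomes addressable. */
    memset(inplace_shadow, 0, new_size);
    /* Upper bytes [new_size, bin size) are poisoned. */
    memset(inplace_shadow + new_size, 1, INPLACE_BIN_SIZE - new_size);
    return ptr;
}
```

Shrinking from 40 to 24 bytes leaves bytes 24 to 63 poisoned in this model, so a write based on the old size would be flagged instead of silently landing in slack space.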
Regarding this part: My fear is that, given most accesses unpoison/poison memory very locally, this negates a lot of the benefits of poisoning. If an access is incorrect, it's very likely for the unpoisoning itself to also be incorrect, whether by depending on an incorrect parameter or applying incorrect logic to both unpoisoning and access. I will discuss this with Arnaud once he's back from vacation, so at some point next week. Thanks!
The main class of errors this PR detects is use-after-free; buffer overflows are also detected to some extent, but for better detection vanilla ASAN should be used. However, in my experience use-after-free bugs are the most common type of memory corruption in PHP, mainly due to refcounting bugs, and this PR helps detect those without using the more expensive and slower vanilla ASAN.
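As a rough illustration of the refcounting scenario described above, the following toy model shows a second holder that never incremented the refcount reading a value after the owner released it. It is a sketch under assumptions: the `rc_*` names are hypothetical, and a single flag stands in for ASAN's shadow state, so the "report" is just a return code.

```c
#include <stddef.h>

typedef struct {
    int refcount;
    int value;
} rc_val;

static rc_val rc_storage;       /* single slot backing the value */
static int rc_storage_poisoned; /* toy stand-in for ASAN's shadow state */

static rc_val *rc_new(int value)
{
    rc_storage_poisoned = 0;
    rc_storage.refcount = 1;
    rc_storage.value = value;
    return &rc_storage;
}

static void rc_release(rc_val *v)
{
    if (--v->refcount == 0) {
        /* The allocator would poison the freed slot at this point. */
        rc_storage_poisoned = 1;
    }
}

/* Checked read: 0 on success, -1 where ASAN would report a use after free. */
static int rc_checked_read(const rc_val *v, int *out)
{
    if (rc_storage_poisoned) {
        return -1;
    }
    *out = v->value;
    return 0;
}
```

Without poisoning, the freed slot still belongs to memory the allocator keeps mapped, so the stale read would most likely succeed and return old data.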
It does in debug builds due to the debug info surrounding the allocation being poisoned (in addition to any alignment padding or regions past the requested size). I think this is similar in effect to ASAN's red zones, except the zones are fixed size in our case and are not guaranteed to exist before the first block in a bin, or after the last block. I think the main benefit of this change is not to detect bugs in the allocator itself, but to be able to spot memory management issues without disabling Zend MM. This enables us to run ASAN builds in conditions that are closer to those of non-ASAN builds.
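The red-zone point can be made concrete with a small model of two adjacent slots in one bin. This is a sketch with hypothetical `adj_*` names and a byte-exact toy shadow: when a request fills its slot exactly, an overflow lands in the neighbour's addressable memory and goes unreported, while any poisoned slack past the requested size acts as a fixed-size red zone.

```c
#include <stddef.h>
#include <string.h>

#define ADJ_SLOT_SIZE  16
#define ADJ_SLOT_COUNT 2

/* Toy shadow for two adjacent slots of one bin: 1 = poisoned. */
static unsigned char adj_shadow[ADJ_SLOT_SIZE * ADJ_SLOT_COUNT];

/*
 * Mark slot `i` as allocated with `requested` usable bytes (requested must
 * not exceed ADJ_SLOT_SIZE). Bytes past the requested size stay poisoned.
 */
static void adj_alloc(size_t i, size_t requested)
{
    unsigned char *s = adj_shadow + i * ADJ_SLOT_SIZE;
    memset(s, 0, requested);
    memset(s + requested, 1, ADJ_SLOT_SIZE - requested);
}

/*
 * Would an access `offset` bytes from the start of slot `i` be reported?
 * The offset may run past the slot, which models an overflow; the caller
 * must keep it inside the two-slot shadow.
 */
static int adj_access_is_reported(size_t i, size_t offset)
{
    return adj_shadow[i * ADJ_SLOT_SIZE + offset] != 0;
}
```

With both slots fully used, `adj_access_is_reported(0, ADJ_SLOT_SIZE)` is 0 (the overflow is missed). After `adj_alloc(0, 10)`, an access at offset 10 is reported.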
But is there a big incentive to compile with ASAN but not to use I won't object to this, but I am currently failing to see the benefit. In any case, this needs better CI coverage. And I will need to review the second half of this PR. 🙂 And as mentioned, it would be good to see a COMMUNITY build pass without Zend MM being enabled.
This pull request adds ASAN support to the Zend allocator by automatically poisoning all unused pages, chunks and heap management structures before exiting the alloc, free, etc. functions (all of the ZEND_API functions).
Internally, the implementation uses the following rules:
This is what allowed me to find #18833, before I found the fast shutdown workaround.