Smarter use of a mutex in incremental HMAC and hash functions #135239


Open
picnixz opened this issue Jun 7, 2025 · 1 comment
Assignees: picnixz
Labels: extension-modules (C modules in the Modules dir), performance (Performance or resource usage), type-feature (A feature request or enhancement)

Comments

@picnixz (Member) commented Jun 7, 2025

Feature or enhancement

Proposal:

Currently, when doing an incremental HMAC, if a message is sufficiently large, we lock the HMAC object and release the GIL so that we can call HMAC_Update without holding the GIL. However, once that happens, we do the same for all subsequent data. I need to estimate whether a sequence of calls, first with very large data and then only with very small data, is impacted. Usually, update() is called as part of a chunk-based strategy, so we can expect the next block of data to be roughly as large as the one we are already hashing.

Nevertheless, I think we should turn off the mutex usage when we encounter a small block of data, since we may have started with a very large chunk and then continued with very small ones (at least below the hardcoded limit).

I'll do some benchmarks tomorrow but I wanted to create an issue for that.

Note: a similar argument holds for hash functions and their update().

Has this already been discussed elsewhere?

This is a minor feature which does not need prior discussion elsewhere.

Links to previous discussion of this feature:

No response

Linked PRs

@picnixz picnixz self-assigned this Jun 7, 2025
@picnixz picnixz added type-feature A feature request or enhancement extension-modules C modules in the Modules dir labels Jun 7, 2025
@picnixz picnixz changed the title Smarter use of a mutex in incremental HMAC Smarter use of a mutex in incremental HMAC and hash functions Jun 8, 2025
@picnixz (Member, Author) commented Jun 8, 2025

OK, my conclusion is that... it doesn't matter, except for blake2b, where releasing and re-acquiring the GIL is slightly slower. But that's on an idle system; on a busy one it may be a bit worse. I'll still make the adjustments, which shouldn't be much slower. The current code is:

    /* Switch to the locked path once a large buffer is seen. */
    if (!self->use_mutex && buf.len >= HASHLIB_GIL_MINSIZE) {
        self->use_mutex = true;
    }
    if (self->use_mutex) {
        /* Release the GIL and serialize on the per-object mutex. */
        Py_BEGIN_ALLOW_THREADS
        PyMutex_Lock(&self->mutex);
        update(self->hash_state, buf.buf, buf.len);
        PyMutex_Unlock(&self->mutex);
        Py_END_ALLOW_THREADS
    } else {
        /* Small buffer: update directly while holding the GIL. */
        update(self->hash_state, buf.buf, buf.len);
    }

Once self->use_mutex is true, the condition !self->use_mutex && buf.len >= HASHLIB_GIL_MINSIZE is always false, even when buf.len >= HASHLIB_GIL_MINSIZE holds. If we then switch to a small-data regime, we keep making unnecessary PyMutex_Lock/PyMutex_Unlock calls, whereas we could just perform the update directly. Two comparisons plus one LOAD and one STORE (to reset self->use_mutex = false afterwards) will likely be faster than saving one comparison while keeping the PyMutex_* calls.

I'll draft a PR to check whether it's worth it or not, as it can also be considered a small refactoring.

@picnixz picnixz added the performance Performance or resource usage label Jun 8, 2025