Skip to content

Use CSPRNG for random UUID node ID #135244

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
LamentXU123 opened this issue Jun 8, 2025 · 15 comments
Open

Use CSPRNG for random UUID node ID #135244

LamentXU123 opened this issue Jun 8, 2025 · 15 comments
Assignees
Labels
3.13 bugs and security fixes 3.14 bugs and security fixes 3.15 new features, bugs and security fixes deferred-blocker stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@LamentXU123
Copy link
Contributor

LamentXU123 commented Jun 8, 2025

Feature or enhancement

Proposal:

Issue for: #135226
(I forgot to open issue first, sry)
I'll do the benchmark of the performance impact when switching to secrets.getrandbits() for the clock sequence soon. I'll post here.
I totally agree with the previous discussion. Summary:

  • Maybe make it more clearer in the docs of UUIDv8(?) Its possible to predict UUIDv8 if hackers have collected enough UUIDs. BUT, UUIDv8 is highly customizable. It's pretty ez if someone wants to use CSPRNG in UUIDv8s by writing a wrapper outside the call(or using UUIDv4 instead lol).
  • Change _random_getnode() from PRNG to CSPRNG which randomly generates 48-bits note when python failed to get MAC address
def _random_getnode():
+	import secrets
+	return secrets.randbits(48) | (1 << 40)
-	import random
-	return random.getranbits(48) | (1 << 40)
  • Change the logic when generating clock_seq from PRNG to CSPRNG in both uuidv1() and uuidv6(). Well I think this is of great importance. Those two algorithm are time-based. The only random thing comes from clock_seq. Hackers can predict the next UUID if they got 624*32 random bits generated from the previous UUIDs (1427 UUIDs will be enough).
def uuid1(node=None, clock_seq=None):
    ...
    if clock_seq is None:
+       import secrets
+       clock_seq = secrets.randbits(14) # instead of stable storage
-		import random
-		clock_seq = random.getrandbits(14) # instead of stable storage
    ...

def uuid6(node=None, clock_seq=None):
    ...
+       import secrets
+       clock_seq = secrets.randbits(14) # instead of stable storage
-		import random
-		clock_seq = random.getrandbits(14) # instead of stable storage
    ...

note that nothing will be changed if the performance impact is unacceptable.

Has this already been discussed elsewhere?

I have already discussed this feature proposal on Discourse

Links to previous discussion of this feature:

#135226

Linked PRs

@LamentXU123 LamentXU123 added the type-feature A feature request or enhancement label Jun 8, 2025
@LamentXU123
Copy link
Contributor Author

LamentXU123 commented Jun 8, 2025

Another CSPRNG( os.urandom() + cutoff + int.from_bytes()) is considered if its faster than secrets.randbits()

I've run the brench mark wieh pyperf. It seems like os.urandom() might be faster.

    from os import urandom
    return int.from_bytes(urandom(2), 'big') & 0x3fff

so to sum up, may we use os.urandom() instead of random.getrandbits() in clock_seq and notes generations, and make clearer security clarifications in doc of UUIDv8? @picnixz

@picnixz picnixz added the stdlib Python modules in the Lib dir label Jun 8, 2025
@picnixz
Copy link
Member

picnixz commented Jun 8, 2025

Hackers can predict the next UUID if they got 624*32 random bits generated from the previous UUIDs (1427 UUIDs will be enough).

To be precise, I should have said that this is assuming that the UUIDs are generated sequentially and that the underlying MT-19937 state is not touched in between, so it's a bit more complex.

@picnixz
Copy link
Member

picnixz commented Jun 8, 2025

so to sum up, may we use os.urandom() instead of random.getrandbits() in clock_seq and notes generations

I think I saw the benchmarks results somewhere. Can you post them in full as well? (I think you edited your message).

@picnixz picnixz changed the title Get random in UUID more cryptography secure Use CSPRNG for random UUID node ID & clock sequence #135226 Jun 8, 2025
@picnixz picnixz changed the title Use CSPRNG for random UUID node ID & clock sequence #135226 Use CSPRNG for random UUID node ID & clock sequenc Jun 8, 2025
@picnixz picnixz changed the title Use CSPRNG for random UUID node ID & clock sequenc Use CSPRNG for random UUID node ID & clock sequence Jun 8, 2025
@picnixz picnixz added type-bug An unexpected behavior, bug, or error 3.13 bugs and security fixes 3.14 bugs and security fixes 3.15 new features, bugs and security fixes and removed type-feature A feature request or enhancement labels Jun 8, 2025
@picnixz
Copy link
Member

picnixz commented Jun 8, 2025

I'll treat this as a bugfix but not a security bug fix as it's not critical enough IMO.

@LamentXU123
Copy link
Contributor Author

LamentXU123 commented Jun 8, 2025

so to sum up, may we use os.urandom() instead of random.getrandbits() in clock_seq and notes generations

I think I saw the benchmarks results somewhere. Can you post them in full as well? (I think you edited your message).

os.urandom(): Mean +- std dev: 205 ns +- 6 ns
random.getrandbits(): Mean +- std dev: 105 ns +- 5 ns
secrets.randbits(): Mean +- std dev: 17.6 us +- 2.3 us

pyperf on python 3.9

benchmark_UUID.json

@LamentXU123
Copy link
Contributor Author

LamentXU123 commented Jun 8, 2025

Hackers can predict the next UUID if they got 624*32 random bits generated from the previous UUIDs (1427 UUIDs will be enough).

To be precise, I should have said that this is assuming that the UUIDs are generated sequentially and that the underlying MT-19937 state is not touched in between, so it's a bit more complex.

You're right. And there are also secruity warnings of uuid1() and uuid6() in docs. So its just not a serious problem.

@picnixz
Copy link
Member

picnixz commented Jun 8, 2025

pyperf on python 3.9

Can you test using the latest main branch instead of Python 3.9? TiA

@picnixz
Copy link
Member

picnixz commented Jun 8, 2025

I've done the benchmarks on my side and they are different:

$ ./python -m pyperf timeit -s "import random" "random.getrandbits(14)"
Mean +- std dev: 14.0 ns +- 0.3 ns

$ ./python -m pyperf timeit -s "import os" "int.from_bytes(os.urandom(2)) & 0x3fff"
Mean +- std dev: 555 ns +- 5 ns

$ ./python -m pyperf timeit -s "import secrets" "secrets.randbits(14)"
Mean +- std dev: 599 ns +- 5 ns

So we're still losing quite a lot of speed.

@picnixz
Copy link
Member

picnixz commented Jun 8, 2025

This is a very huge drop of performance. We're 25x times slower now. Since the node ID is cached, I think we can live with a CSPRNG. However resampling the clock sequence every time is too much of a performance loss, so we won't use it. However, we can add a note.

@picnixz picnixz changed the title Use CSPRNG for random UUID node ID & clock sequence Use CSPRNG for random UUID node ID Jun 8, 2025
@LamentXU123
Copy link
Contributor Author

I've done the benchmarks on my side and they are different:

$ ./python -m pyperf timeit -s "import random" "random.getrandbits(14)"
Mean +- std dev: 14.0 ns +- 0.3 ns

$ ./python -m pyperf timeit -s "import os" "int.from_bytes(os.urandom(2)) & 0x3fff"
Mean +- std dev: 555 ns +- 5 ns

$ ./python -m pyperf timeit -s "import secrets" "secrets.randbits(14)"
Mean +- std dev: 599 ns +- 5 ns
So we're still losing quite a lot of speed.

My benchmark on 3.13:

os.urandom(): Mean +- std dev: 183 ns +- 7 ns
random.getrandbits(): Mean +- std dev: 75.0 ns +- 3.1 ns

Well yes, we are losing a quiet lot of speed

@LamentXU123
Copy link
Contributor Author

Correct. node ID is cached so even if its 25x slower we just need to generate it once.

@LamentXU123
Copy link
Contributor Author

LamentXU123 commented Jun 8, 2025

So the new changes would be only in _random_getnode:

def _random_getnode():
	import secrets
	return secrets.randbits(48) | (1 << 40)

and some notes in docs?

@picnixz
Copy link
Member

picnixz commented Jun 8, 2025

Yes. But prefer using os.urandom instead of secrets. os is likely already imported (and 48 is already a multiple of 8 so it's easy)

@LamentXU123
Copy link
Contributor Author

Ok. I will post the final changes soon at the linked PR

picnixz added a commit that referenced this issue Jun 8, 2025
…, §6.10.3 (#135226)

This aligns with the recommendations of RFC 9562, Section 6.10, paragraph 3 [1].

[1]: https://www.rfc-editor.org/rfc/rfc9562.html#section-6.10-3.

---------

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 8, 2025
…C 9562, §6.10.3 (pythonGH-135226)

This aligns with the recommendations of RFC 9562, Section 6.10, paragraph 3 [1].

[1]: https://www.rfc-editor.org/rfc/rfc9562.html#section-6.10-3.

---------
(cherry picked from commit 1cb7163)

Co-authored-by: LamentXU <108666168+LamentXU123@users.noreply.github.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
picnixz added a commit that referenced this issue Jun 8, 2025
…FC 9562, §6.10.3 (GH-135226) (#135255)

gh-135244: generate UUID random Node ID with a CSPRNG as per RFC 9562, §6.10.3 (GH-135226)

This aligns with the recommendations of RFC 9562, Section 6.10, paragraph 3 [1].

[1]: https://www.rfc-editor.org/rfc/rfc9562.html#section-6.10-3.

---------
(cherry picked from commit 1cb7163)

Co-authored-by: LamentXU <108666168+LamentXU123@users.noreply.github.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
@picnixz
Copy link
Member

picnixz commented Jun 8, 2025

For 3.13, I'll hold the backport because I need #134704, which I don't want to merge for a while (see #132901 (comment) for the rationale).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes 3.14 bugs and security fixes 3.15 new features, bugs and security fixes deferred-blocker stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
Development

No branches or pull requests

2 participants
pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy