Commit fd471b8

gh-118761: Reduce import time of gettext.py by delaying re import
gettext is often imported in programs that may never translate anything. In fact, the `struct` module is already imported lazily when parsing GNUTranslations, to speed up the case where there are no .mo files. The `re` module is used in the same situation, but behind a chain of functions that only GNUTranslations calls.

Cache the compiled regex globally the first time it is used. The `re.finditer()` function call can be converted to a method call on the compiled pattern object (it always could), which is slightly more efficient and is necessary for the conditional `re` import.
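The deferred-import-plus-cache idea described above can be sketched in isolation. This is a minimal illustration, not code from the commit: `find_words` and `_word_pattern` are hypothetical names, and the pattern is a placeholder.

```python
_word_pattern = None  # hypothetical module-level cache, filled on first use


def find_words(text):
    """Return the words in text, importing re only on the first call."""
    global _word_pattern
    if _word_pattern is None:
        # Deferred import: importing this module does not pay for re;
        # only the first call to find_words() does.
        import re
        _word_pattern = re.compile(r"\w+")
    # Later calls skip the branch above and reuse the compiled pattern.
    return [mo.group() for mo in _word_pattern.finditer(text)]


print(find_words("no .mo files"))  # ['no', 'mo', 'files']
```

Because `re` is bound only inside the function, the loop has to call a method on the cached pattern object rather than a module-level `re` function, which is the constraint the commit message mentions.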
1 parent d05140f commit fd471b8

2 files changed: 21 additions, 15 deletions


Lib/gettext.py

Lines changed: 18 additions & 15 deletions
@@ -48,7 +48,6 @@
 
 import operator
 import os
-import re
 import sys
 
 
@@ -70,22 +69,26 @@
 # https://www.gnu.org/software/gettext/manual/gettext.html#Plural-forms
 # http://git.savannah.gnu.org/cgit/gettext.git/tree/gettext-runtime/intl/plural.y
 
-_token_pattern = re.compile(r"""
-        (?P<WHITESPACES>[ \t]+)                    | # spaces and horizontal tabs
-        (?P<NUMBER>[0-9]+\b)                       | # decimal integer
-        (?P<NAME>n\b)                              | # only n is allowed
-        (?P<PARENTHESIS>[()])                      |
-        (?P<OPERATOR>[-*/%+?:]|[><!]=?|==|&&|\|\|) | # !, *, /, %, +, -, <, >,
-                                                     # <=, >=, ==, !=, &&, ||,
-                                                     # ? :
-                                                     # unary and bitwise ops
-                                                     # not allowed
-        (?P<INVALID>\w+|.)                           # invalid token
-    """, re.VERBOSE|re.DOTALL)
-
+_token_pattern = None
 
 def _tokenize(plural):
-    for mo in re.finditer(_token_pattern, plural):
+    global _token_pattern
+    if _token_pattern is None:
+        import re
+        _token_pattern = re.compile(r"""
+                (?P<WHITESPACES>[ \t]+)                    | # spaces and horizontal tabs
+                (?P<NUMBER>[0-9]+\b)                       | # decimal integer
+                (?P<NAME>n\b)                              | # only n is allowed
+                (?P<PARENTHESIS>[()])                      |
+                (?P<OPERATOR>[-*/%+?:]|[><!]=?|==|&&|\|\|) | # !, *, /, %, +, -, <, >,
+                                                             # <=, >=, ==, !=, &&, ||,
+                                                             # ? :
+                                                             # unary and bitwise ops
+                                                             # not allowed
+                (?P<INVALID>\w+|.)                           # invalid token
+            """, re.VERBOSE|re.DOTALL)
+
+    for mo in _token_pattern.finditer(plural):
         kind = mo.lastgroup
         if kind == 'WHITESPACES':
             continue
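For readers who want to see what the relocated pattern matches, here is a small standalone sketch. The pattern text is copied from the diff; `TOKEN_PATTERN` and `token_kinds` are hypothetical names, and the helper is a simplified stand-in for `_tokenize`, not the real function. It also checks that the function form and the method form of `finditer` yield the same matches.

```python
import re

# Pattern text copied from the diff; the comments on operators are condensed.
TOKEN_PATTERN = re.compile(r"""
        (?P<WHITESPACES>[ \t]+)                    | # spaces and horizontal tabs
        (?P<NUMBER>[0-9]+\b)                       | # decimal integer
        (?P<NAME>n\b)                              | # only n is allowed
        (?P<PARENTHESIS>[()])                      |
        (?P<OPERATOR>[-*/%+?:]|[><!]=?|==|&&|\|\|) | # arithmetic, comparison, logical, ?:
        (?P<INVALID>\w+|.)                           # invalid token
    """, re.VERBOSE | re.DOTALL)


def token_kinds(plural):
    """Return (kind, text) pairs for a plural expression, skipping whitespace."""
    return [
        (mo.lastgroup, mo.group())
        for mo in TOKEN_PATTERN.finditer(plural)
        if mo.lastgroup != "WHITESPACES"
    ]


print(token_kinds("n != 1"))
# [('NAME', 'n'), ('OPERATOR', '!='), ('NUMBER', '1')]

# re.finditer(pattern, s) accepts a compiled pattern and matches the same
# spans as pattern.finditer(s); the method form just skips one dispatch.
expr = "n>1 ? 2 : 0"
via_function = [mo.span() for mo in re.finditer(TOKEN_PATTERN, expr)]
via_method = [mo.span() for mo in TOKEN_PATTERN.finditer(expr)]
assert via_function == via_method
```

A lone `&` falls through to the `INVALID` group, since only `&&` is listed among the operators.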
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+Reduce import time of :mod:`gettext` by up to ten times, by importing
+:mod:`re` on demand. In particular, ``re`` is no longer implicitly
+exposed as ``gettext.re``. Patch by Eli Schwartz.
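One way to observe the effect of this change is to ask a fresh interpreter whether importing `gettext` also loads `re`. The sketch below is an assumption-laden check, not part of the commit: `import_loads_re` is a hypothetical helper, and the answer depends on which Python version runs it.

```python
import subprocess
import sys


def import_loads_re(module="gettext"):
    """Report whether importing `module` in a fresh interpreter also loads re."""
    code = f"import sys; import {module}; print('re' in sys.modules)"
    # -I isolates the child from environment variables and user site-packages;
    # -S skips the site module, so startup code cannot import re first.
    result = subprocess.run(
        [sys.executable, "-I", "-S", "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


print(import_loads_re())
```

On an interpreter that includes this commit the call is expected to print `False`, and on older ones `True`, though other imports could change that. For timing rather than a yes/no answer, `python -X importtime -c "import gettext"` prints a per-module breakdown.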
