Drop deprecated sre_compile usage #214
b683f2b to 2cb090b
cpplint.py
Outdated
@@ -1049,7 +1048,7 @@ def Match(pattern, s):
   # performance reasons; factoring it out into a separate function turns out
   # to be noticeably expensive.
   if pattern not in _regexp_compile_cache:
-    _regexp_compile_cache[pattern] = sre_compile.compile(pattern)
+    _regexp_compile_cache[pattern] = re.compile(pattern)
Is there any meaningful difference between re.compile and re._compiler.compile? I must admit I don't know, and I don't know why the original developers chose sre_compile.
On Python 3.11:
import re
import sre_compile
sre_compile.compile == re._compiler.compile
Out[11]: True
sre_compile.compile == re.compile
Out[12]: False
The deprecation thread does not help me either https://bugs.python.org/issue47152
So re.compile() is here: https://github.com/python/cpython/blob/main/Lib/re/__init__.py#L280
It wraps re._compiler.compile() here: https://github.com/python/cpython/blob/main/Lib/re/__init__.py#L307
It uses caching for compiled expressions.
So... they are basically the same, except re.compile() has caching and is the official (future-safe) api.
But cpplint already has its own implementation of caching (without eviction), so it's a little silly to use re.compile, which has caching, and then also do caching in cpplint.
I believe re caches 256 compiled expressions, and cpplint has < 100 so far, with little reason to believe it will be more.
So if we do the migration in this PR, we can also remove the caching from cpplint (maybe check caching in python3.7 re)
@jspricke: Is that also your understanding? Do you have a strong preference?
That makes a lot of sense to me, I've added a commit to do that.
Ahh, sorry. Actually I worry a lot about this change. The number of patterns used is larger than the number of occurrences of Search that I counted yesterday. Some of those occurrences use variables when building the pattern, like Search(r'new\(\S+\)\s*' + matched_type, line), but then there is also Match(). Now even if we did exceed the re.compile cache size, it might not cause significant additional slowdown if the LRU cache hit rate is high. But I don't know if I want to risk this. My concern is that for very large codebases, running cpplint could become much slower with this change.
So either we find some load test with a representative very large codebase and check there is no slowdown, or we keep the duplicate caching logic as documented technical debt (by adding a small comment where cpplint does caching). Or we make very damn certain that we stay below a certain number of patterns.
Thoughts?
I don't really mind, could you simply cherry-pick from this what you see fit or do you prefer if I rework this again?
OK, don't worry, I'll cherry-pick. Sorry for being indecisive about this; I only have limited time for cpplint.
https://build.opensuse.org/request/show/1078255 by user dgarcia + dimstar_suse
- Add drop-sre-compile.patch upstream patch to fix issues with deprecated usage of sre_compile gh#cpplint/cpplint#214
- Update to 1.6.1
  * Fix #195 Fix post increment/decrement operator causing a false positive.
  * Fix #202 .hh files should not be considered system headers
  * Fix #207 Python2 incompatibility for loading CPPLINT.cfg file
  * Fix #184 NOLINT(clang-analyzer) comments should not cause warnings
- 1.6.0 (2022-02-19)
  * Fix #188: "Include the directory when naming header files" also for header files with other names like "*.hpp"
- 1.5.5 (2021-05-20)
  * Fix #172: Added 'size_t' to typecasts detected by CheckCStyleCast
  * Fixed wrong CLI help text: Each filter needs + or -
  * Fix #164: add elif as an exception for CheckSpacingForFunctionCall()
  * Fix google#346: --root
Hi! I'm currently looking into applying this for our Python 3.11 rebuild on Arch Linux. Unfortunately this does not apply on top of 1.6.1.
Remove unneeded dependencies. Remove use of legacy sre_compile: cpplint/cpplint#214 Remove use of pytest-runner (not needed for build). Simplify installation of files using install's -t switch. Remove the use of srcdir. git-svn-id: file:///srv/repos/svn-community/svn@1445958 9fca08f4-af9d-4005-b8df-a31f2cc04f65