asyncio: slight optimizations for run_until_complete and sleep_ms #17699
Calculate ~POLLIN and ~POLLOUT as constants to remove the runtime cost of recomputing them on every pass, and unpack the queue entry rather than using repeated item lookups. Additionally, avoid the call to `max()` in `sleep_ms`. The wait time passed in is rarely negative, so the call to `max` is usually unnecessary. Instead, the code calls `ticks_add` if `t` is positive and otherwise uses the current ticks time.
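The constant hoisting and entry unpacking described above can be sketched as follows. This is an illustration only, not the code in this PR: the flag values, the `[reader, writer, obj]` entry layout and the `handle_event` helper are all assumptions made for the example.

```python
# Hypothetical flag values; on a real port they come from the select module.
POLLIN = 0x001
POLLOUT = 0x004

# Inverted masks computed once at import time, not on every loop iteration.
NOT_POLLIN = ~POLLIN
NOT_POLLOUT = ~POLLOUT


def handle_event(entry, ev):
    """Return the waiters to wake for event mask `ev`.

    `entry` is an assumed [reader, writer, obj] list for one stream.
    """
    # Unpack once instead of indexing entry[0] and entry[1] repeatedly.
    reader, writer, _obj = entry
    woken = []
    # Any event other than POLLOUT wakes the reader.
    if ev & NOT_POLLOUT and reader is not None:
        woken.append(reader)
    # Any event other than POLLIN wakes the writer.
    if ev & NOT_POLLIN and writer is not None:
        woken.append(writer)
    return woken
```

Whether hoisting the masks pays off depends on how often the branch runs; the saving per pass is one unary operation and one attribute lookup.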
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #17699      +/-   ##
==========================================
- Coverage   98.54%   98.44%   -0.10%
==========================================
  Files         169      171       +2
  Lines       21890    22208     +318
==========================================
+ Hits        21571    21863     +292
- Misses        319      345      +26
@@ -54,7 +55,8 @@ def __next__(self):
 # Use a SingletonGenerator to do it without allocating on the heap
 def sleep_ms(t, sgen=SingletonGenerator()):
     assert sgen.state is None
-    sgen.state = ticks_add(ticks(), max(0, t))
+    now = ticks()
+    sgen.state = ticks_add(now, t) if t > 0 else now
Does this give a measurable speed improvement? Is it worth it for the cost in code size?
I measure this change here as +5 bytes to the bytecode. The most taken path will be when `ticks_add()` needs to be called, which goes from 12 opcodes previously to now 16 opcodes. It's usually the opcode overhead that's slow, rather than the actual call (e.g. out to `max`, which should be quick with two small int args). So I would guess that this change actually makes things a little slower.
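The opcode counts quoted above are for MicroPython bytecode. A rough way to repeat that kind of comparison on a desktop is CPython's `dis` module, as in the sketch below. CPython's instruction set differs from MicroPython's, so the numbers it prints will not match the 12 and 16 mentioned here; `ticks`, `ticks_add`, `deadline_max` and `deadline_branch` are stand-ins written for this example, not the real functions.

```python
import dis


def ticks():
    # Fixed stand-in for the MicroPython tick counter.
    return 1000


def ticks_add(base, delta):
    # Simplified stand-in: plain addition, ignoring tick wrap-around.
    return base + delta


def deadline_max(t):
    # Shape of the original code: clamp with max().
    return ticks_add(ticks(), max(0, t))


def deadline_branch(t):
    # Shape of the proposed code: conditional expression, no max().
    now = ticks()
    return ticks_add(now, t) if t > 0 else now


def opcode_count(fn):
    # Number of bytecode instructions CPython compiled for fn.
    return sum(1 for _ in dis.get_instructions(fn))


if __name__ == "__main__":
    for fn in (deadline_max, deadline_branch):
        print(fn.__name__, opcode_count(fn))
```

Both variants return the same deadline for positive, zero and negative `t`, so any difference is purely in instruction count and dispatch cost.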
-    dt = max(0, ticks_diff(t.ph_key, ticks()))
+    dt = ticks_diff(t.ph_key, ticks())
+    if dt < 0:
+        dt = 0
As above, does this change here actually make things faster?
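One way to approach that question is a micro-benchmark of the two clamping styles. The sketch below uses CPython's `timeit` and two hypothetical helpers, `clamp_max` and `clamp_branch`; results from CPython on a desktop do not carry over to MicroPython on a microcontroller, so it only shows the shape of such a measurement.

```python
import timeit


def clamp_max(dt):
    # Original style: clamp a possibly negative delay with max().
    return max(0, dt)


def clamp_branch(dt):
    # Proposed style: explicit branch, no call to max().
    if dt < 0:
        dt = 0
    return dt


def compare(number=100_000):
    """Time both helpers over a positive, a zero and a negative delay."""
    values = (250, 0, -40)
    results = {}
    for fn in (clamp_max, clamp_branch):
        results[fn.__name__] = timeit.timeit(
            lambda: [fn(v) for v in values], number=number
        )
    return results


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(name, round(seconds, 4))
```

On the target hardware the same comparison would need the port's own tick functions and many iterations to rise above timer resolution.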
Summary
This is aimed at improving the timing of the asyncio core loop. It makes a few small optimizations to the core and yields about a 20% improvement in overall performance.
- [...] `select` module when used.
- In `sleep_ms`, `max` is not used for each call. Instead, an `if` expression handles the case when `t` is negative.
- In `run_until_complete`, a call to `max` is avoided.
- In `run_until_complete`, the methods for the task and IO queues are only looked up once.
Testing
I ran two tests on three platforms. Source code is given below. The tight-loop test just runs a single task as quickly as possible. The io-poll test uses a ThreadSafeFlag to run two tasks as quickly as possible but requires IO polling between the tasks.
tight-loop test
io-poll test
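As a rough illustration of what a tight-loop measurement of this kind can look like, here is a minimal sketch for CPython. It is not the test code used for this PR: the names `tight_loop` and `run_tight_loop` are hypothetical, and `asyncio.sleep(0)` stands in for whatever yield point the real test uses.

```python
import asyncio
import time


async def tight_loop(iterations):
    # A single task that yields to the scheduler as fast as it can.
    for _ in range(iterations):
        await asyncio.sleep(0)
    return iterations


def run_tight_loop(iterations=10_000):
    """Run the loop and return (iterations completed, elapsed seconds)."""
    start = time.perf_counter()
    done = asyncio.run(tight_loop(iterations))
    elapsed = time.perf_counter() - start
    return done, elapsed


if __name__ == "__main__":
    done, elapsed = run_tight_loop()
    print(done, "iterations in", round(elapsed, 4), "s")
```

An io-poll variant would add a second task and a wake-up primitive between the two so that each hand-off passes through the IO polling path.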