asyncio: slight optimizations for run_until_complete and sleep_ms #17699
base: master
Conversation
Code size report:
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##           master   #17699      +/-   ##
==========================================
- Coverage   98.54%   98.38%   -0.16%
==========================================
  Files         169      171       +2
  Lines       21890    22239     +349
==========================================
+ Hits        21571    21880     +309
- Misses        319      359      +40

View full report in Codecov by Sentry.
extmod/asyncio/core.py
Outdated
@@ -54,7 +55,8 @@ def __next__(self):
 # Use a SingletonGenerator to do it without allocating on the heap
 def sleep_ms(t, sgen=SingletonGenerator()):
     assert sgen.state is None
-    sgen.state = ticks_add(ticks(), max(0, t))
+    now = ticks()
+    sgen.state = ticks_add(now, t) if t > 0 else now
Does this give a measurable speed improvement? Is it worth it for the cost in code size?

I measure this change here as +5 bytes of bytecode. The most taken path will be when ticks_add() needs to be called, which goes from 12 opcodes previously to 16 opcodes now. It's usually the opcode overhead that's slow, rather than the actual call (e.g. out to max, which should be quick with two small int args). So I would guess that this change actually makes things a little slower.
@dpgeorge thank you for asking. I spent too long working on this. It turns out that it does make a performance improvement on all platforms (i.e. both Unix and MCUs), but it isn't substantial. I'm happy to remove it and search for a more substantial impact.
Thanks for investigating. Unless it's a significant improvement, I'd prefer to leave it as-is (i.e. prefer shorter bytecode over performance).
Calculate ~POLLIN and ~POLLOUT as constants to remove the runtime cost of continuously recalculating them, and unpack the queue entry rather than using repeated item lookups. Additionally, avoid the call to max() in sleep_ms: the wait time specified will generally not be negative, so the call to `max` is usually unnecessary. Instead, the code either calls `ticks_add` if `t` is positive or else uses the current ticks time.
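A minimal illustration of the "unpack once" pattern mentioned in the commit message (the entry shape and names here are made up for the example; this is not the actual core.py queue code):

```python
# Illustrative only -- a made-up three-element queue entry.
entry = ("task", 42, None)

# Repeated item lookups: each subscript compiles to a separate index operation.
kind = entry[0]
key = entry[1]
payload = entry[2]

# Unpacking the entry once avoids the repeated lookups.
kind, key, payload = entry
```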
Force-pushed from b4a3017 to f7c769c
A few suggestions, but please feel free to ignore.
@@ -3,6 +3,7 @@
 from time import ticks_ms as ticks, ticks_diff, ticks_add
 import sys, select
+from select import POLLIN, POLLOUT
Alas, I don't think it works to write POLLIN = const(select.POLLIN). I wonder if POLLIN = const(1); assert POLLIN == select.POLLIN benefits performance enough that it would be worth doing. (Is MicroPython asyncio supposed to be CPython compatible? I guess there's no guarantee of the value of the POLLIN/POLLOUT constants there. But there's no time.ticks_ms in CPython, so probably this is a non-goal.)
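A sketch of that suggestion. MicroPython's const() needs a compile-time constant expression, so it can't wrap select.POLLIN; the literal values below are an assumption (they match the usual poll flag values), and the assert is the proposed guard against them being wrong on some port:

```python
from micropython import const
import select

# POLLIN = const(select.POLLIN)  # doesn't work: const() needs a literal

POLLIN = const(1)   # assumed value, checked below
POLLOUT = const(4)  # assumed value, checked below
assert POLLIN == select.POLLIN and POLLOUT == select.POLLOUT
```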
 else:
     sm = self.map[id(s)]
     assert sm[idx] is None
     assert sm[1 - idx] is not None
     sm[idx] = cur_task
-    self.poller.modify(s, select.POLLIN | select.POLLOUT)
+    self.poller.modify(s, POLLIN | POLLOUT)
Any measurable benefit to having POLLANY = POLLIN | POLLOUT, to avoid the calculation here?
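A sketch of that idea (POLLANY is a name invented in the review comment, not an existing select constant):

```python
from select import POLLIN, POLLOUT

# Computed once at import time instead of on every poller.modify() call.
POLLANY = POLLIN | POLLOUT

# ...then in the hot path it would read:
# self.poller.modify(s, POLLANY)
```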
Summary

This is aimed at improving the loop timing of the asyncio core loop. It makes a few small optimizations to the core and realizes about a 20% impact in overall performance.

- POLLIN and POLLOUT are only looked up once from the select module when used.
- In sleep_ms, max is not used for each call. Instead, an if expression handles the case when t is negative.
- In run_until_complete, a call to max is avoided.
- In run_until_complete, the methods for the task and IO queues are only looked up once (a generic sketch of this pattern follows the list).
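A minimal sketch of the bound-method caching in the last bullet (a generic example, not the real queue types from core.py):

```python
class TaskQueue:
    def __init__(self):
        self.items = []

    def push(self, task):
        self.items.append(task)

q = TaskQueue()

# Attribute lookup happens on every iteration:
for i in range(1000):
    q.push(i)

# Caching the bound method hoists the lookup out of the loop,
# which is the pattern run_until_complete now uses for its queues.
push = q.push
for i in range(1000):
    push(i)
```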
Testing

I ran two tests on three platforms; source code is given below. The tight-loop test just runs a single task as quickly as possible. The second test uses a ThreadSafeFlag to run two tasks as quickly as possible, but requires IO polling between the tasks.
tight-loop test
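(The test source was collapsed in the captured page. Below is a hypothetical sketch of a tight-loop benchmark of the shape described above, with a made-up iteration count; it is not the PR author's actual code.)

```python
# Hypothetical reconstruction of the tight-loop test.
import asyncio
import time

N = 100_000  # assumed iteration count

async def main():
    t0 = time.ticks_ms()
    for _ in range(N):
        await asyncio.sleep_ms(0)  # yield to the scheduler without waiting
    dt = time.ticks_diff(time.ticks_ms(), t0)
    print("tight loop:", N, "iterations in", dt, "ms")

asyncio.run(main())
```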
io-poll test
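(Also collapsed in the captured page. A hypothetical sketch of a ThreadSafeFlag ping-pong that forces IO polling between two tasks, again with a made-up count; not the PR author's actual code.)

```python
# Hypothetical reconstruction of the io-poll test.
import asyncio
import time

N = 10_000  # assumed iteration count
flag = asyncio.ThreadSafeFlag()

async def setter():
    # Keep setting the flag; waiting on a ThreadSafeFlag goes through
    # the IO queue, so every round trip exercises the poll path.
    while True:
        flag.set()
        await asyncio.sleep_ms(0)

async def main():
    t0 = time.ticks_ms()
    s = asyncio.create_task(setter())
    for _ in range(N):
        await flag.wait()
    dt = time.ticks_diff(time.ticks_ms(), t0)
    print("io poll:", N, "waits in", dt, "ms")
    s.cancel()

asyncio.run(main())
```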