-
-
Notifications
You must be signed in to change notification settings - Fork 32.4k
gh-109934: notify cancelled futures on thread pool shutdown #134618
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
When `ThreadPoolExecutor` shuts down it cancels any pending futures, however at present it doesn't notify waiters. Thus their state stays as `CANCELLED` instead of `CANCELLED_AND_NOTIFIED` and any waiters are not awakened. Call `set_running_or_notify_cancel` on the cancelled futures to fix this.
…n the ProcessPoolExecutor in NEWS.
… blocking future has started before checking its status.
This comment was marked as spam.
This comment was marked as spam.
What on earth possessed me to say "I think I've managed to come up with [a unit test] that works works reliably", I have no idea. Utter foolishness. Oh well, we'll get there. |
Been there, done that! In my local tests I just added a time-based approach that sets the the process at 10 seconds OR fail directly: and a generic maximum runtime of the test on 4s. Which is fine for the current project which also has a 'kill all children' directly following the shutdown. I think a good approach would perhaps be to add a generic Lock, have each process stall on that except for one. The one gets picked up in the first 'batch' and then immediately errors out. Then call for executor.shutdown and subsequent release the lock to free up the stalled processes. You could then assert for the final n tasks to be correctly canceled. I suspect a runtime variance here is that the failed task might or might not free-up a worker for the next Lock-blocked task; but perhaps the exector blocked while a future's completeness is being handled? |
Yeah, that is basically what the test does: submits a bunch of tasks, the first
This should be reliable, as the shutdown is initiated synchronously while all workers are blocked. At the point any of the tasks complete the executor must be shutdown and no additional pending tasks will be started. However, there are lots of tricky details in ensuring this all works robustly. E.g. once the executor is shutdown use of any synchronisation primitives may fail, depending on timing and details of the implementation, so we have to handle Also the internal multiprocessing machinery uses a work queue and considers tasks running (and no longer pending) once they are enqueued, even if they haven't been distributed to workers. It eagerly enqueues some extra tasks over-and-above the max number of workers to keep the pipeline filled. Those tasks will not be cancelled, but may or may not actually run, depending on timing. Anyway, hopefully with all of that taken into account the test is now robust and reliable! |
When
ThreadPoolExecutor
shuts down it cancels any pending futures, however at present it doesn't notify waiters. Thus their state stays asCANCELLED
instead ofCANCELLED_AND_NOTIFIED
and any waiters are not awakened.Call
set_running_or_notify_cancel
on the cancelled futures to fix this.