bpo-30038: fix race condition in signal delivery + wakeup fd #1082

Merged
merged 1 commit into from
May 16, 2017

njsmith (Contributor) commented Apr 11, 2017

Before, it was possible to get the following sequence of
events (especially on Windows, where the C-level signal handler for
SIGINT is run in a separate thread):

  • SIGINT arrives
  • trip_signal is called
  • trip_signal writes to the wakeup fd
  • the main thread wakes up from select()-or-equivalent
  • the main thread checks for pending signals, but doesn't see any
  • the main thread drains the wakeup fd
  • the main thread goes back to sleep
  • trip_signal sets is_tripped=1 and calls Py_AddPendingCall to notify
    the main thread that it should run the Python-level signal handler
  • the main thread doesn't notice because it's asleep
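
The interleaving above can be replayed deterministically in a small model. This is only a sketch, not CPython's code: `simulate`, `OLD_ORDER` and `NEW_ORDER` are hypothetical names, a socket pair stands in for the wakeup fd, and a dict entry stands in for `is_tripped`. The model lets the main thread run after every handler step, which is the worst-case schedule.

```python
import select
import socket


def simulate(handler_steps):
    """Replay one signal delivery; return True if the main thread saw it."""
    r, w = socket.socketpair()          # stand-in for the wakeup fd
    state = {"is_tripped": False, "handled": False}
    try:
        for step in handler_steps:
            # One step of the (modelled) C-level signal handler.
            if step == "set_flag":
                state["is_tripped"] = True
            elif step == "write_wakeup":
                w.send(b"\0")
            else:
                raise ValueError(f"unknown step: {step!r}")

            # Main thread: it only runs if select() reports the fd readable.
            readable, _, _ = select.select([r], [], [], 0.05)
            if readable:
                if state["is_tripped"]:      # check for pending signals
                    state["is_tripped"] = False
                    state["handled"] = True
                r.recv(4096)                 # drain the wakeup fd
                # ...and go back to sleep until the fd is readable again
        return state["handled"]
    finally:
        r.close()
        w.close()


OLD_ORDER = ("write_wakeup", "set_flag")   # wake first, set the flag second
NEW_ORDER = ("set_flag", "write_wakeup")   # set the flag first, wake second
```

In this model `simulate(OLD_ORDER)` returns `False`, because the flag is set only after the main thread has drained the fd and gone back to sleep, while `simulate(NEW_ORDER)` returns `True`.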

This has been causing repeated failures in the Trio test suite:
python-trio/trio#119

@mention-bot

@njsmith, thanks for your PR! By analyzing the history of the files in this pull request, we identified @rosslagerwall, @taleinat and @tiran to be potential reviewers.

njsmith (Contributor, Author) commented Apr 11, 2017

@Haypo may also be relevant, as AFAICT he was the main author of trip_signal and the wakeup fd functionality

njsmith (Contributor, Author) commented May 3, 2017

3 week ping.

This sounds complicated, but it's pretty straightforward really: if you're using a condition variable, e.g. to implement a queue, you set the new value and then signal the condition to wake up anyone who was waiting. If you do it in the other order, your queue might deadlock. Similarly, here we need to set the variable letting the main thread know that a signal has arrived before we wake it up. The patch is just swapping the order of these two things.
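
A minimal sketch of the queue analogy, using Python's `threading.Condition`. `TinyQueue` is a hypothetical name made up for illustration; the point is only the order inside `put()`: publish the new state, then wake the waiter. Unlike this queue, a signal handler cannot take a lock, so in the patch the ordering itself is what protects the main thread.

```python
import threading
from collections import deque


class TinyQueue:
    """Illustrative single-lock queue built on a condition variable."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            self._items.append(item)   # 1. set the new value
            self._cond.notify()        # 2. then wake up a waiter

    def get(self, timeout=None):
        with self._cond:
            # wait_for re-checks the predicate after every wakeup, so a
            # waiter only proceeds once the state really has changed.
            if not self._cond.wait_for(lambda: len(self._items) > 0,
                                       timeout=timeout):
                raise TimeoutError("no item arrived in time")
            return self._items.popleft()
```

A consumer blocked in `get()` is woken by `put()` only after the item is already in the deque, which mirrors setting `is_tripped` before writing to the wakeup fd.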

Unfortunately, like any race condition, it's difficult to test, but if anyone has a Windows build setup then there's a reproducer in the linked bpo issue. And even if not, the current approach is obviously broken by inspection.

}

/* And then write to the wakeup fd *after* setting all the globals and
doing the Py_AddPendingCall (bpo-30038) */
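
For context, the wakeup fd that this comment refers to is exposed to Python as `signal.set_wakeup_fd`. The sketch below shows the two notifications one signal produces: a byte on the wakeup fd and a call to the Python-level handler. `demo_wakeup_fd` is a hypothetical helper, and the sketch assumes Python 3.8+ (for `signal.raise_signal`) and that it is called from the main thread.

```python
import select
import signal
import socket


def demo_wakeup_fd(signum=signal.SIGINT):
    """Deliver one signal to ourselves and report what the wakeup fd saw.

    Must be called from the main thread: signal.signal() and
    signal.set_wakeup_fd() raise ValueError anywhere else.
    """
    r, w = socket.socketpair()
    w.setblocking(False)                 # the wakeup fd must be non-blocking
    handled = []
    old_handler = signal.signal(signum, lambda s, frame: handled.append(s))
    old_fd = signal.set_wakeup_fd(w.fileno())
    try:
        signal.raise_signal(signum)
        # The C-level handler writes the signal number as a single byte.
        readable, _, _ = select.select([r], [], [], 1.0)
        data = r.recv(16) if readable else b""
    finally:
        # Restore the previous wakeup fd and handler before closing sockets.
        signal.set_wakeup_fd(old_fd)
        signal.signal(signum, old_handler)
        r.close()
        w.close()
    return data, handled
```

An event loop that sleeps in `select()` on the read end relies on both halves arriving in a usable order, which is the property the patch restores.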
Member

I would like to see here the rationale for setting is_tripped before writing into the wakeup fd. You already wrote it in http://bugs.python.org/issue30038 and in the PR description. Just copy it here please ;-)

Contributor Author

Done

@njsmith njsmith force-pushed the bpo-30038-wakeup-fd-race-condition branch from aabe7f1 to e357d5d Compare May 16, 2017 07:22
codecov bot commented May 16, 2017

Codecov Report

Merging #1082 into master will decrease coverage by 1.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1082      +/-   ##
==========================================
- Coverage    83.7%   82.69%   -1.02%     
==========================================
  Files        1371     1432      +61     
  Lines      346665   353018    +6353     
==========================================
+ Hits       290179   291920    +1741     
- Misses      56486    61098    +4612

Impacted Files              Coverage Δ
Lib/threading.py            81.83% <0%> (-0.35%) ⬇️
Lib/test/lock_tests.py      86.66% <0%> (-0.29%) ⬇️
Lib/test/test_random.py     98.48% <0%> (-0.19%) ⬇️
Lib/pydoc.py                61.94% <0%> (-0.07%) ⬇️
Tools/scripts/byext.py      10.09% <0%> (ø)
Tools/scripts/ptags.py      20% <0%> (ø)
Tools/scripts/fixdiv.py     9.25% <0%> (ø)
Tools/scripts/pindent.py    13.76% <0%> (ø)
Tools/scripts/nm2def.py     23.4% <0%> (ø)
Tools/scripts/dutree.py     12.24% <0%> (ø)
... and 56 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1bd7d29...e357d5d.

@vstinner vstinner merged commit 4ae0149 into python:master May 16, 2017
vstinner added a commit that referenced this pull request Jun 10, 2017
…2075)

(cherry picked from commit 4ae0149)