tests/extmod/select_poll_eintr: Skip unreliable test in Github CI. #17745
Summary
extmod/select_poll_eintr.py is a constant source of spurious failures in GitHub CI.
This PR adds it to the list of tests skipped when running on GitHub CI, to help reduce the overall false positive rate and improve the predictive value of a test failure.
Testing
I examined a sample of the last 25 failed GitHub Actions runs, tabulated their causes, and calculated the relevant confusion matrix statistics over the results. They show adequate statistical evidence to support my original anecdotal experience of extmod/select_poll_eintr.py being problematic.
stackless_clang
extmod/select_poll_eintr.py
settrace_stackless
extmod/select_poll_eintr.py
settrace_stackless
extmod/select_poll_eintr.py
stackless_clang
macos
10 other jobs
extmod/select_poll_eintr.py
extmod/select_poll_eintr.py basics/slice_optimse.py
basics/slice_optimse.py
standard_v2
longlong
extmod/select_poll_eintr.py
(many failures)
standard_v2
extmod/select_poll_eintr.py
standard_v2
settrace_stackless
extmod/select_poll_eintr.py
extmod/select_poll_eintr.py
(Note that reproducible was excluded from tabulation as it doesn't run extmod/select_poll_eintr.py.)
20 of these 25 examined runs include extmod/select_poll_eintr.py as a failure, compared to only 6 runs that include any other kind of failure.
As far as I can tell, none of these failures have anything to do with changes made to the select module in the triggering branch, making all but the one run that also included another failure false positives.
Over the same sample period, there were a total of 9 passing unix runs. Under the assumption that all 6 non-extmod/select_poll_eintr.py failed runs are true positives and that all 9 of these passing runs are true negatives, that gives the test suite with extmod/select_poll_eintr.py included a false positive rate of 67.8%, a positive predictive value of only 24%, and an F1 score of 0.387. These values support the conclusion that the rate of spurious failures is excessive, and that the usefulness of the CI failure indicator is diluted as a result.
Considering extmod/select_poll_eintr.py individually, this test has a per-job false positive rate of 5.5% and a per-run FPR of 60.6%. This supports the conclusion that the weak predictive value of the test suite is largely attributable to this test.
Overall, the sample I examined supports the conclusion that extmod/select_poll_eintr.py is problematic and should be excluded from GitHub CI runs going forward.
Statistics code, for anyone who cares to check my math:
Output:
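As a rough cross-check of the suite-level figures quoted above, the following is a minimal sketch that recomputes them from the stated counts alone. It is not the original statistics script: the function name `confusion_stats` is hypothetical, and the counts (6 true positives, 19 false positives, 9 true negatives, 0 false negatives) are read off the text under its stated assumptions. The zero false negative count in particular is implied by treating every passing run as a true negative rather than stated outright.

```python
def confusion_stats(tp, fp, tn, fn):
    """Return basic confusion-matrix statistics as a dict of floats."""
    fpr = fp / (fp + tn)          # false positive rate
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    recall = tp / (tp + fn)       # true positive rate
    f1 = 2 * ppv * recall / (ppv + recall)
    return {"fpr": fpr, "ppv": ppv, "recall": recall, "f1": f1}


# Counts taken from the description (assumed, per-run):
#   25 failed runs, of which 6 include a failure other than
#   extmod/select_poll_eintr.py (treated as true positives) and the
#   remaining 19 fail only on that test (treated as false positives);
#   9 passing runs (treated as true negatives); no false negatives.
stats = confusion_stats(tp=6, fp=19, tn=9, fn=0)

for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

With these counts the false positive rate is 19/28 (about 0.679), the positive predictive value is 6/25 = 0.24, and the F1 score is about 0.387, in line with the figures quoted above. The per-job and per-run figures for the individual test depend on job counts not given in the text, so they are not recomputed here.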