DOC: Update documentation for using natural sort with sort_values
#61979
The previous documentation recommended passing the lambda function `lambda x: np.argsort(index_natsorted(x))` as the `key` argument to `sort_values`. However, while this works when sorting on a single column, it causes incorrect sorting when sorting on multiple columns with duplicated values. In the accompanying example, the `hours` column is sorted correctly, but the `mins` column isn't.

This PR updates the documentation to use `natsort_keygen`, which is robust to sorting on multiple columns.

Commit 2: Removes the calls to `natsort_keygen()` in the example code, as the generated output was too long and doctest didn't seem to like having the tuple formatted.
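
To illustrate why the rank-based key breaks on multiple columns, here is a minimal standard-library sketch of the mechanism. It does not use pandas or natsort: `natural_key` is a hypothetical stand-in for the key produced by `natsort_keygen()`, `rank_key` mimics what `np.argsort(index_natsorted(x))` returns (a unique rank per row), and the sample data is made up for this illustration. It assumes, as in `sort_values`, that the key is applied to each column independently before the columns are compared in order.

```python
import re

def natural_key(s):
    # Hypothetical stand-in for natsort_keygen(): split a string into
    # text and integer chunks so that "2hr" sorts before "10hr".
    return tuple(int(p) if p.isdigit() else p for p in re.split(r"(\d+)", s))

def rank_key(column):
    # Mimics np.argsort(index_natsorted(column)): every row receives a
    # unique rank, so duplicated values no longer compare as equal.
    order = sorted(range(len(column)), key=lambda i: natural_key(column[i]))
    ranks = [0] * len(column)
    for rank, i in enumerate(order):
        ranks[i] = rank
    return ranks

# Made-up sample data with duplicated values in the first column.
hours = ["10hr", "2hr", "10hr", "2hr"]
mins = ["5min", "30min", "20min", "4min"]
rows = range(len(hours))

# Rank-based keys: ties in `hours` are already broken by row position,
# so `mins` is never consulted.
h_rank, m_rank = rank_key(hours), rank_key(mins)
by_rank = sorted(rows, key=lambda i: (h_rank[i], m_rank[i]))
rank_rows = [(hours[i], mins[i]) for i in by_rank]

# Value-based keys: equal `hours` values stay equal, so `mins` breaks ties.
by_value = sorted(rows, key=lambda i: (natural_key(hours[i]), natural_key(mins[i])))
value_rows = [(hours[i], mins[i]) for i in by_value]

print(rank_rows)   # hours ordered, mins not ordered within each hour
print(value_rows)  # both columns ordered
```

With rank-based keys, the two `2hr` rows get ranks 0 and 1 instead of sharing a key, so the second column cannot reorder them. A value-based key such as the one `natsort_keygen` provides keeps duplicates equal, which is what lets the second column act as a tie-breaker.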