py/qstr: Add support for sorted qstr pools. #12678
Merged
4 commits
1a01751 tests/perf_bench: Add string/qstr/map tests. (jimmo)
78f4f30 tests/extmod/asyncio_as_uasyncio.py: Fix qstr order dependency. (jimmo)
e910533 bare-arm/lib: Add minimal strncmp implementation. (jimmo)
64c79a5 py/qstr: Add support for sorted qstr pools. (jimmo)
Maybe a more extreme version of the previous idea, but what about adding a constant lookup table (equivalent of `mp_binary_op_method_name`) for these? That would add 157 bytes of code size, but also allow this whole qstr pool to be sorted. In the next .mpy version bump the lookup table could be removed to get the code size back.

(This would also potentially allow deleting the `!sorted` case from the search lookup code as all pools would be sorted, wouldn't it?)

I like this idea! It might be easier just to go with three sorted pools (because this list is almost sorted) as described above.
Unfortunately, any runtime qstrs are not sorted (because the pools are created incrementally), so the `!sorted` case needs to stay.
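
To make the sorted versus `!sorted` distinction concrete, here is a minimal sketch of a lookup that binary-searches a sorted pool and falls back to a linear scan otherwise. The `demo_pool_t` type and `demo_pool_find` function are hypothetical simplifications for illustration, not the actual `py/qstr.c` structures, and the sketch compares plain NUL-terminated strings with `strcmp`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical, simplified pool for illustration only.
typedef struct {
    const char *const *strs; // entries; in ascending strcmp order when `sorted` is true
    size_t len;              // number of entries
    bool sorted;             // true for pools sorted ahead of time
} demo_pool_t;

// Returns the index of `s` within the pool, or -1 if it is absent.
static long demo_pool_find(const demo_pool_t *pool, const char *s) {
    if (pool->sorted) {
        // Binary search over the half-open range [lo, hi).
        size_t lo = 0, hi = pool->len;
        while (lo < hi) {
            size_t mid = lo + (hi - lo) / 2;
            int cmp = strcmp(s, pool->strs[mid]);
            if (cmp == 0) {
                return (long)mid;
            }
            if (cmp < 0) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return -1;
    }
    // The !sorted case: entries were appended incrementally, so scan them all.
    for (size_t i = 0; i < pool->len; i++) {
        if (strcmp(s, pool->strs[i]) == 0) {
            return (long)i;
        }
    }
    return -1;
}
```

A miss in the sorted branch costs O(log n) comparisons, while a miss in the unsorted branch has to visit every entry.
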
I'm guessing it's not viable to insertion sort the runtime pool? EDIT: Of course it's not, the addresses need to be stable. Oh well.
I'll stop proposing ever more complex changes now, I promise. 😆
@projectgus
The main problem is that we return the qstr id at insertion time. This id is the index across all the pools (i.e. the first pool contains qstr 0 -> N, the second contains N+1 -> M, etc).
So sorting would require some way to know the final index in the pool, even though the sort ordering might change with further insertions.
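
The id scheme described above can be sketched roughly as follows. The `chain_pool_t` type and both helper functions are hypothetical simplifications (the real pool structure carries more fields), and the linear scan inside each pool is only there to keep the example short.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical, simplified chained pool for illustration only.
typedef struct chain_pool {
    const struct chain_pool *prev; // older pool, or NULL for the first pool
    size_t total_prev_len;         // number of entries in all older pools combined
    size_t len;                    // number of entries in this pool
    const char *const *strs;
} chain_pool_t;

// Map a global id to its string by walking back to the pool that owns it.
static const char *chain_id_to_str(const chain_pool_t *last, size_t id) {
    for (const chain_pool_t *p = last; p != NULL; p = p->prev) {
        if (id >= p->total_prev_len) {
            size_t local = id - p->total_prev_len;
            return local < p->len ? p->strs[local] : NULL;
        }
    }
    return NULL;
}

// Map a string to its global id (pool offset plus local index), or -1 if absent.
static long chain_str_to_id(const chain_pool_t *last, const char *s) {
    for (const chain_pool_t *p = last; p != NULL; p = p->prev) {
        for (size_t i = 0; i < p->len; i++) {
            if (strcmp(s, p->strs[i]) == 0) {
                return (long)(p->total_prev_len + i);
            }
        }
    }
    return -1;
}
```

Because the returned id encodes the entry's position, moving an entry within its pool after insertion would change the id that callers already hold.
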
There are easy ways to solve this with extra indirection of course, but that would be an extra `2*sizeof(qstr)` per entry in the runtime tables. Maybe this is worth considering?

Also other ideas around some sort of hash-table-like approach. The problem with hashtables though is that the main expensive operation we perform on the pool is set-existence (i.e. we have this new string from somewhere, and we're trying to decide whether it's an existing qstr or not). And worse, the expected outcome for any given pool is negative, which is pathological for high-load-factor hashtables.
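
For reference, one possible shape of the extra indirection is sketched below: entries are appended so their ids stay stable, and a separate array of ids is kept in sorted order for binary search. Everything here (`idx_pool_t`, `idx_pool_search`, `idx_pool_intern`, the fixed capacity) is hypothetical, and this particular layout uses a single index per entry, so it does not reproduce the exact memory cost mentioned above.

```c
#include <stddef.h>
#include <string.h>

#define IDX_POOL_MAX 16 // arbitrary capacity for this sketch

// Hypothetical runtime pool with stable ids plus a sorted index.
typedef struct {
    const char *strs[IDX_POOL_MAX];     // append-only: strs[id] never moves
    unsigned short order[IDX_POOL_MAX]; // ids, ordered by strcmp of their strings
    size_t len;
} idx_pool_t;

// Binary search through `order`. Returns the position of `s`, or the position
// where it would be inserted; `*found` reports which of the two it is.
static size_t idx_pool_search(const idx_pool_t *p, const char *s, int *found) {
    size_t lo = 0, hi = p->len;
    *found = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(s, p->strs[p->order[mid]]);
        if (cmp == 0) {
            *found = 1;
            return mid;
        }
        if (cmp < 0) {
            hi = mid;
        } else {
            lo = mid + 1;
        }
    }
    return lo;
}

// Returns the existing id of `s`, or appends it and returns the new id.
// Returns -1 when the pool is full. The caller must keep `s` alive.
static long idx_pool_intern(idx_pool_t *p, const char *s) {
    int found;
    size_t pos = idx_pool_search(p, s, &found);
    if (found) {
        return (long)p->order[pos];
    }
    if (p->len == IDX_POOL_MAX) {
        return -1;
    }
    size_t id = p->len;
    p->strs[id] = s;
    // Shift the tail of the sorted index up by one slot to make room at `pos`.
    memmove(&p->order[pos + 1], &p->order[pos], (p->len - pos) * sizeof(p->order[0]));
    p->order[pos] = (unsigned short)id;
    p->len++;
    return (long)id;
}
```

The trade-off in this sketch is that lookups, including the common negative case, take O(log n) comparisons, while each insertion pays an O(n) `memmove` on the index and every entry costs one extra index slot.
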