Fix issue #2723: Improve list search functionality by isabellabonilla · Pull Request #10460 · internetarchive/openlibrary · GitHub

Fix issue #2723: Improve list search functionality #10460

Open

isabellabonilla wants to merge 2 commits into internetarchive:master from isabellabonilla:FixListSearch

isabellabonilla commented Feb 17, 2025

Closes #2723

Fix: Improves list search functionality to allow partial matches in list titles.

Technical

Added a new ListSearchScheme to leverage Solr for list searches.
Updated /search/lists controller to use the new Solr-backed search.
Refactored the template (lists.html) to align with patterns used in other templates like authors.html and subjects.html to display improved search results.
Added pagination support to /search/lists, allowing users to navigate through large sets of search results.
Implemented error handling for list searches, displaying errors when they occur.
Changes were made in:
- openlibrary/plugins/worksearch/code.py
- openlibrary/plugins/worksearch/schemes/lists.py (new file)
- openlibrary/templates/search/lists.html
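
For readers unfamiliar with how page-based navigation usually maps onto a Solr-style `offset`/`limit` pair, here is a minimal sketch. The helper names `page_to_offset` and `total_pages` and the default page size of 20 are assumptions for illustration only; they are not taken from this PR.

```python
def page_to_offset(page: int, limit: int = 20) -> int:
    """Convert a 1-based page number into a 0-based row offset."""
    # Treat zero or negative page numbers as the first page.
    page = max(int(page), 1)
    return (page - 1) * limit


def total_pages(num_found: int, limit: int = 20) -> int:
    """Number of pages needed to show num_found results (at least 1)."""
    # Ceiling division without importing math.
    return max(-(-num_found // limit), 1)
```

With a limit of 20, page 3 starts at offset 40, and 41 results need three pages.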

Testing

Navigate to /search/lists.
Search for a term that is part of a list title (e.g., "banned").
Verify that all lists containing the search term in their titles are displayed, not just exact matches.

Stakeholders

@cdrini
@seabelis
@el4ctr0n


          Fix issue internetarchive#2723: Improve list search functionality

4b74a06

github-actions bot assigned cdrini

github-actions bot added the Priority: 2 label


          [pre-commit.ci] auto fixes from pre-commit.com hooks

7312d57

for more information, see https://pre-commit.ci

cdrini requested changes

Collaborator

cdrini left a comment

Awesome, great work @isabellabonilla! This was a trickier issue, nice work getting all the different pieces implemented and playing together -- and as a first-time contributor! 😊

A few pieces of feedback to fix some issues with wrong field names, and to allow for backwards compatibility of the public search API 👍

openlibrary/plugins/worksearch/schemes/lists.py

+                      'key',  # unique identifier for the list
+                      'name',  # name/title of the list
+                      'description',  # short description of the list
+                      'created',  # timestamp when the list was created

Collaborator

cdrini Feb 18, 2025

This field unfortunately doesn't exist in our solr, so we can't use it.

Suggested change

-                      'created',  # timestamp when the list was created

openlibrary/plugins/worksearch/schemes/lists.py

+                  all_fields = {
+                      'key',  # unique identifier for the list
+                      'name',  # name/title of the list
+                      'description',  # short description of the list

Collaborator

cdrini Feb 18, 2025

Ditto with this one, but this does exist in our db, so you can add it to the non_solr_fields so that it can be fetched for reading.
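
To illustrate the distinction drawn here between fields Solr can serve and fields that have to be read from the database, the following is a minimal sketch. The set contents and the `split_fields` helper are hypothetical; how `non_solr_fields` is actually consumed in Open Library is not shown in this thread.

```python
# Hypothetical field sets; the real ones live in the PR's ListSearchScheme.
SOLR_FIELDS = frozenset({"key", "name"})
NON_SOLR_FIELDS = frozenset({"description"})


def split_fields(requested):
    """Split requested field names into (solr, non_solr, unknown) lists."""
    solr, non_solr, unknown = [], [], []
    for field in requested:
        if field in SOLR_FIELDS:
            solr.append(field)  # can be searched and returned by Solr
        elif field in NON_SOLR_FIELDS:
            non_solr.append(field)  # must be fetched from the database
        else:
            unknown.append(field)  # e.g. 'created', which Solr does not have
    return solr, non_solr, unknown
```

Under these assumptions, `split_fields(["key", "description", "created"])` puts `key` in the Solr group, `description` in the non-Solr group, and flags `created` as unknown.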

openlibrary/plugins/worksearch/schemes/lists.py

Comment on lines +26 to +27

		'created desc': 'created desc', # sort by newest lists first (default)
		'created asc': 'created asc', # sort by oldest lists first

Collaborator

cdrini Feb 18, 2025

Ditto here, these fields don't exist in our solr.

Suggested change

-                  'created desc': 'created desc',  # sort by newest lists first (default)
-                  'created asc': 'created asc',  # sort by oldest lists first
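
One way to keep unsupported sort keys such as `created desc` from reaching Solr is to resolve the requested sort against an explicit allow-list. The sketch below assumes a hypothetical `ALLOWED_SORTS` mapping and `resolve_sort` helper; the `name` sort options are placeholders and are not confirmed to exist in Open Library's Solr.

```python
# Hypothetical allow-list: public sort name -> Solr sort clause.
# 'created desc' / 'created asc' are intentionally absent, per the review.
ALLOWED_SORTS = {
    "name asc": "name asc",
    "name desc": "name desc",
}


def resolve_sort(sort):
    """Return the Solr sort clause for `sort`, or None for the default order."""
    if not sort:
        return None  # no explicit sort requested
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"Unsupported sort: {sort!r}")
    return ALLOWED_SORTS[sort]
```

Failing loudly on an unknown key makes a wrong field name visible during testing instead of surfacing later as a Solr error.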

openlibrary/plugins/worksearch/schemes/lists.py

+                      'name',  # name/title of the list
+                      'description',  # short description of the list
+                      'created',  # timestamp when the list was created
+                  }

Collaborator

cdrini Feb 19, 2025

We do have a few other keys that will be available for search:

Suggested change

-                  }
+                      "subject",
+                      "subject_key",
+                      "person",
+                      "person_key",
+                      "place",
+                      "place_key",
+                      "time",
+                      "time_key",
+                  }

openlibrary/plugins/worksearch/schemes/lists.py

Comment on lines +39 to +40

		'description',
		'created',

Collaborator

cdrini Feb 19, 2025

Suggested change

                    'description',
                    'created',

openlibrary/plugins/worksearch/code.py

Comment on lines +643 to +644

		fields=fields,
		sort="created_desc", # default sorting with most recently created lists first

Collaborator

cdrini Feb 19, 2025

Suggested change

                        fields=fields,
                        sort="created_desc",  # default sorting with most recently created lists first

openlibrary/plugins/worksearch/code.py

                       web.header('Content-Type', 'application/json')
-                      return delegate.RawText(json.dumps(response))
+                      return delegate.RawText(json.dumps(raw_resp))

Collaborator

cdrini Feb 19, 2025

Ah this is going to be a breaking change to our public lists search API... we'll likely have to have both for a little while. To do this, let's add this code:

Suggested change

-                      return delegate.RawText(json.dumps(raw_resp))
+                      if i.api == 'next':
+                          return delegate.RawText(json.dumps(raw_resp))
+                      else:
+                          # Default to the old API shape for a while, then we'll flip
+                          return delegate.RawText(json.dumps({
+                              'start': offset,
+                              'docs': [
+                                  lst.preview()
+                                  for lst in web.ctx.site.get_many(doc['key'] for doc in raw_resp['docs'])
+                              ]
+                          }))

This will let us use the better solr-based search, while maintaining the same shape of response.
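
The branching suggested above can be exercised outside the web framework with a small stand-alone sketch. The function `render_list_search` and its `load_previews` callback are assumptions made for this example; `load_previews` stands in for the site lookup plus `preview()` call in the suggestion.

```python
import json


def render_list_search(raw_resp, api, offset, load_previews):
    """Return the JSON body for a list search.

    raw_resp: dict with a 'docs' list, each doc carrying a 'key'.
    api: value of the `api` query parameter ('next' opts in to the new shape).
    load_previews: callable mapping a list of keys to preview dicts
        (hypothetical stand-in for the database lookup).
    """
    if api == "next":
        # New shape: return the Solr-backed response unchanged.
        return json.dumps(raw_resp)
    # Old shape: only 'start' and previews built from the matched keys.
    keys = [doc["key"] for doc in raw_resp["docs"]]
    return json.dumps({"start": offset, "docs": load_previews(keys)})
```

Passing the lookup in as a callable keeps the old-shape branch testable with a fake, so both response shapes can be compared without a running site.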

openlibrary/plugins/worksearch/code.py

-                              "limit": int(limit),
-                              "offset": int(offset),
-                          }
+                  def get_results(self, q, offset=0, limit=100, fields=None, sort=''):

Collaborator

cdrini Feb 19, 2025

We want to avoid defaulting to * now since that results in large responses; I reckon you got this from the authors page, which we have yet to fix 😁 Defaulting it to None should do the trick 👍 The downstream code should then just use the default fields from the Scheme.

Suggested change

-                  def get_results(self, q, offset=0, limit=100, fields='*', sort=''):
+                  def get_results(self, q, offset=0, limit=100, fields=None, sort=''):

openlibrary/plugins/worksearch/code.py


-                      docs = self.get_results(i.q, offset=offset, limit=limit)
+                      response = self.get_results(i.q, offset=offset, limit=limit)

Collaborator

cdrini Feb 19, 2025

Let's also add the sort=i.sort and fields=i.fields.split(',') if i.fields else None so we can test those are working correctly.
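
The `fields` expression in the comment above can be isolated into a tiny helper to make its behavior explicit. The name `parse_fields` is an assumption for this sketch; the logic mirrors `i.fields.split(',') if i.fields else None`.

```python
def parse_fields(raw):
    """Turn a comma-separated `fields` query parameter into a list.

    Returns None when the parameter is missing or empty, so the caller
    can fall back to the scheme's default fields.
    """
    return raw.split(",") if raw else None
```

Returning `None` rather than an empty list matters here: an empty list would request no fields at all, while `None` signals "use the defaults".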

cdrini added the Needs: Submitter Input label

Labels

Needs: Submitter Input Priority: 2