Added support for searching large amount of indices #412

Open
wants to merge 7 commits into
base: main

Conversation

@simonvb00 simonvb00 commented Jul 16, 2025

Description:
When the catalog is searched through the /search endpoint, a GET /<indices>/_search request is sent with all indices listed in the URL path. When such a search covers a large number of indices, however, the request line can exceed Elasticsearch’s maximum allowed HTTP line length (4096 bytes), resulting in the following error:

{"code":"RequestError","description":"RequestError(400, 'too_long_http_line_exception', 'An HTTP line is larger than 4096 bytes.')"}

The solution in this commit moves the indices from the URL path to the body of the request once the number of indices passes a certain threshold. The indices in the path are then replaced by ITEM_INDICES. Since the query still filters on the correct indices, this change preserves the behavior while avoiding the URL length limitation.
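
A rough sketch of the idea described above, not the code from this PR: when the comma-joined index list would make the request line too long, search against a wildcard pattern and restrict the results with a `terms` filter on the `_index` metadata field in the body. The function name `build_search_target`, the value given to `ITEM_INDICES`, and the byte threshold are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Tuple

# Assumption: a wildcard pattern that matches every item index.
ITEM_INDICES = "items_*"
# Assumption: a threshold kept safely below the 4096-byte HTTP line limit.
MAX_INDEX_PATH_LENGTH = 4000


def build_search_target(
    indices: List[str], query: Dict[str, Any]
) -> Tuple[str, Dict[str, Any]]:
    """Return the index string for the URL path and the request body."""
    index_param = ",".join(indices)

    # Short enough: keep the indices in the URL path, body is unchanged.
    if len(index_param.encode("utf-8")) <= MAX_INDEX_PATH_LENGTH:
        return index_param, {"query": query}

    # Too long: search the wildcard pattern and filter on the index
    # names inside the body, so the same indices are still targeted.
    body = {
        "query": {
            "bool": {
                "must": [query],
                "filter": [{"terms": {"_index": indices}}],
            }
        }
    }
    return ITEM_INDICES, body
```

The returned pair could then be passed to a search call as the index argument and the request body. Filtering in a `bool` `filter` clause keeps the original query's scoring untouched.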

PR Checklist:

  • Code is formatted and linted (run pre-commit run --all-files)
  • Tests pass (run make test)
  • Documentation has been updated to reflect changes, if applicable
  • Changes are added to the changelog

…ices from the request url to the body of the request when size is larger than 4096 bytes.
@simonvb00 simonvb00 changed the title from "Added support for searching large amount of indices by moving the ind…" to "Added support for searching large amount of indices" on Jul 16, 2025
StijnCaerts previously approved these changes Jul 16, 2025
Collaborator

@StijnCaerts StijnCaerts left a comment

Nice work, I think this will be a useful fix for large catalogs, especially now that people are looking for ways to limit searches to certain collections based on authorizations (#409).

StijnCaerts previously approved these changes Jul 17, 2025
…ort for large amount of queries to ElasticSearch database logic.
@simonvb00 simonvb00 requested a review from jonhealy1 July 18, 2025 11:40
Collaborator

@jonhealy1 jonhealy1 left a comment

Nice work!
