Added pgml.rank docs #1514

Merged
merged 4 commits into from
Jun 12, 2024
40 changes: 40 additions & 0 deletions pgml-cms/docs/api/sql-extension/pgml.rank.md
@@ -0,0 +1,40 @@
---
description: Rank documents against a piece of text using the specified ranking model.
---

# pgml.rank()

The `pgml.rank()` function computes a relevance score between each document and a piece of text. It is primarily used as the last step of a search system, where the results returned by the initial search are re-ranked by relevance before being used.

## API

```postgresql
pgml.rank(
    transformer TEXT,  -- transformer name
    query TEXT,        -- text to rank against
    documents TEXT[],  -- documents to rank
    kwargs JSON        -- optional arguments (see below)
)
```

## Example

Contributor Author

How does this look: f9c5ff2

Ranking documents is as simple as calling the function with the documents you want to rank and the text you want to rank them against:

```postgresql
SELECT pgml.rank('mixedbread-ai/mxbai-rerank-base-v1', 'test', ARRAY['doc1', 'doc2']);
```

By default, `pgml.rank()` ranks and returns all of the documents. To return only the relevance score and index of the top k documents, set `return_documents` to `false` and `top_k` to the number of documents you want returned.

```postgresql
SELECT pgml.rank('mixedbread-ai/mxbai-rerank-base-v1', 'test', ARRAY['doc1', 'doc2'], '{"return_documents": false, "top_k": 10}'::JSONB);
```
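
As a rough sketch of the re-ranking step described above, the candidate documents can be collected from a table into a `TEXT[]` and passed to `pgml.rank()`. The `candidate_docs` table, its rows, and the query text are hypothetical and only stand in for the output of an initial search; the function call itself uses only the arguments documented here.

```postgresql
-- Hypothetical table holding the results of an initial search (illustration only).
CREATE TEMP TABLE candidate_docs (id BIGINT, body TEXT);

INSERT INTO candidate_docs (id, body) VALUES
    (1, 'Reset your password from the account settings page.'),
    (2, 'Our office is closed on public holidays.'),
    (3, 'Password requirements: at least 12 characters.');

-- Aggregate the candidates into a TEXT[] and keep only the top 2 by relevance.
SELECT pgml.rank(
    'mixedbread-ai/mxbai-rerank-base-v1',
    'How do I reset my password?',
    (SELECT array_agg(body ORDER BY id) FROM candidate_docs),
    '{"return_documents": false, "top_k": 2}'::JSONB
);
```

Ordering the aggregate by `id` keeps the array positions stable, which should make it easier to map a returned index back to a row when `return_documents` is `false`.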

## Supported ranking models

We currently support cross-encoders for re-ranking. Check out the [Sentence Transformers documentation](https://sbert.net/examples/applications/cross-encoder/README.html) for more information on how cross-encoders work.

By default we provide the following ranking models:

* `mixedbread-ai/mxbai-rerank-base-v1`

Can I use other models from Hugging Face? There's no mention of that here.

Contributor Author

Yes, you can. If you are using a serverless hosted database, you can request that model specifically. Is there a model you are interested in using that is not listed here?

I just want to make sure that I can use any model, because embedding and reranking models get updated regularly.
I currently pick:

Contributor Author

Those should work. If you are using serverless and want to use them, you will need to request access to them. Let me know if you have any problems and I'm happy to take a look!
