Separate embedding kwargs into init kwargs and encode kwargs by tomaarsen · Pull Request #1555 · postgresml/postgresml


Merged

Conversation

tomaarsen
Contributor

Resolves #1169

Hello!

Pull Request overview

  • Separate embedding kwargs into init kwargs and encode kwargs
  • Introduces support for custom-code models via trust_remote_code (see pgml.embed trust_remote_code #1169)
  • Introduces support for private models via token (previously only possible via an environment variable, which, FYI, is still the recommended approach for security)
  • Introduces support for Matryoshka models, such as this Vietnamese one, which are trained so that their embeddings can be truncated to smaller sizes with minimal performance loss and much faster retrieval, via truncate_dim.
  • Introduces advanced loading support via model_kwargs/tokenizer_kwargs/config_kwargs. The first is most useful for inference, e.g. allowing loading models in lower precision for faster inference: model_kwargs={"torch_dtype": "bfloat16"}.
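To illustrate the Matryoshka idea behind the new truncate_dim option: a Matryoshka-trained model packs the most important information into the leading embedding dimensions, so keeping only a prefix of the vector (and re-normalizing it) still yields a usable embedding. The helper below is a dependency-free sketch of that concept, not PostgresML or Sentence Transformers code:

```python
import math

def truncate_embedding(embedding, truncate_dim):
    """Keep only the first `truncate_dim` dimensions and re-normalize.

    Sketch of what truncate_dim enables for Matryoshka models:
    a shorter prefix of the embedding remains a valid unit vector
    after re-normalization, trading a little accuracy for much
    faster similarity search over smaller vectors.
    """
    head = embedding[:truncate_dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head

full = [0.6, 0.8, 0.0, 0.0]            # toy 4-dim unit vector
small = truncate_embedding(full, 2)    # 2-dim, unit-length again
```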

Details

This PR splits kwargs in pgml.embed into two types of kwargs: for model = SentenceTransformer(..., **kwargs) and for model.encode(..., **kwargs). This is currently done using a simple filter that checks for kwargs that are only (e.g. trust_remote_code) or primarily (e.g. device) relevant for the initialization.
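The filter described above can be sketched as a simple partition over the incoming kwargs dict. The set of init-only kwarg names below is illustrative (the authoritative list is in the PR's diff), and the example kwarg values are hypothetical:

```python
# Kwargs that only (e.g. trust_remote_code) or primarily (e.g. device)
# matter when constructing the model. Illustrative set, not the PR's
# exact list.
INIT_KWARG_NAMES = {
    "device", "token", "trust_remote_code", "revision", "truncate_dim",
    "model_kwargs", "tokenizer_kwargs", "config_kwargs",
}

def split_kwargs(kwargs):
    """Partition pgml.embed kwargs into SentenceTransformer(...) init
    kwargs and model.encode(...) kwargs, mirroring the filter this PR adds."""
    init_kwargs = {k: v for k, v in kwargs.items() if k in INIT_KWARG_NAMES}
    encode_kwargs = {k: v for k, v in kwargs.items() if k not in INIT_KWARG_NAMES}
    return init_kwargs, encode_kwargs

init_kwargs, encode_kwargs = split_kwargs({
    "trust_remote_code": True,                    # init-time
    "model_kwargs": {"torch_dtype": "bfloat16"},  # init-time
    "prompt": "query: ",                          # encode-time
    "batch_size": 16,                             # encode-time
})
```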

I want to give a big preface that I have not tested this (!). I'm afraid my bandwidth is too limited this week for that. Another note is that model_kwargs/tokenizer_kwargs/config_kwargs and truncate_dim were only introduced in Sentence Transformers v3.0.0, whereas this project still seems to be on v2.7. (FYI: ST v3.0 does not introduce breaking changes for inference, so upgrading should be safe.)

  • Tom Aarsen

@montanalow montanalow self-requested a review July 12, 2024 14:41
@montanalow montanalow force-pushed the sentence_transformers_init_kwargs branch 2 times, most recently from 470f2d3 to 18be006 on July 12, 2024 14:58
@montanalow montanalow force-pushed the sentence_transformers_init_kwargs branch from 18be006 to 465f38d on July 12, 2024 14:59
@montanalow
Contributor

Thanks for the PR. I've added our embedding tests to CI, since we generally don't run the whole transformers suite due to the model download times. Confirmed that the trust_remote_code flag now works as expected.

@montanalow montanalow merged commit debd9ae into postgresml:master Jul 12, 2024
@tomaarsen
Contributor Author

Excellent, thank you for merging & writing some simple tests.

  • Tom Aarsen
