vllm support #1063

kczimm · 2023-10-11T20:42:55Z

Enables support for vLLM. To use it, set the model field in the task parameter of the pgml.transform function and add "backend": "vllm" to the same task parameter. For example:

SELECT * FROM pgml.transform(
    task => '{"model":"tiiuae/falcon-7b","backend":"vllm"}'::JSONB,
    inputs => Array['hello']
);

A list of supported models for vLLM can be found here.

Only one vLLM model can be loaded per client connection process due to a limitation in vLLM. The first call to pgml.transform with a given model will load the model ("cold start"), but subsequent calls will use the cached model. If you change the specified model in the same client connection, the cached model will be replaced with the new one.
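The caching behavior described above (load on first use, reuse on repeat calls, replace when the model name changes) can be illustrated with a single-slot cache. This is a minimal Python sketch of the idea only, not the code in this PR: `SingleModelCache`, `fake_loader`, and the model name `some/other-model` are hypothetical names introduced for illustration, and the loader is a stand-in that does not call vLLM.

```python
from typing import Callable, Optional, Tuple


class SingleModelCache:
    """Holds at most one loaded model, keyed by model name (hypothetical helper)."""

    def __init__(self, loader: Callable[[str], object]) -> None:
        self._loader = loader
        self._slot: Optional[Tuple[str, object]] = None
        self.loads = 0  # counts cold starts, for illustration only

    def get(self, name: str) -> object:
        # Cache hit: the requested model is the one already loaded.
        if self._slot is not None and self._slot[0] == name:
            return self._slot[1]
        # Cache miss: drop the old model first, then load the new one.
        self._slot = None
        model = self._loader(name)
        self.loads += 1
        self._slot = (name, model)
        return model


def fake_loader(name: str) -> object:
    # Stand-in for an expensive model load; assumption, not a vLLM call.
    return {"model": name}


cache = SingleModelCache(fake_loader)
a = cache.get("tiiuae/falcon-7b")   # cold start
b = cache.get("tiiuae/falcon-7b")   # served from the cache
c = cache.get("some/other-model")   # replaces the cached model
```

In this sketch, switching back to the first model after the swap triggers another cold start, which matches the one-model-per-connection limitation described above.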

levkk · 2023-10-19T21:19:34Z

Rebase on master to get #1102, which should fix the tests.

kczimm marked this pull request as ready for review October 19, 2023 20:46

kczimm added 9 commits October 19, 2023 21:25

635476c add vllm binding
9360ef7 add vllm SamplingParams
8be0710 add test showing vllm model support check
b212ee0 refactor into llm module, use PyResult
746953e add vLLM to the transform API
ca7e4ad make bindings vllm::outputs
d017cd6 swap out vLLM model if new
74ce6ae add vllm docs
aca505c add vllm inference docs; fix logic

kczimm force-pushed the kczimm-vllm-support branch from 5e20276 to aca505c Compare October 19, 2023 21:26
