
Commit b5a9ef4

fix: Do not send an empty 'tools' list to remote vllm (meta-llama#1957)
Fixes: meta-llama#1955

Since 0.2.0, vLLM receives an empty list (vs ``None`` in 0.1.9 and before) when no tools are configured, which causes the issue described in meta-llama#1955. This patch avoids sending the 'tools' param to vLLM altogether instead of sending an empty list. It also adds a small unit test to avoid regressions.

The OpenAI [specification](https://platform.openai.com/docs/api-reference/chat/create) does not explicitly state that the list cannot be empty, but I found this out through experimentation and it might depend on the actual remote vLLM. In any case, since this parameter is Optional, it is best to skip it altogether when no tools are configured.

Signed-off-by: Daniel Alvarez <dalvarez@redhat.com>
1 parent fb8ff77 commit b5a9ef4
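For context, here is a minimal sketch (not the project's code; build_chat_kwargs is a hypothetical helper) of the pattern the fix applies: include the Optional 'tools' parameter in an OpenAI-compatible request only when at least one tool is configured, since a remote vLLM may reject an empty list.

from typing import Any, Optional


def build_chat_kwargs(
    model: str,
    messages: list[dict[str, Any]],
    tools: Optional[list[dict[str, Any]]] = None,
) -> dict[str, Any]:
    """Build request kwargs, omitting 'tools' when none are configured."""
    kwargs: dict[str, Any] = {"model": model, "messages": messages}
    # An empty list is falsy, so both None and [] skip this branch,
    # mirroring the change from `tools is not None` to a truthiness check.
    if tools:
        kwargs["tools"] = tools
    return kwargs


# 'tools' is omitted entirely when the list is empty.
assert "tools" not in build_chat_kwargs("test_model", [{"role": "user", "content": "hi"}], tools=[])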

File tree

2 files changed: +19 −2 lines changed

llama_stack/providers/remote/inference/vllm/vllm.py

Lines changed: 2 additions & 1 deletion
@@ -374,7 +374,8 @@ async def _get_params(self, request: Union[ChatCompletionRequest, CompletionRequ
             options["max_tokens"] = self.config.max_tokens
 
         input_dict: dict[str, Any] = {}
-        if isinstance(request, ChatCompletionRequest) and request.tools is not None:
+        # Only include the 'tools' param if there is any. It can break things if an empty list is sent to the vLLM.
+        if isinstance(request, ChatCompletionRequest) and request.tools:
             input_dict = {"tools": _convert_to_vllm_tools_in_request(request.tools)}
 
         if isinstance(request, ChatCompletionRequest):

tests/unit/providers/inference/test_remote_vllm.py

Lines changed: 17 additions & 1 deletion
@@ -26,7 +26,12 @@
 )
 from openai.types.model import Model as OpenAIModel
 
-from llama_stack.apis.inference import ToolChoice, ToolConfig
+from llama_stack.apis.inference import (
+    ChatCompletionRequest,
+    ToolChoice,
+    ToolConfig,
+    UserMessage,
+)
 from llama_stack.apis.models import Model
 from llama_stack.models.llama.datatypes import StopReason
 from llama_stack.providers.remote.inference.vllm.config import VLLMInferenceAdapterConfig
@@ -232,3 +237,14 @@ async def do_chat_completion():
     # above.
     asyncio_warnings = [record.message for record in caplog.records if record.name == "asyncio"]
     assert not asyncio_warnings
+
+
+@pytest.mark.asyncio
+async def test_get_params_empty_tools(vllm_inference_adapter):
+    request = ChatCompletionRequest(
+        tools=[],
+        model="test_model",
+        messages=[UserMessage(content="test")],
+    )
+    params = await vllm_inference_adapter._get_params(request)
+    assert "tools" not in params

0 commit comments