Worse speed and GPU load than pure llama-cpp #1831

Answered by Mushoz
Mushoz asked this question in Q&A
I managed to find the answer myself. For some reason the logits_all parameter defaults to true, and that tanks performance. Setting it to false brings the speed on par with pure llama-cpp, and GPU load is back to 100% as well. I'm not sure that's a sensible default, but at least the problem is solved.
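For anyone hitting the same slowdown, here is a minimal sketch of pinning the setting explicitly instead of relying on the default. It assumes the wrapper in question is llama-cpp-python and that the model is loaded through its Llama class; the thread does not say which entry point was used. The helper names build_llama_kwargs and load_model and the model path are hypothetical.

```python
from typing import Any, Dict


def build_llama_kwargs(model_path: str, n_gpu_layers: int = -1) -> Dict[str, Any]:
    """Collect constructor arguments with logits_all pinned to False.

    Hypothetical helper: the thread only says that setting logits_all to
    false restored speed and GPU load, not how the model was loaded.
    """
    return {
        "model_path": model_path,
        # -1 asks llama-cpp-python to offload all layers to the GPU.
        "n_gpu_layers": n_gpu_layers,
        # Set explicitly so the result does not depend on whatever the default is.
        "logits_all": False,
    }


def load_model(model_path: str):
    """Load a model with the kwargs above (assumes llama-cpp-python is installed)."""
    # Imported lazily so build_llama_kwargs works without the package installed.
    from llama_cpp import Llama

    return Llama(**build_llama_kwargs(model_path))
```

Calling load_model("path/to/model.gguf") would then create the model with logits_all disabled. If you launch through a different entry point (a server or another wrapper), look for the equivalent logits_all setting there.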

Answer selected by Mushoz