Closed
Labels: configuration (related to configuration of llama-swap), enhancement (New feature or request)
Description
Would it be possible to add an option to never unload a model?
Here's my use case:
- I use multiple different LLMs, for different purposes or even just for experimentation.
- But the embeddings and reranking models are always the same.
- Also, due to my coding assistant tool, the embeddings model runs quite often.
So I'd like to be able to:
- Have the embeddings and reranking models always loaded and ready to go, and
- Never unload the "main LLM" I'm using when I need to use the embeddings and reranking models
I guess I could achieve this with profiles, but that would require creating a new profile every time I add a new LLM (and I do that a lot).
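To make the request concrete, here is a rough sketch of the behavior described above. It is a toy model only, not llama-swap code: `ModelPool`, `pinned`, `request`, and the model names are all hypothetical, and it assumes a single swappable "main" LLM slot next to a fixed set of always-loaded models.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelPool:
    """Toy swap manager; every name here is hypothetical, not llama-swap's API."""

    pinned: set[str]                                 # models that must never be unloaded
    loaded: set[str] = field(default_factory=set)    # everything currently in memory
    main: Optional[str] = None                       # the one swappable "main" LLM

    def __post_init__(self) -> None:
        # Pinned models are loaded up front so they are always ready to go.
        self.loaded |= self.pinned

    def request(self, model: str) -> None:
        if model in self.pinned:
            # Already resident; using it must not evict the main LLM.
            return
        if self.main is not None and self.main != model:
            # Only a request for a different main LLM evicts the current one.
            self.loaded.discard(self.main)
        self.main = model
        self.loaded.add(model)


pool = ModelPool(pinned={"embeddings", "reranker"})
for name in ["llm-a", "embeddings", "reranker", "llm-b"]:
    pool.request(name)
print(sorted(pool.loaded))  # ['embeddings', 'llm-b', 'reranker']
```

The point is that adding a new LLM would need no extra configuration: anything not in the pinned set simply competes for the main slot.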