[Feature Request] Add configuration to never unload a model #99

@Vaskivo

Would it be possible to add an option to never unload a model?

Here's my use case:

  • I use several different LLMs, for different purposes or just for experimentation.
  • The embeddings and reranking models, however, are always the same.
  • My coding assistant tool also calls the embeddings model quite often.

So I'd like to be able to:

  • Keep the embeddings and reranking models always loaded and ready to go, and
  • Never unload the "main LLM" I'm using when a request comes in for the embeddings or reranking models.

I guess I could achieve this with profiles, but that would require creating a new profile every time I add a new LLM (and I do that a lot).
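
To make the request concrete, here is a minimal sketch of the behavior I have in mind. It is not llama-swap's actual code or configuration: `ModelManager`, `pinned`, and the model names are all hypothetical, and the sketch only tracks which models are considered loaded rather than starting any real process.

```python
class ModelManager:
    """Toy model of the requested behavior: one swappable slot plus pinned models."""

    def __init__(self, pinned=()):
        # Hypothetical "never unload" option: models listed here stay loaded forever.
        self.pinned = set(pinned)
        self.loaded = set()
        # The single swappable "main LLM" slot.
        self.current = None

    def request(self, model):
        if model in self.pinned:
            # Pinned models load once and never evict the main LLM.
            self.loaded.add(model)
            return
        if model != self.current:
            # Swapping the main LLM unloads only the previous main LLM.
            if self.current is not None:
                self.loaded.discard(self.current)
            self.current = model
            self.loaded.add(model)


manager = ModelManager(pinned={"embeddings", "reranker"})
for name in ["llm-a", "embeddings", "reranker", "llm-b"]:
    manager.request(name)
```

With something like this, adding a new LLM would need no extra configuration: only the pinned list is declared once, and every other model shares the swappable slot.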

Labels: configuration, enhancement
