Fix memory issue with llama.cpp LLM pipeline #824

@davidmezzetti

Description

The llama.cpp LLM pipeline currently always sets n_ctx=0. When n_ctx=0, the context size defaults to n_ctx_train, which can be very large for some models and can exhaust available memory.

With this change, the pipeline falls back to llama.cpp's default n_ctx when loading with n_ctx=0 fails due to an out-of-memory error. It also accepts n_ctx as an input parameter. If a manually set n_ctx is too large, loading will still fail, since the value is user-specified and should not be silently overridden.
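The fallback described above can be sketched as follows. This is an illustrative sketch, not the actual txtai implementation: `load_with_fallback`, the `create` callable, and the `default_ctx` value are all hypothetical names, where `create` stands in for a model constructor such as `lambda ctx: Llama(model_path, n_ctx=ctx)` from llama-cpp-python.

```python
def load_with_fallback(create, n_ctx=0, default_ctx=2048):
    """Load a model, retrying with a smaller context size if n_ctx=0 fails.

    create: callable taking a context size and returning a loaded model
            (hypothetical stand-in for a llama.cpp model constructor)
    n_ctx:  requested context size; 0 means "use n_ctx_train"
    default_ctx: fallback context size used only when n_ctx=0 fails
    """
    try:
        return create(n_ctx)
    except Exception:
        # n_ctx=0 expands to n_ctx_train, which can be too large to fit
        # in memory. Only fall back automatically in that case; a
        # user-specified n_ctx that is too large should fail loudly.
        if n_ctx == 0:
            return create(default_ctx)
        raise
```

A caller that passes an explicit n_ctx keeps the fail-fast behavior, while the default path degrades gracefully to a smaller context window.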

Metadata

Labels

bug (Something isn't working)
