Closed
Labels
- feature request: New feature or request
- good first issue: Good for newcomers
- pri/medium: Med priority issue
- status/approved: This issue is ready to be implemented
Description
Describe the feature you'd like
I would like the inference queue to allow parallel execution of jobs.
Describe the benefits this would bring to existing Hoarder users
The use case is having multiple load-balanced Ollama backends available, which would speed up processing if jobs could run in parallel.
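To make the request concrete, here is a minimal sketch of bounded parallel job execution. It is not Hoarder's queue code: `run_jobs`, `handler`, and `concurrency` are hypothetical names chosen for illustration, and the handler stands in for whatever call would reach a load-balanced inference endpoint.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def run_jobs(
    jobs: Iterable[str],
    handler: Callable[[str], Awaitable[str]],
    concurrency: int,
) -> List[str]:
    """Run `handler` over `jobs` with at most `concurrency` jobs in flight.

    Results are returned in the same order as the input jobs.
    All names here are hypothetical; this is an illustration only.
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")

    # The semaphore caps how many handlers run at the same time.
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(job: str) -> str:
        async with semaphore:
            return await handler(job)

    # gather() starts every job, but the semaphore holds back the excess.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

With `concurrency` set to 1 this behaves like a sequential queue. Setting it to roughly the number of backends behind the load balancer is one plausible starting point, though the right value would depend on the hardware.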
Can the goal of this request already be achieved via other means?
Not easily. The only alternatives I can think of are:
- a faster GPU with more RAM
- multiple GPUs on the same host
- something other than Ollama that can distribute jobs across multiple machines (maybe https://github.com/exo-explore/exo?)
Have you searched for an existing open/closed issue?
- I have searched for existing issues and none cover my fundamental request
Additional context
No response