Neuro LLM Server

READ THIS:

This is an extremely jank and hacky implementation of an OpenAI API server for serving the MiniCPM-Llama3-V 2.5 LLM. I just wanted a straightforward way to use this model over an OpenAI API compliant endpoint, so I hacked this thing together. Most parameters don't even do anything; it's barely enough to send queries and receive answers.

Info

MiniCPM-Llama3-V 2.5 is a relatively small but very high quality multimodal LLM. It's finally something small and good enough to use for Neuro! This program will download and use the int4 quantized version, which takes up approximately 8GB of VRAM. Since Neuro right now requires the full message before talking (defeating the purpose of streaming responses, I know), the end-of-speech to start-of-speech latency is 3-10 seconds depending on the length of the output.
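For reference, loading the published int4 checkpoint from Hugging Face usually looks like the sketch below. The model id openbmb/MiniCPM-Llama3-V-2_5-int4 is the official int4 quantization; whether this server loads it exactly this way is an assumption on my part.

```python
from transformers import AutoModel, AutoTokenizer

# The int4 checkpoint is quantized with bitsandbytes, so it loads straight
# onto the GPU and fits in roughly 8GB of VRAM.
model = AutoModel.from_pretrained(
    "openbmb/MiniCPM-Llama3-V-2_5-int4", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    "openbmb/MiniCPM-Llama3-V-2_5-int4", trust_remote_code=True
)
model.eval()
```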

Right now, the server is only set up to do streaming responses (which is what Neuro uses). I may (or may not) add a few more features and make the endpoint more compliant with the OpenAI spec.
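Because only streaming completions are implemented, a client has to request them with stream=True. Here's a minimal sketch using the openai Python client; the base URL assumes the default fastapi run port of 8000, and the /v1 prefix, API key handling, and model name are assumptions (most parameters are ignored anyway).

```python
from openai import OpenAI

# Point the client at the local server. The API key is unused by this
# server but the client library requires something to be set.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

# stream=True is required: this server only does streaming responses.
stream = client.chat.completions.create(
    model="MiniCPM-Llama3-V-2.5",  # placeholder; assumed to be ignored
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

# Print tokens as they arrive.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```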

Installation

Just don't, but if you really want to...

Only tested on Linux. I want to get Windows working too at some point, but it doesn't work right now.

conda create -n MiniCPM-V python=3.10 -y
conda activate MiniCPM-V
pip install -r requirements.txt
fastapi run

btw, fastapi dev does not work (it will try to load the LLM twice, and you'll probably run out of VRAM).
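If the model is loaded at module import time, the dev reloader's extra import can trigger that second load. One common way to keep it to a single load is to defer it to FastAPI's lifespan hook, so it only runs when the serving process actually starts. A sketch, where load_model() is a hypothetical helper standing in for whatever this server does:

```python
from contextlib import asynccontextmanager
from fastapi import FastAPI

models = {}

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load the LLM once, at server startup rather than at import time.
    models["llm"] = load_model()  # hypothetical helper
    yield
    # Release the model when the server shuts down.
    models.clear()

app = FastAPI(lifespan=lifespan)
```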
