
Conversation

chenhunghan
Contributor

@chenhunghan commented Sep 20, 2023

Many popular open-source projects offer OpenAI-API-compatible endpoints, for example llama.cpp's server, FastChat, and LiteLLM.

This PR adds OpenAI API endpoint compatibility; if merged, chat-ui can be used as the chat user interface for any of the projects mentioned above. The official OpenAI endpoints are also supported, so users can quickly evaluate GPT-3/GPT-4 and/or any OSS models (GGUF/GPTQ, ...) hosted by, for example, llama.cpp.

The differences between this PR and #443 are:

  • Model-based selection rather than endpoint-based: multiple models can be configured at the same time and selected via the dropdown (see the example config below).
  • Robust SSE event decoding via the official OpenAI Node client.
  • Better support for the system prompt and for all OpenAI-supported parameters.
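
To illustrate the model-based selection, here is a sketch of what two entries in chat-ui's `MODELS` config could look like (pieced together from the config snippets shared later in this thread; the `local-llama` name and `http://localhost:8000/v1` URL are made-up placeholders, not part of the PR):

```json
[
  {
    "name": "GPT-3.5 Turbo",
    "id": "gpt-3.5-turbo",
    "endpoints": [{ "host": "openai-compatible", "type": "chat_completions" }]
  },
  {
    "name": "local-llama",
    "id": "local-llama",
    "endpoints": [{
      "host": "openai-compatible",
      "baseURL": "http://localhost:8000/v1",
      "type": "chat_completions"
    }]
  }
]
```

Both models then appear in the dropdown: one backed by the official OpenAI API, the other by a locally hosted OpenAI-compatible server.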

chenhunghan and others added 10 commits September 19, 2023 15:28
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
@chenhunghan marked this pull request as ready for review September 20, 2023 11:41
@chenhunghan mentioned this pull request Sep 20, 2023
@julien-blanchon
Contributor

julien-blanchon commented Sep 20, 2023

Using

"stop": ["###"]

Could lead to some very bad behaviour, especially with markdown: any markdown heading in the model output (e.g. `### Heading`) matches the stop sequence and cuts the generation short.

[screenshot]

@julien-blanchon
Contributor

Using:

"userMessageToken": "### User:\n",
"userMessageEndToken": "\n",
"assistantMessageToken": "### Assistant:\n",

Could lead to easy prompt injection, since a user who types `### Assistant:` inside their own message produces something indistinguishable from a real turn boundary in the flattened prompt.

[screenshot]

Contributor

@julien-blanchon left a comment


Everything works fine! However, there are some "too" easy prompt injections.

@@ -4,6 +4,7 @@ import { trimSuffix } from "$lib/utils/trimSuffix";
import { trimPrefix } from "$lib/utils/trimPrefix";
import { PUBLIC_SEP_TOKEN } from "$lib/constants/publicSepToken";
import { AwsClient } from "aws4fetch";
import OpenAI from "openai";
Contributor


In my PR I tried to avoid using the openai package, since it was only needed for a few types and the REST API works fine on its own. In your implementation, adding the package makes more sense, I think.
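
For reference, the "robust SSE event decoding" mentioned in the PR description boils down to letting the client parse the stream. A minimal sketch assuming the v4 `openai` Node client (not the PR's actual code; the model id and baseURL are placeholders):

```typescript
import OpenAI from "openai";

// Sketch only: the client handles the "data: ..." SSE framing and JSON parsing,
// so the caller just iterates over already-typed chunks.
const client = new OpenAI({
	apiKey: process.env.OPENAI_API_KEY,
	baseURL: "http://localhost:8000/v1", // any OpenAI-compatible server
});

const stream = await client.chat.completions.create({
	model: "gpt-3.5-turbo",
	messages: [{ role: "user", content: "Hello" }],
	stream: true,
});

for await (const chunk of stream) {
	process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```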

chenhunghan and others added 7 commits September 21, 2023 09:10
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
@gururise
Contributor

gururise commented Sep 22, 2023

I've been trying this with the following config:

  {
    "name": "GPT-3.5 Turbo",
    "id": "gpt-3.5-turbo",
    "userMessageToken": "User:\n",
    "userMessageEndToken": "\n",
    "assistantMessageToken": "Assistant:\n",
    "preprompt": "You are a helpful assistant",
    "promptExamples": [],
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 1000,
      "max_new_tokens": 1024,
      "stop": []
    },
    "endpoints": [{
      "host" : "openai-compatible"
    }]
  }

Inference seems to get stuck. Once I shut down the Node server and restart it, it runs inference on the last text issued by the user:

[screenshot]

If I wait long enough, I get this error:

Error: {"error":"Internal Server Error"}
    at Module.generateFromDefaultEndpoint (/home/gene/Downloads/chat-ui/src/lib/server/generateFromDefaultEndpoint.ts:129:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Module.summarize (/home/gene/Downloads/chat-ui/src/lib/server/summarize.ts:17:26)
    at async saveLast (/home/gene/Downloads/chat-ui/src/routes/conversation/[id]/+server.ts:174:24)

Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
@chenhunghan
Contributor Author

chenhunghan commented Sep 22, 2023

"name": "GPT-3.5 Turbo"
...
If I wait long enough, I get this error:

Error: {"error":"Internal Server Error"}
    at Module.generateFromDefaultEndpoint (/home/gene/Downloads/chat-ui/src/lib/server/generateFromDefaultEndpoint.ts:129:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Module.summarize (/home/gene/Downloads/chat-ui/src/lib/server/summarize.ts:17:26)
    at async saveLast (/home/gene/Downloads/chat-ui/src/routes/conversation/[id]/+server.ts:174:24)

I tried the same setting but can't reproduce. There seem to be exceptions thrown when calling generateFromDefaultEndpoint, which is used for generating a summary of the conversation. Anyway, I added a try...catch there; would you mind trying again? You will then be able to see the real errors.
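
For context, the guard described above amounts to something like this (a rough sketch, not the actual diff; the function names and paths come from the stack trace, and the exact signature is an assumption):

```typescript
import { generateFromDefaultEndpoint } from "$lib/server/generateFromDefaultEndpoint";

// Sketch only: let summarization fail without breaking the conversation,
// and log the underlying error from the OpenAI-compatible endpoint.
export async function summarize(prompt: string): Promise<string | null> {
	try {
		return await generateFromDefaultEndpoint(prompt);
	} catch (err) {
		console.error("Failed to summarize conversation:", err);
		return null;
	}
}
```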

@chenhunghan
Contributor Author

chenhunghan commented Sep 22, 2023

@nsarrazin @philschmid @mishig25 PTAL.

@gururise
Contributor

"name": "GPT-3.5 Turbo"
...
If I wait long enough, I get this error:

Error: {"error":"Internal Server Error"}
    at Module.generateFromDefaultEndpoint (/home/gene/Downloads/chat-ui/src/lib/server/generateFromDefaultEndpoint.ts:129:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Module.summarize (/home/gene/Downloads/chat-ui/src/lib/server/summarize.ts:17:26)
    at async saveLast (/home/gene/Downloads/chat-ui/src/routes/conversation/[id]/+server.ts:174:24)

I tried the same setting, but can't reproduce, there seems exceptions when calling generateFromDefaultEndpoint, this was for generating a summary of the coversation, anyway, I added a try...catch there, would you mind try again? You will be able to see the real errors.

Sorry, I haven't been able to replicate it.

@gururise
Contributor

Trying to use LiteLLM Proxy with a Together.ai model:

{
    "name": "vicuna-1.5",
    "id": "together_ai/lmsys/vicuna-13b-v1.5-16k",
    "userMessageToken": "User:\n",
    "userMessageEndToken": "\n",
    "assistantMessageToken": "Assistant:\n",
    "preprompt": "You are a helpful assistant",
    "promptExamples": [],
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 1000,
      "max_new_tokens": 1024,
      "stop": []
    },
    "endpoints": [{
      "host" : "openai-compatible",
      "baseUrl": "http://localhost:8081",
      "type": "chat_completions"
    }]
  }

The LiteLLM endpoint (http://localhost:8081/chat/completions) never gets called, and I see this error message in the browser console:

TypeError: NetworkError when attempting to fetch resource. [+page.svelte:154:11](http://localhost:5173/src/routes/conversation/[id]/+page.svelte)
    writeMessage

@krrishdholakia

is there anything blocking this PR from being merged?

@Extremys

Extremys commented Oct 2, 2023

Hello, I am using the FastChat OpenAI-compatible API as the backend, but the UI does not seem to work properly; it gets stuck in generation mode:

cfg:

  {
    "name": "vicuna-13b-v1.5",
    "id": "vicuna-13b-v1.5",
    "userMessageToken": "User:\n",
    "userMessageEndToken": "\n",
    "assistantMessageToken": "Assistant:\n",
    "preprompt": "You are a helpful assistant",
    "promptExamples": [],
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 2000,
      "max_new_tokens": 2048,
      "stop": ["</s>"],
      "stream": true
    },
    "endpoints": [
  {
    "host": "openai-compatible",
    "apiKey": "{{ CHATUI_TOKEN }}",
    "baseURL": "https://myopenai-api/v1",
    "type": "chat_completions",
    "weight": 1
  }
  ]
  }

[screenshot]

Any idea?

@nsarrazin added the enhancement (New feature or request) and back (This issue is related to the Svelte backend or the DB) labels Oct 3, 2023
@shagunhexo

is there anything blocking this PR from being merged?

@chenhunghan
Contributor Author

I have not received any comments from the maintainers for over a month; it seems they have no interest, or there is a conflict of interest, in supporting OpenAI-style APIs. For those interested in using chat-ui with an OpenAI-style API, please follow the forked version, which tries to stay in sync with upstream: https://github.com/ialacol/chat-ui/tree/main

chenhunghan and others added 2 commits October 24, 2023 22:33
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
@julien-blanchon
Contributor

Hmm, I don't think they have any conflict of interest ^^. I think they are just lacking time to review it, or maybe just forgot. For me this is looking fine; it's okay to merge.

@nsarrazin
Contributor

Hey guys! No conflicts of interest; as far as I know, we're pretty happy to make chat-ui backend-agnostic. I just wanted to take the time to refactor things a bit so that we can add more backends easily in the future, in a way that scales. I wanted to use dynamic imports so that not everyone needs to install the openai or aws4fetch packages if they're not going to use those backends.

I just didn't have the time to design a common backend API I was happy with, but it seems like a pressing issue for some, so I'll have a look soon 😁
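
A minimal sketch of that dynamic-import idea (illustrative only; the function shape and names are assumptions, not chat-ui code):

```typescript
// Sketch only: the "openai" package is imported lazily, so users who never
// configure an OpenAI-compatible endpoint don't need it installed.
async function buildOpenAICompatibleEndpoint(baseURL: string, apiKey: string) {
	const { default: OpenAI } = await import("openai");
	const client = new OpenAI({ apiKey, baseURL });

	return (messages: { role: "system" | "user" | "assistant"; content: string }[]) =>
		client.chat.completions.create({
			model: "gpt-3.5-turbo", // placeholder model id
			messages,
			stream: true,
		});
}
```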

Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
@nsarrazin
Contributor

Hi everyone! Quick update on the support for OpenAI-type endpoints: I finished my refactoring in PR #541 and tested it with the OpenAI API, and it worked well.

I'll be testing it more thoroughly and updating the docs, but feel free to try it as well with your locally hosted APIs and let me know if it works 😄

Thanks @chenhunghan for the great work on this!
