
Conversation

chenhunghan
Contributor

@chenhunghan commented Sep 20, 2023

Many popular open-source projects offer OpenAI-API-compatible endpoints, for example llama.cpp's server, FastChat, and LiteLLM.

This PR adds OpenAI API endpoint compatibility; if merged, chat-ui can be used as the chat user interface for any of the projects mentioned above. The official OpenAI endpoints are also supported, so users can quickly evaluate GPT-3/GPT-4 and/or any OSS models (GGUF/GPTQ, ...) hosted by, for example, llama.cpp.

The differences between this PR and #443 are:

  • Model-based selection rather than endpoint-based: multiple models can be configured at the same time and selected via the dropdown (see the example config below).
  • Robust SSE event decoding via the official OpenAI Node client.
  • Better support for the system prompt and for all OpenAI-supported parameters.
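
To illustrate the model-based selection, here is a sketch of what two entries in chat-ui's `MODELS` config could look like (pieced together from the config snippets shared later in this thread; the `local-llama` name and `http://localhost:8000/v1` URL are made-up placeholders, not part of the PR):

```json
[
  {
    "name": "GPT-3.5 Turbo",
    "id": "gpt-3.5-turbo",
    "endpoints": [{ "host": "openai-compatible", "type": "chat_completions" }]
  },
  {
    "name": "local-llama",
    "id": "local-llama",
    "endpoints": [{
      "host": "openai-compatible",
      "baseURL": "http://localhost:8000/v1",
      "type": "chat_completions"
    }]
  }
]
```

Both models then appear in the dropdown: one backed by the official OpenAI API, the other by a locally hosted OpenAI-compatible server.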

chenhunghan and others added 10 commits September 19, 2023 15:28
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
@chenhunghan marked this pull request as ready for review September 20, 2023 11:41
@chenhunghan mentioned this pull request Sep 20, 2023
@julien-blanchon
Contributor

julien-blanchon commented Sep 20, 2023

Using

"stop": ["###"]

Could lead to some very bad behaviour, especially with markdown: any markdown heading in the model output (e.g. `### Heading`) matches the stop sequence and cuts the generation short.

[screenshot]

@julien-blanchon
Contributor

Using:

"userMessageToken": "### User:\n",
"userMessageEndToken": "\n",
"assistantMessageToken": "### Assistant:\n",

Could lead to easy prompt injection, since a user who types `### Assistant:` inside their own message produces something indistinguishable from a real turn boundary in the flattened prompt.

[screenshot]

Contributor

@julien-blanchon left a comment


Everything works fine! However, there are some "too" easy prompt injections.

@@ -4,6 +4,7 @@ import { trimSuffix } from "$lib/utils/trimSuffix";
import { trimPrefix } from "$lib/utils/trimPrefix";
import { PUBLIC_SEP_TOKEN } from "$lib/constants/publicSepToken";
import { AwsClient } from "aws4fetch";
import OpenAI from "openai";
Contributor


In my PR I tried to avoid using the openai package, since it was only needed for a few types and the REST API works fine on its own. In your implementation, adding the package makes more sense, I think.
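
For reference, the "robust SSE event decoding" mentioned in the PR description boils down to letting the client parse the stream. A minimal sketch assuming the v4 `openai` Node client (not the PR's actual code; the model id and baseURL are placeholders):

```typescript
import OpenAI from "openai";

// Sketch only: the client handles the "data: ..." SSE framing and JSON parsing,
// so the caller just iterates over already-typed chunks.
const client = new OpenAI({
	apiKey: process.env.OPENAI_API_KEY,
	baseURL: "http://localhost:8000/v1", // any OpenAI-compatible server
});

const stream = await client.chat.completions.create({
	model: "gpt-3.5-turbo",
	messages: [{ role: "user", content: "Hello" }],
	stream: true,
});

for await (const chunk of stream) {
	process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```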

chenhunghan and others added 7 commits September 21, 2023 09:10
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
@gururise
Contributor

gururise commented Sep 22, 2023

I've been trying this with the following config:

  {
    "name": "GPT-3.5 Turbo",
    "id": "gpt-3.5-turbo",
    "userMessageToken": "User:\n",
    "userMessageEndToken": "\n",
    "assistantMessageToken": "Assistant:\n",
    "preprompt": "You are a helpful assistant",
    "promptExamples": [],
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 1000,
      "max_new_tokens": 1024,
      "stop": []
    },
    "endpoints": [{
      "host" : "openai-compatible"
    }]
  }

Inference seems to get stuck. Once I shut down the Node server and restart it, it runs inference on the last text issued by the user:

[screenshot]

If I wait long enough, I get this error:

Error: {"error":"Internal Server Error"}
    at Module.generateFromDefaultEndpoint (/home/gene/Downloads/chat-ui/src/lib/server/generateFromDefaultEndpoint.ts:129:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Module.summarize (/home/gene/Downloads/chat-ui/src/lib/server/summarize.ts:17:26)
    at async saveLast (/home/gene/Downloads/chat-ui/src/routes/conversation/[id]/+server.ts:174:24)

Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
@chenhunghan
Contributor Author

chenhunghan commented Sep 22, 2023

"name": "GPT-3.5 Turbo"
...
If I wait long enough, I get this error:

Error: {"error":"Internal Server Error"}
    at Module.generateFromDefaultEndpoint (/home/gene/Downloads/chat-ui/src/lib/server/generateFromDefaultEndpoint.ts:129:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Module.summarize (/home/gene/Downloads/chat-ui/src/lib/server/summarize.ts:17:26)
    at async saveLast (/home/gene/Downloads/chat-ui/src/routes/conversation/[id]/+server.ts:174:24)

I tried the same setting but can't reproduce. There seem to be exceptions thrown when calling generateFromDefaultEndpoint, which is used for generating a summary of the conversation. Anyway, I added a try...catch there; would you mind trying again? You will then be able to see the real errors.
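
For context, the guard described above amounts to something like this (a rough sketch, not the actual diff; the function names and paths come from the stack trace, and the exact signature is an assumption):

```typescript
import { generateFromDefaultEndpoint } from "$lib/server/generateFromDefaultEndpoint";

// Sketch only: let summarization fail without breaking the conversation,
// and log the underlying error from the OpenAI-compatible endpoint.
export async function summarize(prompt: string): Promise<string | null> {
	try {
		return await generateFromDefaultEndpoint(prompt);
	} catch (err) {
		console.error("Failed to summarize conversation:", err);
		return null;
	}
}
```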

@chenhunghan
Contributor Author

chenhunghan commented Sep 22, 2023

@nsarrazin @philschmid @mishig25 PTAL.

@gururise
Contributor

"name": "GPT-3.5 Turbo"
...
If I wait long enough, I get this error:

Error: {"error":"Internal Server Error"}
    at Module.generateFromDefaultEndpoint (/home/gene/Downloads/chat-ui/src/lib/server/generateFromDefaultEndpoint.ts:129:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Module.summarize (/home/gene/Downloads/chat-ui/src/lib/server/summarize.ts:17:26)
    at async saveLast (/home/gene/Downloads/chat-ui/src/routes/conversation/[id]/+server.ts:174:24)

I tried the same setting, but can't reproduce, there seems exceptions when calling generateFromDefaultEndpoint, this was for generating a summary of the coversation, anyway, I added a try...catch there, would you mind try again? You will be able to see the real errors.

Sorry, I haven't been able to replicate it.

@gururise
Contributor

Trying to use LiteLLM Proxy with a Together.ai model:

{
    "name": "vicuna-1.5",
    "id": "together_ai/lmsys/vicuna-13b-v1.5-16k",
    "userMessageToken": "User:\n",
    "userMessageEndToken": "\n",
    "assistantMessageToken": "Assistant:\n",
    "preprompt": "You are a helpful assistant",
    "promptExamples": [],
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 1000,
      "max_new_tokens": 1024,
      "stop": []
    },
    "endpoints": [{
      "host" : "openai-compatible",
      "baseUrl": "http://localhost:8081",
      "type": "chat_completions"
    }]
  }

The LiteLLM endpoint (http://localhost:8081/chat/completions) never gets called, and I see this error message in the browser console:

TypeError: NetworkError when attempting to fetch resource. [+page.svelte:154:11](http://localhost:5173/src/routes/conversation/[id]/+page.svelte)
    writeMessage

@krrishdholakia

is there anything blocking this PR from being merged?

@Extremys

Extremys commented Oct 2, 2023

Hello, I am using the FastChat OpenAI-compatible API as the backend, but the UI does not seem to work properly; it gets stuck in generation mode:

cfg:

  {
    "name": "vicuna-13b-v1.5",
    "id": "vicuna-13b-v1.5",
    "userMessageToken": "User:\n",
    "userMessageEndToken": "\n",
    "assistantMessageToken": "Assistant:\n",
    "preprompt": "You are a helpful assistant",
    "promptExamples": [],
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 2000,
      "max_new_tokens": 2048,
      "stop": ["</s>"],
      "stream": true
    },
    "endpoints": [
  {
    "host": "openai-compatible",
    "apiKey": "{{ CHATUI_TOKEN }}",
    "baseURL": "https://myopenai-api/v1",
    "type": "chat_completions",
    "weight": 1
  }
  ]
  }

[screenshot]

Any idea?

@nsarrazin added the enhancement (New feature or request) and back (This issue is related to the Svelte backend or the DB) labels Oct 3, 2023
@shagunhexo

is there anything blocking this PR from being merged?

@chenhunghan
Contributor Author

I have not received any comments from the maintainers for over a month; it seems they have no interest, or there is a conflict of interest, in supporting OpenAI-style APIs. For those interested in using chat-ui with an OpenAI-style API, please follow the forked version, which tries to stay in sync with upstream: https://github.com/ialacol/chat-ui/tree/main

chenhunghan and others added 2 commits October 24, 2023 22:33
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
@julien-blanchon
Contributor

Hmm, I don't think they have any conflict of interest ^^. I think they are just lacking time to review it, or maybe just forgot. For me this is looking fine; it's okay to merge.

@nsarrazin
Contributor

Hey guys! No conflicts of interest; as far as I know, we're pretty happy to make chat-ui backend-agnostic. I just wanted to take the time to refactor things a bit so that we can add more backends easily in the future, in a way that scales. I wanted to use dynamic imports so that not everyone needs to install the openai or aws4fetch packages if they're not going to use those backends.

I just didn't have the time to design a common backend API I was happy with, but it seems like a pressing issue for some, so I'll have a look soon 😁
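
A minimal sketch of that dynamic-import idea (illustrative only; the function shape and names are assumptions, not chat-ui code):

```typescript
// Sketch only: the "openai" package is imported lazily, so users who never
// configure an OpenAI-compatible endpoint don't need it installed.
async function buildOpenAICompatibleEndpoint(baseURL: string, apiKey: string) {
	const { default: OpenAI } = await import("openai");
	const client = new OpenAI({ apiKey, baseURL });

	return (messages: { role: "system" | "user" | "assistant"; content: string }[]) =>
		client.chat.completions.create({
			model: "gpt-3.5-turbo", // placeholder model id
			messages,
			stream: true,
		});
}
```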

Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
Signed-off-by: Hung-Han (Henry) Chen <chenhungh@gmail.com>
@nsarrazin
Contributor

Hi everyone! Quick update on the support for OpenAI-type endpoints: I finished my refactoring in PR #541 and tested it with the OpenAI API, and it worked well.

I'll be testing it more thoroughly and updating the docs, but feel free to try it as well with your locally hosted APIs and let me know if it works 😄

Thanks @chenhunghan for the great work on this!
