Skip to content

llama.vim: filter server response fields #24

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jan 15, 2025

Conversation

VJHack
Copy link
Collaborator

@VJHack VJHack commented Jan 14, 2025

As of 10940, we are now able to serialize only relavent data.

A suggestion was made here to filter out unnecessary fields in the response. This feature ensures that we are transporting the most minimal response from the server to the client.

There are two places where this change was made:

  1. When we make the curl request in ring_update() to asyncronously process extra_context, we filter out all response fields since we don't take any action on the response.
  2. These are the response fields we want to include when making the main fim call
'response_fields':  [ 
                     "content",
                     "timings/prompt_n", 
                     "timings/prompt_ms", 
                     "timings/prompt_per_token_ms",
                     "timings/prompt_per_second",
                     "timings/predicted_n", 
                     "timings/predicted_ms", 
                     "timings/predicted_per_token_ms",
                     "timings/predicted_per_second",
                     "truncated",
                     "tokens_cached",
                     "generation_settings/n_ctx",
                     ], 

@VJHack VJHack requested a review from ggerganov January 14, 2025 17:30
formatting comma

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

let l:n_cached = get(l:response, 'tokens_cached', 0)
let l:truncated = get(l:response, 'truncated', v:false)
let l:n_ctx = get(l:response, 'generation_settings/n_ctx', 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After recent refactoring, the server no longer provides the n_ctx field in the responses. We should remove it from the llama.vim client and instead of displaying:

c: 1234 / 0 | ...

We should simply display:

c: 1234 | ...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did notice that. I removed all instances of n_ctx field and adjusted the info message accordingly as shown below.
Screen Shot 2025-01-14 at 3 22 19 PM

Thank you!

@ggerganov ggerganov merged commit 9f5cadd into ggml-org:master Jan 15, 2025
ggerganov added a commit that referenced this pull request Feb 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants