
[Feature] Get Input Token IDs and Output Token IDs with Engine.generate() #2634

@zhaochenyang20

Motivation

In my OpenRLHF integration, sglang's output currently has to be re-tokenized, whereas vllm returns the token IDs directly:

            if backend == "vllm":
                for output, prompt in zip(outputs, all_prompts):
                    input_token_id_list.append(list(output.prompt_token_ids))
                    output_token_id_list.append(list(output.outputs[0].token_ids))
            else:
                for output, prompt in zip(outputs, all_prompts):
                    input_token_id_list.append(list(self.tokenizer(prompt)["input_ids"]))
                    output_token_id_list.append(list(self.tokenizer(output["text"])["input_ids"]))

I do not want to re-tokenize the text. We can align sglang's output with vllm's a bit by adding prompt_token_ids and token_ids to what sglang returns. I think simply modifying DetokenizerManager and the return part of TokenizerManager should be enough.
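To illustrate what the caller side could look like after the change, here is a minimal sketch. It assumes the two proposed fields appear as top-level keys `prompt_token_ids` and `token_ids` in sglang's returned dict; that placement, the helper name `extract_token_ids`, and the `toy_tokenizer` stub are all hypothetical and only for illustration. When the fields are absent, it falls back to the current re-tokenization path.

```python
from types import SimpleNamespace


def extract_token_ids(output, prompt, backend, tokenizer):
    """Return (input_ids, output_ids) for a single generation result."""
    if backend == "vllm":
        # vllm's RequestOutput already carries both ID sequences.
        return list(output.prompt_token_ids), list(output.outputs[0].token_ids)

    # ASSUMPTION: keys proposed in this issue; sglang does not return them yet.
    if "prompt_token_ids" in output and "token_ids" in output:
        return list(output["prompt_token_ids"]), list(output["token_ids"])

    # Fallback: today's behavior, re-tokenize the prompt and generated text.
    return (
        list(tokenizer(prompt)["input_ids"]),
        list(tokenizer(output["text"])["input_ids"]),
    )


def toy_tokenizer(text):
    # Hypothetical stand-in for a real tokenizer: one "ID" per word.
    return {"input_ids": [len(word) for word in text.split()]}


# Stand-in objects that mimic the shapes shown under "Related resources".
vllm_like = SimpleNamespace(
    prompt_token_ids=[1, 2],
    outputs=[SimpleNamespace(token_ids=(3, 4))],
)
sglang_proposed = {"text": "a bb", "prompt_token_ids": [5], "token_ids": [6, 7]}
sglang_current = {"text": "a bb"}

vllm_ids = extract_token_ids(vllm_like, "p", "vllm", toy_tokenizer)
proposed_ids = extract_token_ids(sglang_proposed, "p", "sglang", toy_tokenizer)
current_ids = extract_token_ids(sglang_current, "hello world", "sglang", toy_tokenizer)
```

With the fields in place, the sglang branch would no longer need the tokenizer at all, which is the point of this request.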

https://github.com/zhaochenyang20/Awesome-ML-SYS-Tutorial/blob/main/sglang/code-walk-through/readme.md

If additional help is needed, please ping me on Slack.

Related resources

# vllm

In [6]: outputs[1]
Out[6]: RequestOutput(request_id=1, prompt='The president of the United States is', prompt_token_ids=[128000, 791, 4872, 315, 279, 3723, 4273, 374], encoder_prompt=None, encoder_prompt_token_ids=None, prompt_logprobs=None, outputs=[CompletionOutput(index=0, text=' the head of the executive branch of the federal government, and is the commander-in', token_ids=(279, 2010, 315, 279, 11145, 9046, 315, 279, 6918, 3109, 11, 323, 374, 279, 29094, 3502), cumulative_logprob=None, logprobs=None, finish_reason=length, stop_reason=None)], finished=True, metrics=RequestMetrics(arrival_time=1735422702.596864, last_token_time=1735422702.596864, first_scheduled_time=1735422702.598133, first_token_time=1735422702.6503043, time_in_queue=0.0012691020965576172, finished_time=1735422702.7723527, scheduler_time=0.00225832499563694, model_forward_time=None, model_execute_time=None), lora_request=None, num_cached_tokens=0)

# sglang

In [4]: outputs[1]
Out[4]: 
{'text': ' the most powerful person in the world. The president is elected by the people through the Electoral College system. The president serves as the head of state and the head of government for the United States.\nThe president is responsible for executing the laws of the land and ensuring that they are enforced fairly and equally. The president also has the power to veto laws passed by Congress, although Congress can override the veto with a two-thirds majority vote in both the House of Representatives and the Senate.\nThe president is also the commander-in-chief of the armed forces and has the authority to direct military operations and make decisions about national security. The president also has the power',
 'meta_info': {'id': 'bbfc9edbc0d945918250f399cf4dfce4',
  'finish_reason': {'type': 'length', 'length': 128},
  'prompt_tokens': 8,
  'completion_tokens': 128,
  'cached_tokens': 0}}
