Collaborator

@HaiShaw HaiShaw commented Dec 5, 2024

Motivation

Move the FP8 layer definitions to SGLang.

Modifications

The layer definitions are moved as is.
Kernels will come next.

Checklist

  • [+] Format your code according to the Contributor Guide.
  • [+] Add unit tests as outlined in the Contributor Guide.
  • [+] Update documentation as needed, including docstrings or example tutorials.

Member

@zhyncs zhyncs left a comment

Except for vllm.model_executor.layers.quantization, LinearBase, and _custom_ops, everything else needs to be removed. Thanks!

per_tensor_dequantize,
requantize_with_max_scale,
)
from vllm.model_executor.parameter import ModelWeightParameter, PerTensorScaleParameter
Member

remove this

Collaborator Author

This is still in use; I will decouple and migrate it later.

@HaiShaw HaiShaw closed this Dec 6, 2024
Member

zhyncs commented Dec 6, 2024

Moved to #2370.
All credit goes to @HaiShaw. Thanks!

@zhyncs zhyncs mentioned this pull request Dec 7, 2024