
[Feature, Hardware] add support for Ascend NPU #3781

@22dimensions

Description

Ascend is a full-stack AI computing infrastructure for industry applications and services based on Huawei Ascend processors and software. For more information about Ascend, see Ascend Community.

PyTorch has officially announced support for Ascend NPU (through the PrivateUse1 dispatch key); please see the PrivateUse1 tutorial here.
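
As a quick illustration of what the PrivateUse1-based backend looks like from the PyTorch side, here is a minimal sketch (it assumes the torch_npu plugin is installed; nothing here is specific to this proposal):

```python
import torch
import torch_npu  # Ascend plugin; registers the "npu" device via PrivateUse1

# Sanity check that PyTorch can see the Ascend NPU
if torch.npu.is_available():
    x = torch.randn(2, 3, device="npu")  # allocate directly on the NPU
    y = torch.matmul(x, x.T)             # ops dispatch to the Ascend backend
    print(y.cpu())                       # copy back to host to print
```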

Motivation

Currently, the number of developers using Ascend NPU for AI training and inference is increasing significantly, and many popular open-source projects already support Ascend, such as LLaMA-Factory, llama.cpp, and DeepSpeed. Some sglang users also want to run it on Ascend (see #3609). Therefore, I would like to add Ascend NPU backend support to sglang.

Status

PyTorch already supports NPU, but OpenAI Triton does not support NPU yet (that support is under development). For now, sglang should work with the torch_native attention backend; once Triton support is ready, it should also work with the triton backend.

We have successfully run sglang on the x86 Ascend platform with the torch_native backend. Here is the running log:

[Screenshot: sglang running log on Ascend NPU with the torch_native backend]
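
For reference, here is a rough sketch of how launching sglang on an NPU with the torch_native backend could look through the offline engine API (the device="npu" value and its wiring are assumptions about this feature, not the final interface; the model name is only an example):

```python
import sglang as sgl

llm = sgl.Engine(
    model_path="Qwen/Qwen2-7B-Instruct",   # example model
    attention_backend="torch_native",      # avoids Triton kernels for now
    device="npu",                          # assumed new device value for Ascend
)
print(llm.generate("Hello from Ascend!", {"max_new_tokens": 16}))
```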

Related PR

sglang
