
Automatic code generation under the new IR #56849

@0x45f


0 Preface

Building a network in static graph mode

import paddle
import paddle.static as static

paddle.enable_static()

main_program = static.Program()
startup_program = static.Program()
with static.program_guard(main_program=main_program, startup_program=startup_program):
    x = paddle.static.data('x', [2, 2], dtype='float32')
    out = paddle.mean(x, 0, True)    
    print(out)
    print(main_program)


# var mean_0.tmp_0 : LOD_TENSOR.shape(1, 2).dtype(float32).stop_gradient(True)
# { // block 0
#     var x : LOD_TENSOR.shape(2, 2).dtype(float32).stop_gradient(True)
#     var mean_0.tmp_0 : LOD_TENSOR.shape(1, 2).dtype(float32).stop_gradient(True)

#     {Out=['mean_0.tmp_0']} = reduce_mean(inputs={X=['x']}, dim = [0], in_dtype = -1, keep_dim = True, op_device = , op_namescope = /, op_role = 0, op_role_var = [], out_dtype = -1, reduce_all = False, with_quant_attr = False)
# }

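Under the new IR, setting the environment variable FLAGS_enable_new_ir_api=1 at execution time is all that is needed to take the new IR path. The same script then prints: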

# <paddle.fluid.libpaddle.ir.OpResult object at 0x7f166a12bb70>
# {
#  (%0) = "pd.data" () {dtype:float32,name:x,place:Place(undefined:0),shape:IntArray[2,2]} : () -> pd.tensor<2x2xf32>
#  (%1) = "pd.mean" (%0) {axis:IntArray[0],keepdim:1} : (pd.tensor<2x2xf32>) -> pd.tensor<1x2xf32>
# }
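
As a minimal sketch of setting the flag, the script can be launched in a child process whose environment contains FLAGS_enable_new_ir_api=1, which is equivalent to prefixing the command with the variable in a shell. The helper name run_with_new_ir is hypothetical, and the child below only echoes the flag so that the example runs without Paddle installed.

```python
import os
import subprocess
import sys


def run_with_new_ir(python_args):
    """Run a Python child process with FLAGS_enable_new_ir_api=1 set."""
    # Copy the current environment and add the flag; the parent is untouched.
    env = dict(os.environ, FLAGS_enable_new_ir_api="1")
    return subprocess.run(
        [sys.executable, *python_args],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )


# Stand-in child: it only echoes the flag. In practice, pass the path of the
# script that builds the network instead of this "-c" snippet.
result = run_with_new_ir(
    ["-c", "import os; print(os.environ['FLAGS_enable_new_ir_api'])"]
)
print(result.stdout.strip())
```

Passing the flag through the child's environment keeps the parent process unchanged, so the same script can be run once with the old path and once with the new one for comparison.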

1 Call logic

  • Taking the mean API as an example, the call _ir_ops.mean(x, axis, keepdim) is where the network is built under the new IR.

  • /workspace/Paddle/python/paddle/_ir_ops.py adds the functions in core.ir.ops to globals and __all__.

  • core.ir.ops is a pybind module. Its logic lives in /workspace/Paddle/paddle/fluid/pybind/ir.cc, which calls BindOpsAPI(&ops_modules) to bind the C++ functions to the ops module.

  • BindOpsAPI is implemented in /workspace/Paddle/paddle/fluid/pybind/ops_api.cc and performs the binding through the Python C API function PyModule_AddFunctions.

  • OpsAPI, used in that binding, is a set of function declarations written in the form that PyModule_AddFunctions requires.

  • Taking mean as an example, the bound function only forwards the call to static_api_mean.

  • static_api_mean is defined in /workspace/Paddle/paddle/fluid/pybind/static_op_function.cc. It converts the arguments passed from Python and then calls paddle::dialect::mean(x, axis, keepdim).

  • paddle::dialect::mean is defined in /workspace/Paddle/build/paddle/fluid/ir/dialect/paddle_dialect/ir/pd_api.cc. It calls the Build() function of ir::Builder to build the static graph.

  • Build() calls OpTy::Build and then inserts the op into the block.

  • OpTy::Build is the build logic of the concrete op, located in /workspace/Paddle/build/paddle/fluid/ir/dialect/paddle_dialect/ir/pd_op.cc. It calls some of the infermeta logic.
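
The call chain above can be mimicked with a toy model in pure Python. This is only a sketch of the control flow: Block, Builder, MeanOp and the dictionary used as an op record are hypothetical stand-ins, not Paddle classes, and the shape computation merely plays the role of the infermeta step.

```python
import types


class Block:
    """Toy stand-in for an IR block: an ordered list of ops."""

    def __init__(self):
        self.ops = []


class MeanOp:
    """Toy stand-in for a generated op class (the OpTy in OpTy::Build)."""

    name = "pd.mean"

    @staticmethod
    def build(x_shape, axis, keepdim):
        # Shape inference, playing the role of the infermeta call.
        out_shape = [
            1 if i in axis else d
            for i, d in enumerate(x_shape)
            if keepdim or i not in axis
        ]
        return {"name": MeanOp.name, "axis": axis, "keepdim": keepdim,
                "out_shape": out_shape}


class Builder:
    """Toy stand-in for ir::Builder: builds an op, then inserts it."""

    def __init__(self, block):
        self.block = block

    def build(self, op_cls, *args):
        op = op_cls.build(*args)   # corresponds to OpTy::Build
        self.block.ops.append(op)  # insert the op into the block
        return op


block = Block()
builder = Builder(block)


def dialect_mean(x_shape, axis, keepdim):
    """Plays the role of paddle::dialect::mean."""
    return builder.build(MeanOp, x_shape, axis, keepdim)


def static_api_mean(x_shape, axis, keepdim):
    """Plays the role of static_api_mean: convert Python args, then forward."""
    if isinstance(axis, int):
        axis = [axis]
    return dialect_mean(list(x_shape), list(axis), bool(keepdim))


# Plays the role of BindOpsAPI: attach the functions to a module object.
ops = types.ModuleType("ops")
for fn_name, fn in {"mean": static_api_mean}.items():
    setattr(ops, fn_name, fn)

# Plays the role of _ir_ops.py: re-export every public name of the module.
exported = {n: getattr(ops, n) for n in dir(ops) if not n.startswith("_")}

out = exported["mean"]((2, 2), 0, True)
print(out["out_shape"], len(block.ops))
```

The sketch collects the re-exported names in a dictionary instead of writing to globals() so that it stays side-effect free. The resulting shape matches the pd.tensor<1x2xf32> result printed in the preface for a 2x2 input with axis 0 and keepdim set.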

2 Generation logic

| File | Generated by |
| --- | --- |
| /workspace/Paddle/paddle/fluid/pybind/ops_api.cc | /workspace/Paddle/paddle/fluid/ir/dialect/op_generator/ops_api_gen.py |
| /workspace/Paddle/paddle/fluid/pybind/static_op_function.cc | /workspace/Paddle/paddle/fluid/ir/dialect/op_generator/python_c_gen.py |
| /workspace/Paddle/build/paddle/fluid/ir/dialect/paddle_dialect/ir/pd_api.cc | /workspace/Paddle/paddle/fluid/ir/dialect/op_generator/api_gen.py |
| /workspace/Paddle/build/paddle/fluid/ir/dialect/paddle_dialect/ir/pd_op.cc | /workspace/Paddle/paddle/fluid/ir/dialect/op_generator/op_gen.py |

The generation logic is similar in every case. Take api_gen.py, which generates pd_api.cc, as an example.

  • api_gen.py defines a number of templates. Generation consists of filling these templates with the concrete code.

  • One such template is API_IMPL_TEMPLATE.

  • Where do the API definitions come from? They are read from the YAML files that define the ops, mainly the following six:
    /workspace/Paddle/paddle/fluid/operators/generator/parsed_ops/ops.parsed.yaml
    /workspace/Paddle/paddle/fluid/operators/generator/parsed_ops/legacy_ops.parsed.yaml
    /workspace/Paddle/paddle/fluid/operators/generator/parsed_ops/backward_ops.parsed.yaml
    /workspace/Paddle/paddle/fluid/operators/generator/parsed_ops/legacy_backward_ops.parsed.yaml
    /workspace/Paddle/paddle/fluid/ir/dialect/pd_op.yaml
    /workspace/Paddle/paddle/phi/api/yaml/op_compat.yaml

  • For mean, for example, the YAML defines information such as the types and number of its inputs and outputs.

  • When does the automatic generation run? /workspace/Paddle/paddle/fluid/ir/dialect/paddle_dialect/ir/CMakeLists.txt contains a cmake command that runs a Python command during the cmake stage. That Python command invokes api_gen.py to generate the code.
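
The template-filling idea can be sketched as follows. Everything here is an assumption made for illustration: the template text is not the real API_IMPL_TEMPLATE, the C++ types and the builder->Build call in it are invented, and the dictionary only stands in for an op entry parsed from YAML.

```python
# Hypothetical template, loosely in the spirit of API_IMPL_TEMPLATE. The real
# template text in api_gen.py is not reproduced here.
API_IMPL_TEMPLATE = """
{ret_type} {api_name}({args}) {{
  auto op = builder->Build<{op_class}>({call_args});
  return op.result(0);
}}
"""

# Hypothetical op description, standing in for an entry parsed from YAML.
mean_op = {
    "name": "mean",
    "inputs": [{"type": "ir::OpResult", "name": "x"}],
    "attrs": [
        {"type": "std::vector<int64_t>", "name": "axis"},
        {"type": "bool", "name": "keepdim"},
    ],
    "output_type": "ir::OpResult",
}


def to_class_name(op_name):
    """Turn an op name such as "reduce_mean" into "ReduceMeanOp"."""
    return "".join(part.capitalize() for part in op_name.split("_")) + "Op"


def generate_api(op):
    """Fill the template with the pieces derived from one op description."""
    params = op["inputs"] + op["attrs"]
    return API_IMPL_TEMPLATE.format(
        ret_type=op["output_type"],
        api_name=op["name"],
        args=", ".join(f'{p["type"]} {p["name"]}' for p in params),
        op_class=to_class_name(op["name"]),
        call_args=", ".join(p["name"] for p in params),
    )


code = generate_api(mean_op)
print(code)
```

A generator in this style would loop over every op description and write the concatenated results to the output file; the doubled braces in the template are how str.format emits literal C++ braces.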

3 Open-source tasks

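Issue: #55737
Open-source tasks for modifying the Python APIs and for updating and verifying the unit tests will probably follow. The mechanism has not been fully merged yet, so what follows is only a rough outline of the likely work.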

  • Current form of the Python-side interface:
def mean(x, axis=None, keepdim=False, name=None):
    if in_dynamic_mode():
        return _C_ops.mean(x, axis, keepdim)
    else:
        if ir.core._use_new_ir_api():
            return _ir_ops.mean(x, axis, keepdim)
        # The original static graph logic follows
        reduce_all, axis = _get_reduce_axis_with_tensor(axis, x)
        check_variable_and_dtype(x)
  • APIs whose dynamic graph and static graph paths can be unified under the new IR take the following form:
def mean(x, axis=None, keepdim=False, name=None):
    if in_dygraph_or_new_ir_mode():
        return _C_ops.mean(x, axis, keepdim)
    else:
        # The original static graph logic
        reduce_all, axis = _get_reduce_axis_with_tensor(axis, x)
        check_variable_and_dtype(x)
  • APIs whose dynamic graph and static graph paths cannot be unified under the new IR are changed to the following form:
def special_api(x, axis=None, keepdim=False, name=None):
    if in_dynamic_mode():
        dygraph_call(xxxx)
        return _C_ops.mean(x, axis, keepdim)
    elif in_new_ir_mode():
        static_call(xxxx)
        return _C_ops.mean(x, axis, keepdim)
    else:
        # The original static graph logic
        check_variable_and_dtype(x)
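  • The unit test /workspace/Paddle/test/legacy_test/test_mean_op.py sets the self.python_api attribute. When this attribute is set, the API test runs automatically, and once the new IR test mechanism is merged the new IR test will run automatically as well. Some unit tests may not set self.python_api, in which case the attribute has to be added by hand.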
