@SigureMo SigureMo commented Jun 4, 2025

PR Category

Execute Infrastructure

PR Types

New features

Description

Add a hook that allows the run_program_op logic to be replaced, in preparation for inserting a CUDA Graph implementation into the dynamic-to-static (dy2st) flow.

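The preliminary design is as follows:

  • During the first few warmup rounds, replace run_impl with a capture implementation that collects CUDA Graphs for dynamic shapes and stores them in a graph cache, using the batch size as the cache key
  • Afterwards, replace run_impl with a replay implementation that looks up the cache and calls replay on the cached CUDA Graph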
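
The capture/replay design above could be sketched roughly as follows. This is a CPU-only illustration, not the code in this PR: `ProgramLayer`, `set_run_impl`, `FakeGraph`, `make_capture_impl` and `make_replay_impl` are hypothetical names, no Paddle or CUDA API is called, and falling back to the original implementation on a cache miss is an assumption the description does not state.

```python
from typing import Callable, Dict, List

Batch = List[List[float]]
RunImpl = Callable[[Batch], List[float]]


class FakeGraph:
    """Hypothetical stand-in for a captured CUDA Graph (runs on the CPU)."""

    def __init__(self, fn: RunImpl, batch_size: int) -> None:
        self.fn = fn
        self.batch_size = batch_size
        self.replays = 0

    def replay(self, batch: Batch) -> List[float]:
        # A real graph would relaunch recorded kernels; here we just call fn.
        self.replays += 1
        return self.fn(batch)


class ProgramLayer:
    """Toy layer whose run implementation can be swapped through a hook."""

    def __init__(self, default_impl: RunImpl) -> None:
        self._run_impl = default_impl

    def set_run_impl(self, impl: RunImpl) -> None:
        # The hook: replace the run implementation used by later calls.
        self._run_impl = impl

    def __call__(self, batch: Batch) -> List[float]:
        return self._run_impl(batch)


def make_capture_impl(original: RunImpl, cache: Dict[int, FakeGraph]) -> RunImpl:
    def capture(batch: Batch) -> List[float]:
        key = len(batch)  # batch size is the cache key
        if key not in cache:
            cache[key] = FakeGraph(original, key)
        return original(batch)

    return capture


def make_replay_impl(original: RunImpl, cache: Dict[int, FakeGraph]) -> RunImpl:
    def replay(batch: Batch) -> List[float]:
        graph = cache.get(len(batch))
        if graph is None:
            # Assumption: unseen batch sizes fall back to the original path.
            return original(batch)
        return graph.replay(batch)

    return replay


def row_sums(batch: Batch) -> List[float]:
    return [sum(row) for row in batch]


layer = ProgramLayer(row_sums)
graph_cache: Dict[int, FakeGraph] = {}

# Warmup rounds: capture one graph per batch size seen.
layer.set_run_impl(make_capture_impl(row_sums, graph_cache))
for size in (1, 2, 2):
    layer([[1.0, 2.0]] * size)

# Steady state: look up the cache and replay.
layer.set_run_impl(make_replay_impl(row_sums, graph_cache))
replayed = layer([[1.0, 2.0], [3.0, 4.0]])
fallback = layer([[1.0]] * 3)
```

Keeping the swap behind a single setter means the layer itself needs no knowledge of warmup counts or cache policy; whatever drives the warmup decides when to switch from capture to replay.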

paddle-bot bot commented Jun 4, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@SigureMo SigureMo merged commit a519ae3 into develop Jun 12, 2025
48 of 51 checks passed
@SigureMo SigureMo deleted the dy2st/add-a-hook-to-replace-run-impl-of-partial-program-layer branch June 12, 2025 09:34

3 participants