Integrate MultiThreadedWorkQueue to execute program ops #35356
Conversation
Thanks for your contribution!
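One thought, though it does not need fixing right now: the multithreading seems to be wrapped in too many layers. Going from InterpreterCore to AsyncWorkQueue to WorkQueueGroup to SingleThread and MultiThread, it is already hard to see our design (a single thread runs LaunchKernel, multiple threads run CPU ops).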
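The design mentioned in this comment (one thread for kernel launches, several threads for CPU ops) can be sketched roughly as follows. This is only an illustration under stated assumptions: `SimpleWorkQueue`, `SimpleQueueGroup`, `OpKind`, and `RunDemo` are hypothetical names invented here, not Paddle's actual `WorkQueueGroup` or `AsyncWorkQueue` interfaces.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a work queue; NOT Paddle's real class.
class SimpleWorkQueue {
 public:
  explicit SimpleWorkQueue(size_t num_threads) {
    for (size_t i = 0; i < num_threads; ++i) {
      workers_.emplace_back([this] { Loop(); });
    }
  }
  // Drains all pending tasks, then joins the workers.
  ~SimpleWorkQueue() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      stop_ = true;
    }
    cv_.notify_all();
    for (auto& t : workers_) t.join();
  }
  void AddTask(std::function<void()> fn) {
    assert(fn);  // reject empty tasks
    {
      std::lock_guard<std::mutex> lock(mu_);
      tasks_.push(std::move(fn));
    }
    cv_.notify_one();
  }

 private:
  void Loop() {
    for (;;) {
      std::function<void()> fn;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // stop requested and nothing left
        fn = std::move(tasks_.front());
        tasks_.pop();
      }
      fn();  // run the task outside the lock
    }
  }
  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  std::vector<std::thread> workers_;
  bool stop_ = false;
};

enum class OpKind { kKernelLaunch, kCpuOp };  // hypothetical classification

// Assumed layout: one single-threaded queue plus one multi-threaded queue.
class SimpleQueueGroup {
 public:
  explicit SimpleQueueGroup(size_t cpu_threads)
      : launch_queue_(1), cpu_queue_(cpu_threads) {}
  void AddTask(OpKind kind, std::function<void()> fn) {
    if (kind == OpKind::kKernelLaunch) {
      launch_queue_.AddTask(std::move(fn));  // kept in submission order
    } else {
      cpu_queue_.AddTask(std::move(fn));  // may run concurrently
    }
  }

 private:
  SimpleWorkQueue launch_queue_;
  SimpleWorkQueue cpu_queue_;
};

struct DemoResult {
  std::vector<int> launch_order;
  int cpu_done;
};

// Submits n tasks of each kind and reports what ran.
DemoResult RunDemo(int n) {
  std::vector<int> launch_order;  // only touched by the single launch thread
  std::atomic<int> cpu_done{0};
  {
    SimpleQueueGroup group(4);
    for (int i = 0; i < n; ++i) {
      group.AddTask(OpKind::kKernelLaunch,
                    [&launch_order, i] { launch_order.push_back(i); });
      group.AddTask(OpKind::kCpuOp, [&cpu_done] { cpu_done.fetch_add(1); });
    }
  }  // destructors drain both queues and join the workers
  return DemoResult{launch_order, cpu_done.load()};
}
```

With the routing collapsed into a single class like this, the "one launch thread, many CPU threads" split is visible in one place, which is the readability concern the comment raises.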
InterpreterCoreGarbageCollector gc_;
std::vector<paddle::platform::DeviceEvent> gc_event_;
std::unique_ptr<WorkQueueGroup> group_thread_pool_;
Is group_thread_pool_ no longer needed?
I will change this in a follow-up PR.
LGTM
…e#35356)
* format code
* format interface
* polish interface
* Remove std::memory_order
* modify into SpinLock
* remove fetch_context_pool_
* fix comment
* modify into WorkQueueGroup
* refine code
* fix pointer
* fix paddle_enforce
* split into AsyncWorkQueue
* polish code
* specify std::memory_relax
* fix atomic fetch_sub
* fix num_thread
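The commit notes above mention an atomic `fetch_sub` and an explicit memory order. One common way such a counter is used when scheduling ops is to track how many unfinished predecessors each op still has. The sketch below shows that general pattern only. It is an assumption, not the code from this PR: `Node`, `ReleaseSuccessors`, and `RunDiamond` are hypothetical names, and the sketch uses `std::memory_order_acq_rel` as a conservative choice (whether a relaxed order is sufficient depends on how the ready task is handed to its worker).

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical op node: only records which ops depend on this one.
struct Node {
  std::vector<size_t> next_ops;
};

// After op `finished` completes, decrement each successor's pending-dependency
// counter and return the successors whose counter just reached zero.
std::vector<size_t> ReleaseSuccessors(const std::vector<Node>& graph,
                                      std::vector<std::atomic<int>>& deps,
                                      size_t finished) {
  std::vector<size_t> ready;
  for (size_t next : graph[finished].next_ops) {
    // fetch_sub returns the value held BEFORE the subtraction, so a return
    // value of 1 means this call removed the last outstanding dependency.
    if (deps[next].fetch_sub(1, std::memory_order_acq_rel) == 1) {
      ready.push_back(next);
    }
  }
  return ready;
}

// Runs a diamond graph 0 -> {1, 2} -> 3 on one thread and returns the
// order in which ops were executed.
std::vector<size_t> RunDiamond() {
  std::vector<Node> graph(4);
  graph[0].next_ops = {1, 2};
  graph[1].next_ops = {3};
  graph[2].next_ops = {3};

  std::vector<std::atomic<int>> deps(4);
  const int initial[] = {0, 1, 1, 2};  // number of predecessors per op
  for (size_t i = 0; i < 4; ++i) deps[i].store(initial[i]);

  std::vector<size_t> order;
  std::vector<size_t> worklist = {0};  // op 0 has no predecessors
  while (!worklist.empty()) {
    size_t op = worklist.back();
    worklist.pop_back();
    order.push_back(op);  // "execute" the op
    for (size_t r : ReleaseSuccessors(graph, deps, op)) worklist.push_back(r);
  }
  return order;
}
```

Comparing the returned previous value against 1 guarantees that exactly one finishing predecessor schedules each successor, even when several predecessors finish at the same time on different threads.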
PR types
New features
PR changes
Others
Describe
Integrate MultiThreadedWorkQueue to execute program ops
Judging from the profile results of the PTB LM model (machine 007, V100 16G):
[Figures: profile results; the surviving image labels read "(dependent components)"]
what's next?
NextInstruction.all_next_ops