[ARM]add armv7 fp16 op conv5x5s1 conv5x5s2 conv3x3s2_direct #8124
Thanks for your contribution!
#define INIT_FIRST \
"2:\n" \
"vld1.16 {d10-d13}, [%[wc0]]! @ load w0, w1\n" \
"vld1.16 {d14-d15}, [%[wc0]]! @ load w2\n" \
A few other instructions could be inserted between these two instructions to reduce conflicts.
#define INIT \
"2:\n" \
"vld1.16 {d10-d13}, [%[wc0]]! @ load w0, w1\n" \
"vld1.16 {d14-d15}, [%[wc0]]! @ load w2\n" \
Same as above.
"vld1.16 {d10-d13}, [%[wc0]]! @ load w0, w1\n" \
"vld1.16 {d14-d15}, [%[wc0]]! @ load w2\n" \
"vld1.16 {d16-d19}, [%[ptr_out0]]! @ load outr0\n" \
"vld1.16 {d20-d23}, [%[ptr_out0]] @ load outr0\n" \
Same as above.
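One possible reading of this suggestion, sketched for the four loads quoted just above: alternate the `wc0` and `ptr_out0` loads so that the two post-incrementing loads from the same base register are no longer back to back. The macro name `LOAD_W_OUT_INTERLEAVED` and the helper `gap_between_loads` are hypothetical and not part of the PR; the helper only inspects the instruction order in the string. Whether this ordering actually reduces stalls depends on the target core and is not measured here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical macro (not from the PR): the same four loads as in the diff,
// reordered so the wc0 and ptr_out0 loads alternate. Per-pointer order is
// unchanged, so each register still receives the same data.
#define LOAD_W_OUT_INTERLEAVED                          \
  "vld1.16 {d10-d13}, [%[wc0]]!      @ load w0, w1\n"   \
  "vld1.16 {d16-d19}, [%[ptr_out0]]! @ load outr0\n"    \
  "vld1.16 {d14-d15}, [%[wc0]]!      @ load w2\n"       \
  "vld1.16 {d20-d23}, [%[ptr_out0]]  @ load outr0\n"

// Hypothetical helper: counts how many instruction lines sit between the
// first two lines that address memory through the named operand `base`.
// Returns 0 if they are adjacent or if fewer than two such lines exist.
std::size_t gap_between_loads(const std::string& asm_text,
                              const std::string& base) {
  const std::string needle = "[%[" + base + "]]";
  std::size_t line_no = 0, first = 0, start = 0;
  bool seen = false;
  while (start < asm_text.size()) {
    std::size_t end = asm_text.find('\n', start);
    if (end == std::string::npos) end = asm_text.size();
    const std::string line = asm_text.substr(start, end - start);
    if (line.find(needle) != std::string::npos) {
      if (seen) return line_no - first - 1;
      seen = true;
      first = line_no;
    }
    ++line_no;
    start = end + 1;
  }
  return 0;
}
```

With this ordering, `gap_between_loads(LOAD_W_OUT_INTERLEAVED, "wc0")` reports one instruction between the two `wc0` loads, whereas the order in the diff has none. In the real kernel the string would sit inside the existing `asm volatile` block, and any independent instruction could serve as the filler.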
"vmla.f16 q14, q7, d4[2] @ w0 * inr24\n" \
"vmla.f16 q15, q7, d5[0] @ w0 * inr26\n" \
"vld1.16 {d10-d13}, [%[wc0]]! @ load w5, to q7\n" /* mul r1, with*/ \
"vld1.16 {d14-d15}, [%[wc0]]! @ load w5, to q7\n" /* mul r1, with*/ \
Same as above.
#define COMPUTE \
"vld1.16 {d24-d25}, [%[bias]] \n" /* load bias to out00 */ \
"vld1.16 {d0-d3}, [%[wc0]]! \n" /* load w0-w1 */ \
Same as above.
LGTM
3×3 s2 direct
5×5 depthwise