-
Notifications
You must be signed in to change notification settings - Fork 1.5k
x64: convert movlhps
#10898
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x64: convert movlhps
#10898
Conversation
;; movl %edx, %r10d | ||
;; movq 0x40(%rdi), %r11 | ||
;; vmovsd (%r11, %r10), %xmm7 | ||
;; vmovlhps %xmm7, %xmm0, %xmm0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note how we're gaining an extra vmovsd
here because we were probably doing some internal emit
-time matching to emit vmovhps
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like movhps
has the exact same encoding as movlhps
, which might be how this worked beforehand? Nevertheless from an assembler point of view I'd much rather match the Intel manual for sure.
Perhaps though XmmMem
could stay as an argument on some handwritten ISLE-helper and that delegates to movlhps
and movhps
as appropriate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the more I stare at the manual, it could be that they are truly the same instruction and we should make that second operand an xmm_m64
. I suspect that the names were different for some historical reason and we'll just have to paper over the mnemonic difference with #10918 once that is done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I ended up creating a separate instruction with the same encoding (avoids any customizations) and everything is fine. The only real user of vmovhps
is Winch, though. I looked through the five uses of x64_movlhps
in lower.isle
and, unless I'm missing some auto-conversion somewhere, it appears to me that we are fine using the movlhps
reg-to-reg form and don't really need movhps
in Cranelift.
@@ -143,8 +143,8 @@ | |||
;; movq %rsp, %rbp | |||
;; movl %edx, %r10d | |||
;; movq 0x40(%rdi), %r11 | |||
;; movdqu (%r11, %r10), %xmm6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The old version seems wrong... this should be a movsd
I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah definitely wrong, XmmMem
auto-converted to Xmm
always loads 128-bits, and this is just an incorrect translation and means that on-the-boundary-of-linear-memory loads of f32/f64 will be incorrectly identified as out-of-bounds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this, mind adding this test to this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added. It doesn't fail, though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Maybe it would have failed with the old version?)
This change continues the push to convert all `mov`-related instruction and converts `movlhps` to use the new assembler (at least the SSE variant). In doing so, it discovers that we were incorrectly allowing an `XmmMem` as one of the operands; this is fixed to be `Xmm` only.
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
Subscribe to Label Action
This issue or pull request has been labeled: "cranelift", "cranelift:area:x64", "winch"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
prtest:full
This change continues the push to convert all
mov
-related instruction and convertsmovlhps
to use the new assembler (at least the SSE variant). In doing so, it discovers that we were incorrectly allowing anXmmMem
as one of the operands; this is fixed to beXmm
only.