Closed
Labels
bug (some feature is broken)
Description
OpenSpades`spades::client::Corpse::Spring:
...
0x100620aa0 <+400>: mulss %xmm6, %xmm2
0x100620aa4 <+404>: mulss %xmm2, %xmm0
0x100620aa8 <+408>: mulss %xmm7, %xmm0
0x100620aac <+412>: movss (%rbx), %xmm2 ; xmm2 = mem[0],zero,zero,zero
0x100620ab0 <+416>: addss %xmm0, %xmm2
0x100620ab4 <+420>: movss %xmm2, (%rbx)
0x100620ab8 <+424>: movb (%rcx), %al
0x100620aba <+426>: testb %al, %al
0x100620abc <+428>: jne 0x100620e78 ; <+1384> [inlined] spades::Vector3::operator+=(spades::Vector3 const&) at Corpse.cpp:124
0x100620ac2 <+434>: movq %rbx, -0x38(%rbp)
0x100620ac6 <+438>: movb (%rdx), %al
0x100620ac8 <+440>: testb %al, %al
0x100620aca <+442>: jne 0x100620e95 ; <+1413> [inlined] spades::Vector3::operator+=(spades::Vector3 const&) + 29 at Corpse.cpp:124
0x100620ad0 <+448>: movaps %xmm5, %xmm3
-> 0x100620ad3 <+451>: divps %xmm4, %xmm3
0x100620ad6 <+454>: movaps %xmm0, %xmm2
0x100620ad9 <+457>: divss %xmm7, %xmm2
(lldb) po (float __attribute__((ext_vector_type(4)))) $xmm4
(0.162500009, 0.162500009, 0, 0)
(lldb) po (float __attribute__((ext_vector_type(4)))) $xmm3
(0.00000334322158, 0.000603287306, 0, 0)
(lldb) po/x $mxcsr
0x00001f21
Apparently, LLVM's SLP vectorizer assumes that all floating-point exceptions are masked (so FP instructions never trap), which is the default setting. This lets it use packed SIMD instructions even when there are fewer elements than the native SIMD width. That assumption does not hold here: the divps above operates on four lanes of which only two carry data, so the two unused lanes compute 0/0, and with mxcsr = 0x00001f21 the invalid-operation exception is unmasked.
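To illustrate why the padding lanes matter, here is a minimal scalar sketch of what a four-lane packed divide does with the register contents shown above. It is not the vectorized code from Corpse.cpp; the function name PackedDivideRaisesInvalid is hypothetical, and it only checks the IEEE invalid-operation flag via the standard <cfenv> interface instead of unmasking the exception.

```cpp
#include <cfenv>
#include <cstddef>

// Hypothetical helper: divides all four lanes, as a packed divide would,
// and reports whether any lane raised the invalid-operation flag.
bool PackedDivideRaisesInvalid(const float (&num)[4], const float (&den)[4]) {
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile float sink = 0.0f;  // volatile keeps the divisions at run time
    for (std::size_t i = 0; i < 4; ++i) {
        volatile float n = num[i];
        volatile float d = den[i];
        sink = n / d;  // lanes holding 0/0 raise FE_INVALID
    }
    (void)sink;
    return std::fetestexcept(FE_INVALID) != 0;
}
```

With the values of xmm3 and xmm4 from the session above, the two zero-filled lanes raise the invalid flag. If the exception is unmasked, as it is in the affected threads, the same operation traps instead of just setting the flag.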
Curiously, when I inspect the value of mxcsr in every thread, only the threads started by GlobalDispatchThreadPool have the _MM_MASK_INVALID (0x80) bit cleared.
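As a quick way to read an mxcsr value such as the one printed above, the following sketch lists which exception mask bits are cleared. The bit values match the _MM_MASK_* macros from <xmmintrin.h>; the helper name UnmaskedExceptions and the MaskBit struct are assumptions made for this example and are not part of OpenSpades.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// MXCSR exception-mask bits (same values as the _MM_MASK_* macros).
// A set bit means the exception is masked, i.e. it will not trap.
struct MaskBit {
    std::uint32_t bit;
    const char* name;
};

constexpr MaskBit kMaskBits[] = {
    {0x0080, "invalid"},
    {0x0100, "denormal"},
    {0x0200, "div-by-zero"},
    {0x0400, "overflow"},
    {0x0800, "underflow"},
    {0x1000, "inexact"},
};

// Hypothetical helper: returns the names of the exceptions whose mask bit
// is cleared in the given MXCSR value, i.e. the ones that can trap.
std::vector<std::string> UnmaskedExceptions(std::uint32_t mxcsr) {
    std::vector<std::string> result;
    for (const MaskBit& m : kMaskBits) {
        if ((mxcsr & m.bit) == 0) {
            result.emplace_back(m.name);
        }
    }
    return result;
}
```

For 0x00001f21 this returns only "invalid", whereas the default value 0x1f80 returns an empty list.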
Confirmed on macOS 10.14 Mojave