A rather simple pause-loop kernel is classified properly as Core Bound at levels 1 and 2, and again at levels 5 and 6, but not at the mid-levels 3 and 4 (Ports_Utilization and Ports_Utilized_0 in the output below).
I am documenting this here and will address it in the TMA 4.2 release.
Here is a reproducer with perf-tools.
P.S. @andikleen: This is a CFL machine (8th gen Core). Can that be reflected instead of [skl] in the first line of the toplev output?
$ ./kernels/gen-kernel.py -i pause > ./kernels/pause3x.c
$ gcc -g -O2 -o ./kernels/pause3x ./kernels/pause3x.c
$ ./pmu-tools/toplev.py --no-desc --no-perf --nodes '+CoreIPC,+UPI,+Time,+MUX' -vl6 -- ./kernels/pause3x 10000000 2>&1 | egrep -v ' [10]\.. '
# 4.11-full-perf on Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz [skl]
BE Backend_Bound % Slots 98.9
BE/Core Backend_Bound.Core_Bound % Slots 98.8 <==
FE Frontend_Bound.Fetch_Latency.MS_Switches % Clocks 2.6 <
BE/Core Backend_Bound.Core_Bound.Ports_Utilization % Clocks 9.9 <
BE/Core Backend_Bound.Core_Bound.Ports_Utilization.Ports_Utilized_0 % Clocks 8.5 <
BE/Core Backend_Bound.Core_Bound.Ports_Utilization.Ports_Utilized_0.Serializing_Operation % Clocks 98.9 <
BE/Core Backend_Bound.Core_Bound.Ports_Utilization.Ports_Utilized_0.Serializing_Operation.Slow_Pause % Clocks 91.3 <
Info.Thread UPI Metric 2.9
RET Retiring.Light_Operations.Other % Uops 100.0 <
MUX % 2.2
Using level 6.
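
The gap in the tree above can also be spotted mechanically. The following is only a rough sketch, not part of toplev or perf-tools: the helper names (`parse_rows`, `level`, `find_gaps`, `low_levels`) and the 50% threshold are assumptions made for illustration, and the rows are copied from the output above with the area column (`BE`, `BE/Core`) stripped. It derives a node's level from the number of dots in its name and reports parent/child pairs where the child is high while its parent is low. Note that the rows mix `% Slots` and `% Clocks`, so the comparison is indicative only.

```python
import re

# Backend_Bound rows copied from the toplev output above, area column removed.
ROWS = """\
Backend_Bound % Slots 98.9
Backend_Bound.Core_Bound % Slots 98.8
Backend_Bound.Core_Bound.Ports_Utilization % Clocks 9.9
Backend_Bound.Core_Bound.Ports_Utilization.Ports_Utilized_0 % Clocks 8.5
Backend_Bound.Core_Bound.Ports_Utilization.Ports_Utilized_0.Serializing_Operation % Clocks 98.9
Backend_Bound.Core_Bound.Ports_Utilization.Ports_Utilized_0.Serializing_Operation.Slow_Pause % Clocks 91.3
"""

# Hypothetical row format: "<node> % <unit> <value>".
ROW_RE = re.compile(r"^(?P<node>[A-Za-z0-9_.]+)\s+%\s+\w+\s+(?P<value>\d+(?:\.\d+)?)")


def parse_rows(text):
    """Map each node name to its reported percentage."""
    nodes = {}
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            nodes[match.group("node")] = float(match.group("value"))
    return nodes


def level(node):
    """Level in the hierarchy: one more than the number of dots in the name."""
    return node.count(".") + 1


def find_gaps(nodes, threshold=50.0):
    """Return (parent, child) pairs where the child is at or above the
    threshold but its parent is below it. The threshold is an assumption."""
    gaps = []
    for node, value in nodes.items():
        parent = node.rpartition(".")[0]
        if parent in nodes and value >= threshold and nodes[parent] < threshold:
            gaps.append((parent, node))
    return gaps


def low_levels(nodes, threshold=50.0):
    """Levels whose node falls below the threshold."""
    return sorted(level(node) for node, value in nodes.items() if value < threshold)


if __name__ == "__main__":
    parsed = parse_rows(ROWS)
    print("levels below threshold:", low_levels(parsed))
    for parent, child in find_gaps(parsed):
        print(f"level {level(parent)} -> level {level(child)}: "
              f"{parsed[parent]} -> {parsed[child]}")
```

On the rows above, the only flagged pair is Ports_Utilized_0 (level 4) and Serializing_Operation (level 5), and the levels below the threshold are 3 and 4.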
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-5
Off-line CPU(s) list: 6-11
Thread(s) per core: 1
Core(s) per socket: 6
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 158
Model name: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
Stepping: 10
CPU MHz: 3701.500
CPU max MHz: 3700.0000
CPU min MHz: 800.0000
BogoMIPS: 7399.70
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 12288K