-
-
Notifications
You must be signed in to change notification settings - Fork 768
Description
Given that the topic of optimizing binary size came up in #1660, I want to mention that I recently observed what seems to be another missed optimization opportunity.
Looking at the nRF52840-DK's panic handler code, a const
LED is chosen as a blinking signal, and therefore the corresponding GPIO is obtained from nrf52840::gpio::PORT
. The LED index is compile-time, so I was expecting the array indexing to be optimized away.
tock/boards/nordic/nrf52840dk/src/io.rs
Lines 89 to 102 in fbc863f
pub unsafe extern "C" fn panic_fmt(pi: &PanicInfo) -> ! { | |
// The nRF52840DK LEDs (see back of board) | |
const LED1_PIN: Pin = Pin::P0_13; | |
let led = &mut led::LedLow::new(&mut nrf52840::gpio::PORT[LED1_PIN]); | |
let writer = &mut WRITER; | |
debug::panic( | |
&mut [led], | |
writer, | |
pi, | |
&cortexm4::support::nop, | |
&PROCESSES, | |
&CHIP, | |
) | |
} |
However, the disassembled code is the following:
00007d4e <core::panicking::panic_fmt>:
7d4e: b580 push {r7, lr}
7d50: 466f mov r7, sp
7d52: b084 sub sp, #16
7d54: e9cd 0102 strd r0, r1, [sp, #8]
7d58: f646 1074 movw r0, #26996 ; 0x6974
7d5c: f2c0 0001 movt r0, #1
7d60: 9001 str r0, [sp, #4]
7d62: f24c 2060 movw r0, #49760 ; 0xc260
7d66: f2c0 0001 movt r0, #1
7d6a: 9000 str r0, [sp, #0]
7d6c: 4668 mov r0, sp
7d6e: f006 fe35 bl e9dc <rust_begin_unwind>
7d72: defe udf #254 ; 0xfe
0000e9dc <rust_begin_unwind>:
e9dc: b5f8 push {r3, r4, r5, r6, r7, lr}
e9de: af04 add r7, sp, #16
e9e0: 4604 mov r4, r0
e9e2: f006 fd3b bl 1545c <<nrf5x::gpio::Port as core::ops::index::IndexMut<nrf5x::gpio::Pin>>::index_mut>
e9e6: 4905 ldr r1, [pc, #20] ; (e9fc <rust_begin_unwind+0x20>)
e9e8: e9cd 0101 strd r0, r1, [sp, #4]
e9ec: a801 add r0, sp, #4
e9ee: 9003 str r0, [sp, #12]
e9f0: a803 add r0, sp, #12
e9f2: 4621 mov r1, r4
e9f4: f003 ff44 bl 12880 <kernel::debug::panic>
e9f8: defe udf #254 ; 0xfe
e9fa: bf00 nop
e9fc: 0001b5b0 .word 0x0001b5b0
In particular, I wasn't expecting to see the following instruction in rust_begin_unwind
: <<nrf5x::gpio::Port as core::ops::index::IndexMut<nrf5x::gpio::Pin>>::index_mut>
.
It turns out that the definition of PORT
is static mut
.
tock/chips/nrf52840/src/gpio.rs
Lines 54 to 56 in fbc863f
pub static mut PORT: Port = Port { | |
pins: unsafe { &mut PINS }, | |
}; |
Same goes for the PIN
array.
tock/chips/nrf52840/src/gpio.rs
Lines 3 to 52 in fbc863f
static mut PINS: [GPIOPin; 48] = [ | |
GPIOPin::new(Pin::P0_00), | |
GPIOPin::new(Pin::P0_01), | |
GPIOPin::new(Pin::P0_02), | |
GPIOPin::new(Pin::P0_03), | |
GPIOPin::new(Pin::P0_04), | |
GPIOPin::new(Pin::P0_05), | |
GPIOPin::new(Pin::P0_06), | |
GPIOPin::new(Pin::P0_07), | |
GPIOPin::new(Pin::P0_08), | |
GPIOPin::new(Pin::P0_09), | |
GPIOPin::new(Pin::P0_10), | |
GPIOPin::new(Pin::P0_11), | |
GPIOPin::new(Pin::P0_12), | |
GPIOPin::new(Pin::P0_13), | |
GPIOPin::new(Pin::P0_14), | |
GPIOPin::new(Pin::P0_15), | |
GPIOPin::new(Pin::P0_16), | |
GPIOPin::new(Pin::P0_17), | |
GPIOPin::new(Pin::P0_18), | |
GPIOPin::new(Pin::P0_19), | |
GPIOPin::new(Pin::P0_20), | |
GPIOPin::new(Pin::P0_21), | |
GPIOPin::new(Pin::P0_22), | |
GPIOPin::new(Pin::P0_23), | |
GPIOPin::new(Pin::P0_24), | |
GPIOPin::new(Pin::P0_25), | |
GPIOPin::new(Pin::P0_26), | |
GPIOPin::new(Pin::P0_27), | |
GPIOPin::new(Pin::P0_28), | |
GPIOPin::new(Pin::P0_29), | |
GPIOPin::new(Pin::P0_30), | |
GPIOPin::new(Pin::P0_31), | |
GPIOPin::new(Pin::P1_00), | |
GPIOPin::new(Pin::P1_01), | |
GPIOPin::new(Pin::P1_02), | |
GPIOPin::new(Pin::P1_03), | |
GPIOPin::new(Pin::P1_04), | |
GPIOPin::new(Pin::P1_05), | |
GPIOPin::new(Pin::P1_06), | |
GPIOPin::new(Pin::P1_07), | |
GPIOPin::new(Pin::P1_08), | |
GPIOPin::new(Pin::P1_09), | |
GPIOPin::new(Pin::P1_10), | |
GPIOPin::new(Pin::P1_11), | |
GPIOPin::new(Pin::P1_12), | |
GPIOPin::new(Pin::P1_13), | |
GPIOPin::new(Pin::P1_14), | |
GPIOPin::new(Pin::P1_15), | |
]; |
My guess is that even though the PORT.pins
field is never mutated after initialization, the Rust compiler assumes that it could be mutated by something else (because PORT
is static mutable), so the compiler cannot optimize PORT[13]
into PINS[13]
(because PORT.pins
may not point to PINS
anymore).
The call to &mut PORT[13]
is syntactic sugar for IndexMut
, which takes a &mut self
reference as input, and therefore PORT
is required to be mut
, even though in fact it is immutable (it's the PIN
array that is mutable).
Given the widespread use of static mut
throughout the code base, I'm wondering what's the impact on code size, but there seem to be missed optimization opportunities here.