Skip to content

Missed optimization opportunities with static mut data? #1662

@gendx

Description

@gendx

Given that the topic of optimizing binary size came up in #1660, I want to mention that I recently observed what seems to be another missed optimization opportunity.

Looking at the nRF52840-DK's panic handler code, a const LED is chosen as a blinking signal, and therefore the corresponding GPIO is obtained from nrf52840::gpio::PORT. The LED index is compile-time, so I was expecting the array indexing to be optimized away.

pub unsafe extern "C" fn panic_fmt(pi: &PanicInfo) -> ! {
// The nRF52840DK LEDs (see back of board)
const LED1_PIN: Pin = Pin::P0_13;
let led = &mut led::LedLow::new(&mut nrf52840::gpio::PORT[LED1_PIN]);
let writer = &mut WRITER;
debug::panic(
&mut [led],
writer,
pi,
&cortexm4::support::nop,
&PROCESSES,
&CHIP,
)
}

However, the disassembled code is the following:

00007d4e <core::panicking::panic_fmt>:
    7d4e:	b580      	push	{r7, lr}
    7d50:	466f      	mov	r7, sp
    7d52:	b084      	sub	sp, #16
    7d54:	e9cd 0102 	strd	r0, r1, [sp, #8]
    7d58:	f646 1074 	movw	r0, #26996	; 0x6974
    7d5c:	f2c0 0001 	movt	r0, #1
    7d60:	9001      	str	r0, [sp, #4]
    7d62:	f24c 2060 	movw	r0, #49760	; 0xc260
    7d66:	f2c0 0001 	movt	r0, #1
    7d6a:	9000      	str	r0, [sp, #0]
    7d6c:	4668      	mov	r0, sp
    7d6e:	f006 fe35 	bl	e9dc <rust_begin_unwind>
    7d72:	defe      	udf	#254	; 0xfe

0000e9dc <rust_begin_unwind>:
    e9dc:	b5f8      	push	{r3, r4, r5, r6, r7, lr}
    e9de:	af04      	add	r7, sp, #16
    e9e0:	4604      	mov	r4, r0
    e9e2:	f006 fd3b 	bl	1545c <<nrf5x::gpio::Port as core::ops::index::IndexMut<nrf5x::gpio::Pin>>::index_mut>
    e9e6:	4905      	ldr	r1, [pc, #20]	; (e9fc <rust_begin_unwind+0x20>)
    e9e8:	e9cd 0101 	strd	r0, r1, [sp, #4]
    e9ec:	a801      	add	r0, sp, #4
    e9ee:	9003      	str	r0, [sp, #12]
    e9f0:	a803      	add	r0, sp, #12
    e9f2:	4621      	mov	r1, r4
    e9f4:	f003 ff44 	bl	12880 <kernel::debug::panic>
    e9f8:	defe      	udf	#254	; 0xfe
    e9fa:	bf00      	nop
    e9fc:	0001b5b0 	.word	0x0001b5b0

In particular, I wasn't expecting to see the following instruction in rust_begin_unwind: <<nrf5x::gpio::Port as core::ops::index::IndexMut<nrf5x::gpio::Pin>>::index_mut>.

It turns out that the definition of PORT is static mut.

pub static mut PORT: Port = Port {
pins: unsafe { &mut PINS },
};

Same goes for the PIN array.

static mut PINS: [GPIOPin; 48] = [
GPIOPin::new(Pin::P0_00),
GPIOPin::new(Pin::P0_01),
GPIOPin::new(Pin::P0_02),
GPIOPin::new(Pin::P0_03),
GPIOPin::new(Pin::P0_04),
GPIOPin::new(Pin::P0_05),
GPIOPin::new(Pin::P0_06),
GPIOPin::new(Pin::P0_07),
GPIOPin::new(Pin::P0_08),
GPIOPin::new(Pin::P0_09),
GPIOPin::new(Pin::P0_10),
GPIOPin::new(Pin::P0_11),
GPIOPin::new(Pin::P0_12),
GPIOPin::new(Pin::P0_13),
GPIOPin::new(Pin::P0_14),
GPIOPin::new(Pin::P0_15),
GPIOPin::new(Pin::P0_16),
GPIOPin::new(Pin::P0_17),
GPIOPin::new(Pin::P0_18),
GPIOPin::new(Pin::P0_19),
GPIOPin::new(Pin::P0_20),
GPIOPin::new(Pin::P0_21),
GPIOPin::new(Pin::P0_22),
GPIOPin::new(Pin::P0_23),
GPIOPin::new(Pin::P0_24),
GPIOPin::new(Pin::P0_25),
GPIOPin::new(Pin::P0_26),
GPIOPin::new(Pin::P0_27),
GPIOPin::new(Pin::P0_28),
GPIOPin::new(Pin::P0_29),
GPIOPin::new(Pin::P0_30),
GPIOPin::new(Pin::P0_31),
GPIOPin::new(Pin::P1_00),
GPIOPin::new(Pin::P1_01),
GPIOPin::new(Pin::P1_02),
GPIOPin::new(Pin::P1_03),
GPIOPin::new(Pin::P1_04),
GPIOPin::new(Pin::P1_05),
GPIOPin::new(Pin::P1_06),
GPIOPin::new(Pin::P1_07),
GPIOPin::new(Pin::P1_08),
GPIOPin::new(Pin::P1_09),
GPIOPin::new(Pin::P1_10),
GPIOPin::new(Pin::P1_11),
GPIOPin::new(Pin::P1_12),
GPIOPin::new(Pin::P1_13),
GPIOPin::new(Pin::P1_14),
GPIOPin::new(Pin::P1_15),
];

My guess is that even though the PORT.pins field is never mutated after initialization, the Rust compiler assumes that it could be mutated by something else (because PORT is static mutable), so the compiler cannot optimize PORT[13] into PINS[13] (because PORT.pins may not point to PINS anymore).

The call to &mut PORT[13] is syntactic sugar for IndexMut, which takes a &mut self reference as input, and therefore PORT is required to be mut, even though in fact it is immutable (it's the PIN array that is mutable).

Given the widespread use of static mut throughout the code base, I'm wondering what's the impact on code size, but there seem to be missed optimization opportunities here.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions