Non-literal constant objects are not well optimized comparing to literal constant objects #118557

@EFanZh

This problem originated in rust-lang/log#599.

Sometimes, multiple function calls can have the same constant arguments:

pub fn test(f: fn(&[u32; 10])) {
    f(&[7; 10]);
    f(&[7; 10]);
    f(&[7; 10]);
    f(&[7; 10]);
}

The compiler recognizes that these arguments are the same value, so it emits a single constant object and passes its address to every call:

example::test:
        push    r14
        push    rbx
        push    rax
        mov     r14, rdi
        lea     rbx, [rip + .L__unnamed_1]
        mov     rdi, rbx
        call    r14
        mov     rdi, rbx
        call    r14
        mov     rdi, rbx
        call    r14
        mov     rdi, rbx
        mov     rax, r14
        add     rsp, 8
        pop     rbx
        pop     r14
        jmp     rax

.L__unnamed_1:
        .asciz  "\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000"
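
One plausible reading of the listing above is that `&[7; 10]` goes through rvalue static promotion, so the array lives in read-only static memory rather than on the stack. The sketch below only demonstrates that the literal form is promotable (it can be returned as a `'static` reference); the function name is hypothetical, and sharing one address across call sites is an optimization, not a language guarantee.

```rust
/// Returns a `'static` reference to an array built from literals only.
/// This compiles because `&[7; 10]` qualifies for rvalue static promotion:
/// the array is stored in static memory instead of a stack temporary.
pub fn promoted_sevens() -> &'static [u32; 10] {
    &[7; 10]
}
```

Calling `promoted_sevens()` yields a reference that outlives the function body, which would not be possible if the array were an ordinary stack temporary.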

But sometimes these constant arguments have to be computed by calling additional functions:

pub fn test(f: fn(&[u32; 10])) {
    f(&[std::convert::identity(7); 10]);
    f(&[std::convert::identity(7); 10]);
    f(&[std::convert::identity(7); 10]);
    f(&[std::convert::identity(7); 10]);
}

In that case the compiler does not optimize these constant objects as well as in the first example; it generates additional copy operations before every call:

.LCPI0_0:
        .long   7
        .long   7
        .long   7
        .long   7
example::test:
        push    r14
        push    rbx
        sub     rsp, 40
        mov     rbx, rdi
        movaps  xmm0, xmmword ptr [rip + .LCPI0_0]
        movaps  xmmword ptr [rsp], xmm0
        movaps  xmmword ptr [rsp + 16], xmm0
        movabs  r14, 30064771079
        mov     qword ptr [rsp + 32], r14
        mov     rdi, rsp
        call    rbx
        movaps  xmm0, xmmword ptr [rip + .LCPI0_0]
        movaps  xmmword ptr [rsp], xmm0
        movaps  xmmword ptr [rsp + 16], xmm0
        mov     qword ptr [rsp + 32], r14
        mov     rdi, rsp
        call    rbx
        movaps  xmm0, xmmword ptr [rip + .LCPI0_0]
        movaps  xmmword ptr [rsp], xmm0
        movaps  xmmword ptr [rsp + 16], xmm0
        mov     qword ptr [rsp + 32], r14
        mov     rdi, rsp
        call    rbx
        movaps  xmm0, xmmword ptr [rip + .LCPI0_0]
        movaps  xmmword ptr [rsp], xmm0
        movaps  xmmword ptr [rsp + 16], xmm0
        mov     qword ptr [rsp + 32], r14
        mov     rdi, rsp
        call    rbx
        add     rsp, 40
        pop     rbx
        pop     r14
        ret
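
To help read the listing above: each call rebuilds the 40-byte array on the stack with two 16-byte `movaps` stores plus one 8-byte store from `r14`. The `movabs` immediate is simply two `u32` sevens packed into one 64-bit value. The helper below is a hypothetical illustration of that arithmetic, not part of the compiler output.

```rust
/// Packs two `u32` values into one `u64`, with `lo` in the low 32 bits.
/// On a little-endian target, storing the result writes `lo` first in memory.
pub fn pack_pair(lo: u32, hi: u32) -> u64 {
    (u64::from(hi) << 32) | u64::from(lo)
}

/// Bytes written per call in the listing: two 16-byte stores and one 8-byte store.
pub fn bytes_written_per_call() -> usize {
    2 * 16 + 8
}
```

`pack_pair(7, 7)` equals the immediate `30064771079`, and `bytes_written_per_call()` matches the size of `[u32; 10]`.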

You can see the comparison here: https://godbolt.org/z/frj9a8TG6.

Using a const item as a proxy helps:

pub fn test(f: fn(&[u32; 10])) {
    const SEVEN: u32 = std::convert::identity(7);

    f(&[SEVEN; 10]);
    f(&[SEVEN; 10]);
    f(&[SEVEN; 10]);
    f(&[SEVEN; 10]);
}
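
A variation on the same idea, sketched under the assumption that the whole argument can be computed at compile time: hoist the entire array into a `static` item, so every call passes the address of one object. The name `SEVENS` is hypothetical and not taken from the issue.

```rust
use std::convert::identity;

// Hypothetical item: the whole array is evaluated at compile time,
// which works here because `identity` is a `const fn`.
static SEVENS: [u32; 10] = [identity(7); 10];

pub fn test(f: fn(&[u32; 10])) {
    // Every call receives the address of the same static object.
    f(&SEVENS);
    f(&SEVENS);
    f(&SEVENS);
    f(&SEVENS);
}
```

This has the same limitation as the `const` proxy: it only applies when every function involved can be evaluated in a const context.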

However, some functions, such as std::panic::Location::caller, can’t be used to compute a const value, so this workaround does not always apply.
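
As a rough illustration of the kind of argument that resists the `const` proxy, the sketch below builds the array from caller location data. The helper `caller_line` and the shape of `test` are hypothetical; the point is only that the element value depends on where the call appears, so a single shared `const` item cannot stand in for it.

```rust
use std::panic::Location;

/// Hypothetical helper: returns the source line of its caller.
#[track_caller]
pub fn caller_line() -> u32 {
    Location::caller().line()
}

/// Same shape as the earlier examples, but each array is filled with
/// a value derived from the call site rather than from a literal.
pub fn test(f: fn(&[u32; 10])) {
    f(&[caller_line(); 10]);
    f(&[caller_line(); 10]);
}
```

Each array is constant for its own call site, yet the two calls sit on different lines and therefore carry different values.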

Labels: C-optimization (Category: An issue highlighting optimization opportunities or PRs implementing such), T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
