Open
Labels
C-optimization (Category: An issue highlighting optimization opportunities or PRs implementing such)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
Description
This problem originated from rust-lang/log#599.
Sometimes, multiple function calls can have the same constant arguments:
pub fn test(f: fn(&[u32; 10])) {
    f(&[7; 10]);
    f(&[7; 10]);
    f(&[7; 10]);
    f(&[7; 10]);
}
The compiler recognizes that these arguments have the same value, so it creates a single constant and passes its address to each call:
example::test:
push r14
push rbx
push rax
mov r14, rdi
lea rbx, [rip + .L__unnamed_1]
mov rdi, rbx
call r14
mov rdi, rbx
call r14
mov rdi, rbx
call r14
mov rdi, rbx
mov rax, r14
add rsp, 8
pop rbx
pop r14
jmp rax
.L__unnamed_1:
.asciz "\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000"
But sometimes these constant arguments have to be computed through additional function calls:
pub fn test(f: fn(&[u32; 10])) {
    f(&[std::convert::identity(7); 10]);
    f(&[std::convert::identity(7); 10]);
    f(&[std::convert::identity(7); 10]);
    f(&[std::convert::identity(7); 10]);
}
Then the compiler can't optimize these constant objects as well as in the first example: the array is rebuilt on the stack before every call.
.LCPI0_0:
.long 7
.long 7
.long 7
.long 7
example::test:
push r14
push rbx
sub rsp, 40
mov rbx, rdi
movaps xmm0, xmmword ptr [rip + .LCPI0_0]
movaps xmmword ptr [rsp], xmm0
movaps xmmword ptr [rsp + 16], xmm0
movabs r14, 30064771079
mov qword ptr [rsp + 32], r14
mov rdi, rsp
call rbx
movaps xmm0, xmmword ptr [rip + .LCPI0_0]
movaps xmmword ptr [rsp], xmm0
movaps xmmword ptr [rsp + 16], xmm0
mov qword ptr [rsp + 32], r14
mov rdi, rsp
call rbx
movaps xmm0, xmmword ptr [rip + .LCPI0_0]
movaps xmmword ptr [rsp], xmm0
movaps xmmword ptr [rsp + 16], xmm0
mov qword ptr [rsp + 32], r14
mov rdi, rsp
call rbx
movaps xmm0, xmmword ptr [rip + .LCPI0_0]
movaps xmmword ptr [rsp], xmm0
movaps xmmword ptr [rsp + 16], xmm0
mov qword ptr [rsp + 32], r14
mov rdi, rsp
call rbx
add rsp, 40
pop rbx
pop r14
ret
You can see the comparison here: https://godbolt.org/z/frj9a8TG6.
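The difference is consistent with rvalue static promotion, although that is my reading and not something the generated code proves on its own. A minimal sketch, using a hypothetical function name `promoted`: a reference to a constant array expression can be returned as `&'static`, while the same expression with a function call in the operand is expected to be rejected, because calls are not promoted inside a runtime function body.

```rust
/// Hypothetical helper: returns a reference to a promoted constant array.
/// The `'static` return type compiles only because `&[7; 10]` is a
/// constant expression that the compiler places in static memory.
pub fn promoted() -> &'static [u32; 10] {
    &[7; 10]
}

// By contrast, the following is expected NOT to compile: the function
// call in the operand prevents promotion, so the array is a temporary
// that lives on the stack of `not_promoted`.
//
// pub fn not_promoted() -> &'static [u32; 10] {
//     &[std::convert::identity(7); 10]
// }
```

If this reading is right, the second version of `test` has to materialize a fresh temporary for each call, which matches the repeated stores to the stack shown above.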
Routing the value through a const item helps:
pub fn test(f: fn(&[u32; 10])) {
    const SEVEN: u32 = std::convert::identity(7);
    f(&[SEVEN; 10]);
    f(&[SEVEN; 10]);
    f(&[SEVEN; 10]);
    f(&[SEVEN; 10]);
}
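A variation of the same workaround, sketched under the assumption that the whole argument can be evaluated at compile time, hoists the entire array into a const item. The names `SEVENS`, `test_hoisted`, `CALLS`, and `record` are hypothetical and exist only to make the sketch checkable. I have not compared the assembly for this variant.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical const item: the whole array is evaluated at compile time.
const SEVENS: [u32; 10] = [std::convert::identity(7); 10];

/// Same shape as `test` above, but every call borrows the const item.
pub fn test_hoisted(f: fn(&[u32; 10])) {
    f(&SEVENS);
    f(&SEVENS);
    f(&SEVENS);
    f(&SEVENS);
}

/// Hypothetical call counter, used only to observe the calls.
pub static CALLS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical callback: checks the argument and counts the call.
pub fn record(arr: &[u32; 10]) {
    assert!(arr.iter().all(|&x| x == 7));
    CALLS.fetch_add(1, Ordering::SeqCst);
}
```

Calling `test_hoisted(record)` invokes the callback four times, each time with an array of ten sevens.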
But some functions, such as std::panic::Location::caller, can't be used to compute a const value, so the method above does not always work.
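For context, here is a small sketch of how `Location::caller` is typically used, assuming a hypothetical `#[track_caller]` helper named `here`. The location is supplied per call site through the `#[track_caller]` mechanism, which is one reason a single shared const item is not a drop-in replacement for it.

```rust
use std::panic::Location;

/// Hypothetical helper: reports the source location of its caller.
#[track_caller]
pub fn here() -> &'static Location<'static> {
    Location::caller()
}

/// Two call sites on different lines observe different locations.
pub fn two_sites() -> (u32, u32) {
    let first = here().line();
    let second = here().line();
    (first, second)
}
```

Each location is fixed for its own call site, so in principle it is still a constant from the optimizer's point of view, which is what this issue asks the compiler to exploit.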