Undefined behaviour in telfhash causes different results in big-endian machines. #1874

@plusvic

Describe the bug

The test case below produces a different result on big-endian machines. Part of the issue was in the ELF module itself, and that part was fixed in 32ae80d. However, after that fix I discovered another issue in the implementation of telfhash.

yara/tests/test-elf.c

Lines 263 to 270 in 32ae80d

assert_true_rule_file(
"import \"elf\" \
rule test { \
condition: \
elf.telfhash() == \
\"T174B012188204F00184540770331E0B111373086019509C464D0ACE88181266C09774FA\" \
}",
"tests/data/elf_with_imports");

The telfhash produced for that test case on big-endian machines is:

T174B021188204F00184540770331E0B111373086019509C464D0ACE88181266C09774FA

Notice that they are very similar, except for the third byte in the hash, where the nibbles are swapped:

T174B012..
T174B021..

The issue is related to this union:

union
{
  unsigned char qb;
  struct
  {
    unsigned char q1ratio : 4;
    unsigned char q2ratio : 4;
  } QR;
} Q;

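Apparently the order of the fields q1ratio and q2ratio depends on the machine's endianness, but those fields are later read through the union's qb member, so qb ends up with a different value on big-endian machines. This looks like telfhash is relying on undefined behaviour of the C compiler (strictly speaking, the C standard leaves the allocation order of bit-fields within a storage unit implementation-defined).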
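
One possible way to avoid depending on the bit-field layout is to pack the two nibbles with explicit shifts and masks. The sketch below is only an illustration, not the fix that was applied in YARA: the helper names (pack_q_ratios, unpack_q1ratio, unpack_q2ratio) are hypothetical, and placing q1ratio in the low nibble is an assumption meant to mirror the layout that compilers typically produce on little-endian targets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of telfhash or YARA): pack two 4-bit ratios
   into one byte using explicit shifts, so the result is the same regardless
   of how the compiler orders bit-fields.
   Assumption: q1ratio lives in the low nibble and q2ratio in the high one. */
static uint8_t pack_q_ratios(unsigned q1ratio, unsigned q2ratio)
{
  /* Both values are expected to fit in 4 bits. */
  assert(q1ratio <= 0x0Fu && q2ratio <= 0x0Fu);
  return (uint8_t) ((q2ratio << 4) | q1ratio);
}

/* Hypothetical accessors that recover each ratio from the packed byte. */
static unsigned unpack_q1ratio(uint8_t qb)
{
  return qb & 0x0Fu;
}

static unsigned unpack_q2ratio(uint8_t qb)
{
  return (qb >> 4) & 0x0Fu;
}
```

With helpers like these, the code that currently writes Q.QR.q1ratio and Q.QR.q2ratio and then reads Q.qb would call pack_q_ratios() instead, and the packed byte would no longer vary with the target's bit-field ordering.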

Please complete the following information:

  • OS: All; depends on machine endianness.
  • YARA version: 4.3.0