Performance & hash based whitelisting #592

@gwillem

What is the recommended way to implement hash-based whitelists in YARA? Some projects, such as php malware finder, do it like this:

import "hash"

global private rule Whitelist {
	condition:
		hash.sha1(0, filesize) != "c9cf738d8b1a8a77f6d200f327c5d4ec8201a99d" and
		hash.sha1(0, filesize) != "40a0a6e5ff86f75e6723e0008ddae29b1ed384c8" and
		[...]
}

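However, this seems to take O(n) time in the number of whitelisted hashes, while I would expect O(1).
As evidence, here are timings for a whitelist with a single SHA-1 hash and for one with 100 hashes: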

$ time yara -r whitelist-1-hash.yar magento-2.0
real	0m2.780s
user	0m4.056s
sys	0m2.344s

$ time yara -r whitelist-100-hashes.yar magento-2.0
real	0m38.553s
user	2m15.468s
sys	0m2.348s

So checking for 100 hashes takes roughly 33 times as much user CPU time as checking for 1 hash (2m15.468s versus 0m4.056s), and about 14 times as much wall-clock time. What is the best way to whitelist thousands of hashes?
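
For comparison, the O(1) behaviour I would expect can be sketched outside YARA: hash each file once and test membership in a set. This is only a rough illustration of the lookup I have in mind, not something YARA provides. The helper names `sha1_of_file` and `files_to_scan` are made up for this example, and the whitelist holds just the two hashes from the rule above.

```python
import hashlib
from pathlib import Path

# Hypothetical whitelist: the two SHA-1 digests from the rule above.
WHITELIST = frozenset({
    "c9cf738d8b1a8a77f6d200f327c5d4ec8201a99d",
    "40a0a6e5ff86f75e6723e0008ddae29b1ed384c8",
})


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of a file, reading it once in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_scan(root, whitelist=WHITELIST):
    """Yield files under root whose SHA-1 is not whitelisted.

    Each file is hashed exactly once; the set lookup takes constant
    time on average, regardless of how many hashes are whitelisted.
    """
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and sha1_of_file(path) not in whitelist:
            yield path


if __name__ == "__main__":
    import sys

    # Print the non-whitelisted files under the directory given as argument.
    for candidate in files_to_scan(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(candidate)
```

With this approach the cost per file is one hash computation plus one set lookup, so growing the whitelist from 100 to thousands of entries should not change the scan time noticeably. The printed paths could then be handed to yara, but that moves the whitelist out of the rule file, which is what I would like to avoid.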
