ripgrep doesn't match arbitrary bytes within a file #1339

@keysmashes

What version of ripgrep are you using?

ripgrep 11.0.1 (rev 1f1cd9b)
-SIMD -AVX (compiled)
+SIMD +AVX (runtime)

How did you install ripgrep?

GitHub deb, I think

What operating system are you using ripgrep on?

Ubuntu 18.04

Describe your question, feature request, or bug.

grep finds arbitrary bytes within a binary file, but ripgrep does not.

If this is a bug, what are the steps to reproduce the behavior?

$ grep $'\xa7' <(printf '\xa7')
Binary file /dev/fd/63 matches
$ echo $?
0
$ rg -uuu --text --binary '\xa7' <(printf '\xa7')
$ echo $?
1

If this is a bug, what is the actual behavior?

$ rg -uuu --text --binary --debug '\xa7' <(printf '\xa7')
DEBUG|grep_regex::literal|grep-regex/src/literal.rs:59: literal prefixes detected: Literals { lits: [Complete(§)], limit_size: 250, limit_class: 10 }
DEBUG|globset|globset/src/lib.rs:435: built glob set; 0 literals, 0 basenames, 11 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|globset/src/lib.rs:435: built glob set; 0 literals, 0 basenames, 11 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|globset/src/lib.rs:435: built glob set; 0 literals, 0 basenames, 11 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes

If this is a bug, what is the expected behavior?

Either ripgrep should be able to search for arbitrary bytes, or it should not print a message implying that it can do so:

$ rg $'\xa7'
found invalid UTF-8 in pattern at byte offset 0 (use hex escape sequences to match arbitrary bytes in a pattern, e.g., \xFF): '\xA7'
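
A possible reading of the debug line above: `Complete(§)` suggests the pattern `\xa7` was interpreted as the code point U+00A7, whose UTF-8 encoding is two bytes, rather than as the single byte 0xA7 that the input contains. The sketch below uses only `printf`, `od` and `tr` (octal escapes for portability; `hex` is a hypothetical helper name introduced here) to show that the two byte sequences differ.

```shell
# Hex-dump stdin as a compact string, e.g. "c2a7".
# `hex` is a helper defined only for this illustration.
hex() { od -An -tx1 | tr -d ' \n'; }

# \247 is octal for the single byte 0xA7 (what the searched input contains).
raw_byte=$(printf '\247' | hex)

# \302\247 is octal for 0xC2 0xA7, the UTF-8 encoding of U+00A7 (§).
utf8_section=$(printf '\302\247' | hex)

echo "input bytes:      $raw_byte"
echo "UTF-8 for U+00A7: $utf8_section"
```

If that reading is right, disabling Unicode mode for the escape, e.g. `rg --text '(?-u:\xa7)'`, should match the raw byte. This assumes the `(?-u:...)` flag syntax of Rust's regex crate is accepted here and has not been verified against this ripgrep version.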

Labels: doc (An issue with or an improvement to documentation.)
