sabreur is a command-line tool designed to demultiplex barcoded sequencing reads into separate files. It supports:
- FASTA and FASTQ formats
- Compressed inputs and outputs: gzip, bzip2, xz, and zstd
- Paired-end and single-end reads
It uses a barcode file to match reads and dispatches each to the corresponding output. Reads with unknown barcodes go into a separate file.
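The dispatch idea described above can be sketched in a few lines of Python. This is only an illustration, not sabreur's actual implementation: it assumes the barcode sits at the very start of each read, uses exact matching, and leaves the barcode in place. The function name `demultiplex` and the `unknown` bin label are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcodes, unknown="unknown"):
    """Group (name, sequence) reads by the barcode found at the start of the sequence.

    `barcodes` maps a barcode string to the output it should be sent to.
    Reads matching no barcode are collected under the `unknown` key.
    """
    bins = defaultdict(list)
    for name, seq in reads:
        for barcode, target in barcodes.items():
            # Assumption: the barcode is a prefix of the read.
            if seq.startswith(barcode):
                bins[target].append((name, seq))
                break
        else:
            # No barcode matched: send the read to the catch-all bin.
            bins[unknown].append((name, seq))
    return dict(bins)


barcodes = {"ACGT": "sample1.fq", "TTAG": "sample2.fq"}
reads = [("r1", "ACGTGGGA"), ("r2", "TTAGCCCA"), ("r3", "GGGGAAAA")]
bins = demultiplex(reads, barcodes)
```

Here `r1` and `r2` land in their samples' bins and `r3`, which carries neither barcode, ends up under `unknown`.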
Powered by niffler for seamless compression support.
sabreur barcode.txt input_R1.fq.gz input_R2.fq.gz
sabreur barcode.txt input.fq
sabreur automatically detects the format and compression. Just provide the inputs!
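For readers curious how this kind of detection can work, here is a minimal Python sketch that inspects the first bytes of a file. It is not taken from sabreur or niffler, whose internals are not described here; it only assumes the standard magic numbers of each compression format and the usual `>` (FASTA) and `@` (FASTQ) record markers. The names `sniff_compression` and `sniff_format` are hypothetical.

```python
import bz2
import gzip
import lzma

# Leading "magic" bytes defined by each compression format.
MAGIC = [
    (b"\x1f\x8b", "gz"),
    (b"BZh", "bz2"),
    (b"\xfd\x37\x7a\x58\x5a\x00", "xz"),
    (b"\x28\xb5\x2f\xfd", "zst"),
]


def sniff_compression(head: bytes) -> str:
    """Return the compression name guessed from the first bytes, or 'none'."""
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return "none"


def sniff_format(text: str) -> str:
    """Guess FASTA vs FASTQ from the first character of decompressed text."""
    if text.startswith(">"):
        return "fasta"
    if text.startswith("@"):
        return "fastq"
    return "unknown"


record = b"@read1\nACGT\n+\nIIII\n"
detected = [
    sniff_compression(gzip.compress(record)),
    sniff_compression(bz2.compress(record)),
    sniff_compression(lzma.compress(record)),  # lzma defaults to the xz container
    sniff_compression(record),
]
```

Only the first few bytes are needed, so the check is cheap even on very large inputs.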
USAGE:
sabreur [options] <BARCODE> <FORWARD FILE> [<REVERSE FILE>]
ARGS:
<BARCODE> input barcode file
<FORWARD> input forward fastx file
<REVERSE> input reverse fastx file
OPTIONS:
-m, --mismatch <INT> maximum number of mismatches [default: 0]
-o, --out <DIR> output directory [default: sabreur_out]
-f, --format <STR> output files compression format
-l, --level <INT> compression level [default: 1]
--force force reuse of output directory
-q, --quiet decrease program verbosity
-h, --help Print help information
-V, --version Print version information
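To illustrate what a mismatch budget such as `--mismatch` might mean in practice, the sketch below compares the start of a read with each barcode using Hamming distance and accepts the closest barcode within the allowed number of mismatches. This is an assumption about the general technique, not a description of sabreur's matching code; `hamming` and `match_barcode` are hypothetical names, and ties are resolved in favor of the first barcode listed.

```python
def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def match_barcode(seq, barcodes, max_mismatch=0):
    """Return the barcode closest to the start of `seq`, or None if none is close enough."""
    best = None
    best_dist = max_mismatch + 1  # anything at or above this is rejected
    for barcode in barcodes:
        prefix = seq[: len(barcode)]
        if len(prefix) < len(barcode):
            continue  # read is shorter than the barcode
        dist = hamming(prefix, barcode)
        if dist < best_dist:
            best, best_dist = barcode, dist
    return best


barcodes = ["ACGT", "TTAG"]
exact = match_barcode("ACGTGGG", barcodes)
strict = match_barcode("ACGAGGG", barcodes, max_mismatch=0)
relaxed = match_barcode("ACGAGGG", barcodes, max_mismatch=1)
```

With a budget of 0 the read starting with `ACGA` matches nothing, while a budget of 1 assigns it to `ACGT`.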
- Rust in stable channel
- libgz for gz file support
- liblzma for xz file support
- libbzip2 for bzip2 file support
- zstd for zstd file support
git clone https://github.com/Ebedthan/sabreur.git
cd sabreur
cargo install --path . --root ~/.cargo
sabreur --help
Download binaries for your platform from the releases page:
- macOS (Apple Silicon): Download • Checksum
- macOS (Intel): Download • Checksum
- Linux (x86_64): Download • Checksum
- Windows (x86_64): Download • Checksum
Benchmarked with hyperfine on a test dataset. Values are reported as mean ± standard deviation.
Tool | Single-end uncompressed output | Single-end compressed output | Paired-end uncompressed output | Paired-end compressed output |
---|---|---|---|---|
idemp | - | 211.571 ± 3.718 | - | 366.247 ± 10.482 |
sabre | 32.911 ± 2.411 | - | 109.470 ± 49.909 | - |
sabreur | 10.843 ± 0.531 | 93.840 ± 0.446 | 40.878 ± 13.743 | 187.533 ± 0.572 |
A simple benchmark of the different compression formats (sabreur tests/bc_pe_fq.txt tests/input_R1.fastq.gz tests/input_R2.fastq.gz), with zst being the fastest.
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
--format zst | 43.096 ± 1.547 | 41.179 | 46.878 | 1.00 |
--format bz2 | 94.049 ± 4.762 | 87.984 | 101.140 | 2.18 ± 0.14 |
--format gz | 123.107 ± 1.748 | 120.529 | 125.166 | 2.86 ± 0.11 |
--format xz | 285.692 ± 18.625 | 264.960 | 325.750 | 6.63 ± 0.49 |
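The Relative column can be reproduced from the mean times: each mean is divided by the fastest mean (zst). A short Python check, using only the means reported in the table:

```python
# Mean run times in seconds, copied from the table above.
means = {"zst": 43.096, "bz2": 94.049, "gz": 123.107, "xz": 285.692}

fastest = min(means.values())

# Ratio of each format's mean to the fastest mean, rounded to two decimals.
relative = {fmt: round(mean / fastest, 2) for fmt, mean in means.items()}
```

The resulting ratios agree with the table: writing xz output took roughly 6.6 times as long as zst in this benchmark.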
The barcode file must be tab-delimited in the format:
barcode1 barcode1_file1.fq barcode1_file2.fq
barcode2 barcode2_file1.fq barcode2_file2.fq
...
Output filenames must be unique. You can use .fq, .fastq, .fa, or .fasta as extensions.
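Before running a large job it can help to check a barcode file against these rules. The Python sketch below is a hypothetical pre-flight validator, not part of sabreur: it assumes one barcode followed by one output filename (single-end, an assumption since only the paired-end layout is shown above) or two (paired-end) per tab-separated line, and it enforces unique filenames with one of the listed extensions.

```python
import io

ALLOWED_EXTENSIONS = (".fq", ".fastq", ".fa", ".fasta")


def parse_barcode_file(handle):
    """Parse a tab-delimited barcode file into {barcode: [output filenames]}."""
    seen = set()
    table = {}
    for lineno, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        barcode, *outputs = line.split("\t")
        if not 1 <= len(outputs) <= 2:
            raise ValueError(f"line {lineno}: expected 1 or 2 output files")
        for name in outputs:
            if not name.endswith(ALLOWED_EXTENSIONS):
                raise ValueError(f"line {lineno}: unsupported extension in {name!r}")
            if name in seen:
                raise ValueError(f"line {lineno}: duplicate output file {name!r}")
            seen.add(name)
        table[barcode] = outputs
    return table


example = io.StringIO(
    "ACGT\tbarcode1_file1.fq\tbarcode1_file2.fq\n"
    "TTAG\tbarcode2_file1.fq\tbarcode2_file2.fq\n"
)
table = parse_barcode_file(example)
```

A file that reuses an output filename, or uses an extension outside the list, raises a `ValueError` naming the offending line.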
sabreur's minimum supported Rust version is 1.78.0.
- Contributions are welcome under the Contributor Code of Conduct.
- Please open an issue or pull request on GitHub.
Found a bug or have a feature request? → Open an issue.
This project is licensed under the MIT License.