missing ".with_cigar()" in Aligner? #19

@wchengt

Hello! Thank you for creating chopper. While trying to remove DCS reads from my FASTQ files, I noticed that a good portion of the contaminating reads still remain. Here is an example of one such read, BLASTed against the DCS sequence.
[Image: contam_read]

(query) bad read: 3,800 bp (90% = 3,420 bp)
(target) DCS: 3,560 bp

Chopper retained these reads, so I manually ran minimap2 -ax map-ont DCS.fasta read.fq to check the alignment results. My "match_len" was 3,510 bp. Please correct me if I'm misinterpreting the filter function, but I assume that because 3,510 bp > 3,420 bp, the read should be classified as a contaminant.

Alternatively, if I run minimap2 -x map-ont DCS.fasta read.fq, my "match_len" is 3,268 bp. Because that is not greater than 3,420 bp, the read would be retained. Could chopper be reporting inaccurate lengths because the Aligner setup in lines 178-184 is missing ".with_cigar()"? See lh3/minimap2#158.

fn setup_contamination_filter(contam_fasta: &str) -> Aligner {
    Aligner::builder()
        .with_threads(8)
        .map_ont()
        .with_index(contam_fasta, None)
        .expect("Unable to build index")
}
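
The comparison described above can be sketched roughly as follows. This is not chopper's actual filter code. It only restates the arithmetic from this issue, assuming the filter flags a read when the reported match length exceeds 90% of the read length. The function names `contamination_threshold` and `is_contaminant` and the integer `percent` parameter are hypothetical, introduced purely for illustration.

```rust
/// Hypothetical helper: the match length a read must exceed to be flagged,
/// computed as `percent` of the read length using integer arithmetic.
/// The 90% figure used in this issue is an assumption about the filter.
fn contamination_threshold(read_len: usize, percent: usize) -> usize {
    read_len * percent / 100
}

/// Hypothetical check: a read is treated as a contaminant when the
/// reported match length is strictly greater than the threshold.
fn is_contaminant(match_len: usize, read_len: usize, percent: usize) -> bool {
    match_len > contamination_threshold(read_len, percent)
}
```

Under these assumptions, the 3,800 bp read has a threshold of 3,420 bp. A match length of 3,510 bp would flag the read, while 3,268 bp would let it through, which matches the behaviour described in this issue.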
