No found reference sequences when headers look the same #86

@shafferm

Hello @wwood,

I am using coverm (version 0.6.1 from bioconda) to get average coverage for some metagenome assemblies that I am depositing to NCBI. The command I am running is `coverm genome -v -m mean -t 15 --bam-files /home/projects-wrighton/NIH_Salmonella/Salmonella/Metagenomes/Megahit/LM_megahit/LM.mapped.sorted.bam -f /home/projects-wrighton/NIH_Salmonella/Salmonella/Metagenomes/Megahit/LM_megahit/final.contigs.2500.fa`. When I run it, I get this error:

[2021-08-30T21:24:43Z INFO  coverm::genome] Of 18649 reference IDs, 0 were assigned to a genome and 18649 were not
[2021-08-30T21:24:43Z ERROR coverm::genome] Error: There are no found reference sequences that are a part of a genome

The mappings were generated against this reference using bbmap. To dig into this, I extracted the headers from the FASTA file using grep (uploaded here: LM_headers.txt) and the reference names from the BAM file using idxstats (uploaded here: LM.mapped.sorted.idxstats.txt). When I compare the two, the headers appear to match. Am I missing something here? Is something in the headers breaking the matching?
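For anyone who wants to repeat the header comparison, here is a rough sketch of how it could be scripted. It assumes the FASTA headers are the `>` lines as printed by grep and that the idxstats file is the usual tab-separated `samtools idxstats` output (reference name in the first column, with a final `*` row for unplaced reads). The function names and the sample names in the usage note are made up for illustration. Comparing on the first whitespace-delimited token is only a guess at one way the names could differ, not a confirmed cause of the error.

```python
from typing import Dict, Iterable, Set


def fasta_names(lines: Iterable[str]) -> Set[str]:
    """Full header text (minus the leading '>') for each FASTA header line."""
    return {line[1:].rstrip("\r\n") for line in lines if line.startswith(">")}


def idxstats_names(lines: Iterable[str]) -> Set[str]:
    """Reference names from the first tab-separated column of idxstats output."""
    names = set()
    for line in lines:
        name = line.rstrip("\r\n").split("\t")[0]
        if name and name != "*":  # '*' is the row for unplaced reads
            names.add(name)
    return names


def first_token(name: str) -> str:
    """Text up to the first whitespace, or the name itself if it is blank."""
    parts = name.split()
    return parts[0] if parts else name


def compare(fasta: Set[str], bam: Set[str]) -> Dict[str, object]:
    """Count exact matches and matches on the first token only."""
    fasta_tokens = {first_token(n) for n in fasta}
    bam_tokens = {first_token(n) for n in bam}
    return {
        "exact": len(fasta & bam),
        "first_token": len(fasta_tokens & bam_tokens),
        "fasta_only": sorted(fasta - bam)[:5],  # a few examples to eyeball
        "bam_only": sorted(bam - fasta)[:5],
    }
```

To use it, pass open file handles, e.g. `compare(fasta_names(open("LM_headers.txt")), idxstats_names(open("LM.mapped.sorted.idxstats.txt")))`. If `exact` is 0 while `first_token` is non-zero, the two sets of names differ only in what follows the first space, which is easy to miss when comparing by eye.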

Thanks,
Mike
