distinct minimizer count integer overflow with large reference fasta #1192

@cjprybol

Hi @lh3,

I'm using slices of NCBI's NT database for PacBio HiFi read mapping. I noticed that the distinct minimizer count reported on stderr appears to have overflowed its integer type. Would you expect this to affect the trustworthiness of the mapping results, or is it just a cosmetic issue in the stderr reporting?

In a prior mapping run with nt_others, the number of minimizers is positive and the percentages are below 100:

[M::mm_idx_stat::128.816*2.87] distinct minimizers: 232724554 (61.91% are singletons); average occurrences: 2.649; average spacing: 10.138; total length: 6250598365

When mapping against nt_prok, the number of minimizers is negative:

[M::mm_idx_stat::4819.440*3.47] distinct minimizers: -814309831 (-245.31% are singletons); average occurrences: -30.207; average spacing: 10.032; total length: 246767835107
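
As a rough sanity check on the nt_prok line, the sketch below assumes the distinct minimizer counter is a signed 32-bit integer that wrapped around exactly once; this has not been confirmed against the minimap2 source. It also assumes the singleton count and the total occurrence count did not overflow themselves. The helper name `unwrap_int32` is hypothetical and only for illustration.

```python
INT32_MODULUS = 2**32  # assumption: the counter is 32 bits wide


def unwrap_int32(value: int) -> int:
    """Map a value printed as signed 32-bit back into [0, 2**32).

    Only valid if the true count is below 2**32 (a single wraparound).
    """
    return value % INT32_MODULUS


# Values copied from the nt_prok stderr line.
reported_distinct = -814309831
reported_singleton_pct = -245.31
reported_avg_occ = -30.207

# Reinterpret the distinct-minimizer count as unsigned.
corrected_distinct = unwrap_int32(reported_distinct)

# The printed ratios were presumably divided by the wrapped (negative) count,
# so multiplying back recovers approximate numerators.
approx_singletons = reported_singleton_pct / 100 * reported_distinct
approx_total_occ = reported_avg_occ * reported_distinct

corrected_singleton_pct = 100 * approx_singletons / corrected_distinct
corrected_avg_occ = approx_total_occ / corrected_distinct

print(f"distinct minimizers: {corrected_distinct}")
print(f"singletons: {corrected_singleton_pct:.2f}%")
print(f"average occurrences: {corrected_avg_occ:.3f}")
```

Under those assumptions the distinct count comes out as 3480657465, with a singleton fraction near 57% and average occurrences near 7, which is at least in a plausible range next to the nt_others run. This only suggests the printed statistics are affected; it says nothing about whether the mapping itself is.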

I believe I am using the latest release: [M::main] Version: 2.28-r1209

Thank you,
Cameron
