some cDBG links seem not to be bidirectional #34

@ctb

in spacegraphcats, we are consuming the output of your excellent tool and doing (or attempting to do) Clever Graph Things with it. (thank you for bcalm!) however, we are running into a problem: in some cases there are nodes u and v where u has an edge to v but v does not have an edge to u.

more specifically, we see a number of examples like the following two headers in the unitigs output:

>206849 LN:i:41 KC:i:96 km:f:8.7  L:-:55026:- L:-:69342:-  L:+:103808:- L:+:203510:-
...
>69342 LN:i:134 KC:i:2260 km:f:21.7  L:+:69614:- L:+:155497:-  L:-:61581:+ L:-:167785:+
...

which show that 69342 is in the edge list of 206849, but 206849 is not in the edge list of 69342.
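For readers who want to reproduce the check without the full pipeline, here is a minimal sketch of the kind of test described above. It is not the spacegraphcats code; the function names `parse_links` and `find_one_way_links` are hypothetical. It assumes each `L:` field has the form `L:<sign>:<unitig id>:<sign>`, as in the headers shown, and it only checks whether the ID appears in the other unitig's link list, ignoring orientation signs.

```python
import re

# Matches link fields such as "L:-:69342:-"; group 2 is the target unitig id.
LINK_RE = re.compile(r"L:([+-]):(\d+):([+-])")


def parse_links(lines):
    """Map each unitig id to the set of unitig ids named in its L: fields."""
    neighbors = {}
    for line in lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        unitig_id = int(line[1:].split()[0])
        neighbors[unitig_id] = {int(m.group(2)) for m in LINK_RE.finditer(line)}
    return neighbors


def find_one_way_links(neighbors):
    """Return sorted (u, v) pairs where u lists v but v does not list u.

    Targets whose header was not seen are skipped, since nothing can be
    concluded about them.
    """
    one_way = []
    for u, targets in neighbors.items():
        for v in targets:
            if v in neighbors and u not in neighbors[v]:
                one_way.append((u, v))
    return sorted(one_way)


# The two headers quoted in this issue.
headers = [
    ">206849 LN:i:41 KC:i:96 km:f:8.7  L:-:55026:- L:-:69342:-  L:+:103808:- L:+:203510:-",
    ">69342 LN:i:134 KC:i:2260 km:f:21.7  L:+:69614:- L:+:155497:-  L:-:61581:+ L:-:167785:+",
]

neighbors = parse_links(headers)
one_way = find_one_way_links(neighbors)
print(one_way)
```

On the two headers above this reports the pair (206849, 69342). To run it on a whole unitigs file, pass an open file handle to `parse_links` instead of the `headers` list.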

We have code to detect this here,

spacegraphcats/spacegraphcats@dc9a9cc

and I am happy to provide full execution instructions should you wish (should be: check out that branch; pip install -r requirements.txt; make twofoo.fq.gz; conf/run twofoo build)

your thoughts appreciated!


output from latest bcalm master is here: https://osf.io/zgx3r/ (generated with bcalm built from commit c41f70f on my mac os x laptop)

input files are generated from the rules here,

briefly

curl -o akker-reads.abundtrim.gz -L https://osf.io/dk7nb/download
curl -L 'https://osf.io/7az9p/?action=download' > shew-reads.abundtrim.gz
gunzip -c shew-reads.abundtrim.gz akker-reads.abundtrim.gz | gzip -9c > twofoo.fq.gz

and then the command line execution is:

bcalm -in bcalm.twofoo.k31.inputlist.txt -out twofoo/bcalm.twofoo.k31 -kmer-size 31 -abundance-min 1
