Skip to content

bioconda install of bcalm produces incorrect cDBG on linux? #37

@ctb

Description

@ctb

In #34, I reported that in not-very-reproducible circumstances on my HPC, when I compiled bcalm from source, it produced bad cDBG files. I ended up automating the heck out of the check and being unable to reproduce it so I gave up. But I cleverly never removed the validation code from the spacegraphcats application :)

Fast forward to today, and I found the same problem as in #34 when I installed bcalm using bioconda! Hopefully this time it will be more reproducible. I reproduce the bug on both my HPC and Amazon's Linux box, which kind of makes sense because bioconda provides a binary install so I would hope that it would fail the same way on both :).

To reproduce using spacegraphcats

this is running on Amazon Linux 2 AMI (HVM), SSD Volume Type - ami-01bbe152bf19d0289, region us-west-2. I created a t2-large with 20 GB of disk space, logged in as ec2-user, and then ran the following commands (some require answering interactive prompts, sorry):

# install C compiler
sudo yum groupinstall "Development Tools"

# install conda
curl -L -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh 
source ~/.bashrc
. /home/ec2-user/miniconda3/etc/profile.d/conda.sh

# add bioconda
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

# create a conda environment called 'sgc' with python 3.6 and baclm installed.
conda create -n sgc python==3.6 bcalm

Now, activate and try running the spacegraphcats example:

conda activate sgc

# install prereqs for spacegraphcats
pip install Cython
pip install https://github.com/dib-lab/pybbhash/archive/spacegraphcats.zip
pip install https://github.com/dib-lab/khmer/archive/master.zip
pip install git+https://github.com/dib-lab/sourmash@master#egg=sourmash

# install spacegraphcats
pip install spacegraphcats

# grab example repo
git clone https://github.com/spacegraphcats/spacegraphcats-twofoo-example.git
cd spacegraphcats-twofoo-example/

# run example workflow
snakemake

this fails for me after a few minutes of running with:

/home/ec2-user/miniconda3/envs/sgc/bin/python -m spacegraphcats.cdbg.bcalm_to_gxt  -k 31 twofoo/bcalm.twofoo.k31.unitigs.fa twofoo_k31_r1/cdbg.gxt twofoo_k31_r1/contigs.fa.gz
reading unitigs from twofoo/bcalm.twofoo.k31.unitigs.fa
...read 206855 unitigs
577 -> 181159, but not 181159 -> 577
576 -> 81582, but not 81582 -> 576
4675 -> 101520, but not 101520 -> 4675
4675 -> 104265, but not 104265 -> 4675
...
204785 -> 12508, but not 12508 -> 204785
Traceback (most recent call last):
  File "/home/ec2-user/miniconda3/envs/sgc/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/ec2-user/miniconda3/envs/sgc/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/ec2-user/miniconda3/envs/sgc/lib/python3.6/site-packages/spacegraphcats/cdbg/bcalm_to_gxt.py", line 375, in <module>
    main(sys.argv[1:])
  File "/home/ec2-user/miniconda3/envs/sgc/lib/python3.6/site-packages/spacegraphcats/cdbg/bcalm_to_gxt.py", line 257, in main
    neighbors, sequences, mean_abunds, sizes = read_bcalm(unitigs, debug, k)
  File "/home/ec2-user/miniconda3/envs/sgc/lib/python3.6/site-packages/spacegraphcats/cdbg/bcalm_to_gxt.py", line 214, in read_bcalm
    assert not fail
AssertionError
[Sun Dec 23 20:23:27 2018]
Error in rule bcalm_catlas_input:
...

Here's what I have installed for bcalm:

% conda list | grep bcalm
bcalm                     2.2.0                hd28b015_3    bioconda

cc @natir viz #32.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions