In #34, I reported that in not-very-reproducible circumstances on my HPC, when I compiled bcalm from source, it produced bad cDBG files. I ended up automating the heck out of the check, was unable to reproduce the problem, and gave up. But I cleverly never removed the validation code from the spacegraphcats application :)
Fast forward to today, and I found the same problem as in #34 when I installed bcalm using bioconda! Hopefully this time it will be more reproducible. I can reproduce the bug on both my HPC and an Amazon Linux box, which kind of makes sense: bioconda provides a binary install, so I would hope that it fails the same way on both :).
To reproduce using spacegraphcats
This is running on Amazon Linux 2 AMI (HVM), SSD Volume Type - ami-01bbe152bf19d0289, region us-west-2. I created a t2.large with 20 GB of disk space, logged in as ec2-user, and then ran the following commands (some require answering interactive prompts, sorry):
# install C compiler
sudo yum groupinstall "Development Tools"
# install conda
curl -L -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
source ~/.bashrc
. /home/ec2-user/miniconda3/etc/profile.d/conda.sh
# add bioconda
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
# create a conda environment called 'sgc' with python 3.6 and bcalm installed.
conda create -n sgc python==3.6 bcalm
Now, activate and try running the spacegraphcats example:
conda activate sgc
# install prereqs for spacegraphcats
pip install Cython
pip install https://github.com/dib-lab/pybbhash/archive/spacegraphcats.zip
pip install https://github.com/dib-lab/khmer/archive/master.zip
pip install git+https://github.com/dib-lab/sourmash@master#egg=sourmash
# install spacegraphcats
pip install spacegraphcats
# grab example repo
git clone https://github.com/spacegraphcats/spacegraphcats-twofoo-example.git
cd spacegraphcats-twofoo-example/
# run example workflow
snakemake
This fails for me after a few minutes of running with:
/home/ec2-user/miniconda3/envs/sgc/bin/python -m spacegraphcats.cdbg.bcalm_to_gxt -k 31 twofoo/bcalm.twofoo.k31.unitigs.fa twofoo_k31_r1/cdbg.gxt twofoo_k31_r1/contigs.fa.gz
reading unitigs from twofoo/bcalm.twofoo.k31.unitigs.fa
...read 206855 unitigs
577 -> 181159, but not 181159 -> 577
576 -> 81582, but not 81582 -> 576
4675 -> 101520, but not 101520 -> 4675
4675 -> 104265, but not 104265 -> 4675
...
204785 -> 12508, but not 12508 -> 204785
Traceback (most recent call last):
File "/home/ec2-user/miniconda3/envs/sgc/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/ec2-user/miniconda3/envs/sgc/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/ec2-user/miniconda3/envs/sgc/lib/python3.6/site-packages/spacegraphcats/cdbg/bcalm_to_gxt.py", line 375, in <module>
main(sys.argv[1:])
File "/home/ec2-user/miniconda3/envs/sgc/lib/python3.6/site-packages/spacegraphcats/cdbg/bcalm_to_gxt.py", line 257, in main
neighbors, sequences, mean_abunds, sizes = read_bcalm(unitigs, debug, k)
File "/home/ec2-user/miniconda3/envs/sgc/lib/python3.6/site-packages/spacegraphcats/cdbg/bcalm_to_gxt.py", line 214, in read_bcalm
assert not fail
AssertionError
[Sun Dec 23 20:23:27 2018]
Error in rule bcalm_catlas_input:
...
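For anyone who wants to check a unitigs file without installing spacegraphcats, here is a minimal sketch of the kind of symmetry check that produces the "X -> Y, but not Y -> X" messages above. It is not the actual `bcalm_to_gxt` code. The function names `parse_links` and `find_asymmetric_links` are hypothetical, and the sketch assumes that each unitig header carries its links as `L:<orientation>:<neighbor id>:<orientation>` fields, which is my understanding of the BCALM 2 output format. The example headers are made up for illustration.

```python
from collections import defaultdict


def parse_links(header):
    """Return (unitig_id, neighbor_ids) parsed from one FASTA header line.

    Assumes a header shaped like:
        >0 LN:i:40 KC:i:10 km:f:1.0 L:+:1:+ L:-:7:-
    """
    fields = header.lstrip(">").split()
    unitig_id = int(fields[0])
    neighbors = set()
    for field in fields[1:]:
        if field.startswith("L:"):
            # L:<own orientation>:<neighbor id>:<neighbor orientation>
            _, _, other, _ = field.split(":")
            neighbors.add(int(other))
    return unitig_id, neighbors


def find_asymmetric_links(headers):
    """Return sorted (src, dst) pairs where src links to dst but not the reverse."""
    neighbors = defaultdict(set)
    for header in headers:
        unitig_id, links = parse_links(header)
        neighbors[unitig_id] |= links

    missing = []
    for src in sorted(neighbors):
        for dst in sorted(neighbors[src]):
            if src not in neighbors.get(dst, set()):
                missing.append((src, dst))
    return missing


# Made-up headers: unitig 2 points at unitig 0, but 0 does not point back.
example_headers = [
    ">0 LN:i:40 KC:i:10 km:f:1.0 L:+:1:+",
    ">1 LN:i:35 KC:i:5 km:f:1.0 L:-:0:-",
    ">2 LN:i:31 KC:i:1 km:f:1.0 L:+:0:+",
]
for src, dst in find_asymmetric_links(example_headers):
    print(f"{src} -> {dst}, but not {dst} -> {src}")
```

To run it on a real file, pass in the lines of the unitigs FASTA that start with `>`. The sketch only checks that each link has some reverse link between the same two unitigs; it ignores orientations.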
Here's what I have installed for bcalm:
% conda list | grep bcalm
bcalm 2.2.0 hd28b015_3 bioconda