SPAdes assembler crashed due to odd read correction #188

@oschwengers

Hi and thanks a lot for this great tool!
I use Unicycler a lot and so far it has almost always done a great job.

I recently ran QC on SRR1609861 (Illumina PE only) with fastp and then assembled it with Unicycler. For some reason Unicycler crashes in the first k-mer (K27) assembly iteration, right after the read error correction step:

Error: SPAdes failed to produce assemblies. See spades_assembly/assembly/spades.log for more info

The SPAdes log says:

The number of right read-pairs is larger than the number of left read-pairs
Unequal number of read-pairs detected in the following files: /var/scratch/2014C-3598-fastp/spades_assembly/corrected_1.fastq.gz  /var/scratch/2014C-3598-fastp/spades_assembly/corrected_2.fastq.gz

The Unicycler command:

$ unicycler -1 1.fastq.gz -2 2.fastq.gz -s se.fastq.gz -o . --verbosity 3 --keep 3 -t 32

se.fastq.gz contains unpaired reads surviving the QC but lacking a valid mate.

Indeed, the SPAdes-corrected read files that Unicycler writes internally are inconsistent: the forward file (5.5M) contains only a fraction of the reads in the reverse file (51M):

$ ll spades_assembly/
total 57M
drwxr-xr-x 4 oschweng cb 4.0K May 15 14:00 ./
drwxr-xr-x 3 oschweng cb 4.0K May 15 13:58 ../
drwxr-xr-x 6 oschweng cb 4.0K May 15 14:00 assembly/
-rw-r--r-- 1 oschweng cb 5.5M May 15 14:00 corrected_1.fastq.gz
-rw-r--r-- 1 oschweng cb  51M May 15 14:00 corrected_2.fastq.gz
-rw-r--r-- 1 oschweng cb 1.3M May 15 14:00 corrected_u.fastq.gz
-rw-r--r-- 1 oschweng cb   42 May 15 14:00 kmer_range
drwxr-xr-x 4 oschweng cb 4.0K May 15 14:00 read_correction/
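
To tell whether the small `corrected_1.fastq.gz` is a truncated gzip stream or a complete file that simply holds fewer reads, something like the following could be run. This is only a diagnostic sketch: `inspect_fastq_gz` is a hypothetical helper (not part of Unicycler or SPAdes), and it assumes plain 4-line FASTQ records and Python 3.8+.

```python
import gzip
import zlib


def inspect_fastq_gz(path):
    """Return (record_count, complete) for a gzipped FASTQ file.

    Hypothetical helper, assuming 4 lines per FASTQ record.
    `complete` is False if the gzip stream is corrupt or ends before
    its end-of-stream marker, i.e. the file was probably cut short.
    """
    lines = 0
    complete = True
    with gzip.open(path, "rt") as handle:
        try:
            for _ in handle:
                lines += 1
        except (EOFError, zlib.error, gzip.BadGzipFile):
            # Lines decoded before the damage are still counted.
            complete = False
    return lines // 4, complete


if __name__ == "__main__":
    import sys

    for fastq in sys.argv[1:]:
        records, complete = inspect_fastq_gz(fastq)
        status = "complete" if complete else "TRUNCATED or corrupt"
        print(f"{fastq}\t{records} records\t{status}")
```

Saved under any name and called with the three `corrected_*.fastq.gz` paths as arguments, it prints one line per file.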

The fastp PE output seems OK: both files have exactly the same number of reads:

$ zcat 1.fastq.gz | grep -c '@SRR'
246147
$ zcat 2.fastq.gz | grep -c '@SRR'
246147
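
Counting with `grep -c '@SRR'` would also match any quality line that happens to begin with `@SRR`, and it says nothing about whether the mates appear in the same order. A stricter check could look like the sketch below. `compare_pairs` and `read_names` are hypothetical helpers, and the code assumes 4-line FASTQ records whose read name is the first whitespace-delimited token of the header, optionally ending in `/1` or `/2`.

```python
import gzip
from itertools import zip_longest


def read_names(path):
    """Yield the read name of every record in a gzipped 4-line FASTQ file."""
    with gzip.open(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 != 0:
                continue  # only header lines carry the read name
            fields = line[1:].split()
            name = fields[0] if fields else ""
            if name.endswith(("/1", "/2")):
                name = name[:-2]  # drop old-style mate suffix
            yield name


def compare_pairs(forward, reverse):
    """Return (n_forward, n_reverse, first_mismatch_index or None)."""
    n_fwd = n_rev = 0
    first_mismatch = None
    pairs = zip_longest(read_names(forward), read_names(reverse))
    for idx, (fwd_name, rev_name) in enumerate(pairs):
        if fwd_name is not None:
            n_fwd += 1
        if rev_name is not None:
            n_rev += 1
        if first_mismatch is None and fwd_name != rev_name:
            first_mismatch = idx  # 0-based record index
    return n_fwd, n_rev, first_mismatch
```

Calling `compare_pairs("1.fastq.gz", "2.fastq.gz")` on the fastp output, and again on `corrected_1.fastq.gz` and `corrected_2.fastq.gz`, would show at which record the two files stop agreeing.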

Interestingly, a standalone SPAdes assembly with the same SPAdes version (3.13.0) and its internal read error correction enabled finished without any problems.

I'm running the latest Unicycler (v0.4.7) on a native Ubuntu machine with 64 cores (HT), 256 GB of memory and local storage, so no VM issues should be involved here. I could also reproduce this issue on a different machine.

I'm equally puzzled and curious to know what exactly causes this crash. Any help is very much appreciated! Please let me know if you need anything else.
Best regards!
