Very very slow processing of samples #537

@shalercr

Is the bug primarily related to salmon (bulk mode) or alevin (single-cell mode)?
salmon

Describe the bug
salmon takes days to process 16 samples.

To Reproduce
Tried running on a different computer, same issue.

Specifically, please provide at least the following information:

  • Which version of salmon was used?
    1.2.1
  • How was salmon installed (compiled, downloaded executable, through bioconda)?
    bioconda
  • Which reference (e.g. transcriptome) was used?
    mouse Mus_musculus.GRCm38.cdna.all.fa
  • Which read files were used?
    fastq
  • Which program options were used?
    Various. --hitFilterPolicy BOTH was added only after reading another user's report (Much slower than previous versions #527), to see whether it would help.

### salmon (mapping-based) v1.2.1
### [ program ] => salmon 
### [ command ] => quant 
### [ index ] => { mouse_index1 }
### [ libType ] => { IU }
### [ mates1 ] => { /Volumes/FIle_backup/Genewiz_Fastq_June_2020/qc/13_1.trimmed.fastq.gz }
### [ mates2 ] => { /Volumes/FIle_backup/Genewiz_Fastq_June_2020/qc/13_2.trimmed.fastq.gz }
### [ threads ] => { 4 }
### [ hitFilterPolicy ] => { BOTH }
### [ biasSpeedSamp ] => { 10 }
### [ output ] => { quants/13_quant }

Expected behavior
Much faster processing. I have previously used salmon with some SRR datasets and it was very fast, so this slowdown seems very strange to me.

Screenshots

[2020-06-13 02:34:43.686] [jointLog] [info] setting maxHashResizeThreads to 4
[2020-06-13 02:34:43.686] [jointLog] [info] Fragment incompatibility prior below threshold.  Incompatible fragments will be ignored.
[2020-06-13 02:34:43.686] [jointLog] [info] Usage of --validateMappings implies use of minScoreFraction. Since not explicitly specified, it is being set to 0.65
[2020-06-13 02:34:43.686] [jointLog] [info] Usage of --validateMappings implies a default consensus slack of 0.2. Setting consensusSlack to 0.35.
[2020-06-13 02:34:43.686] [jointLog] [info] parsing read library format
[2020-06-13 02:34:43.686] [jointLog] [info] There is 1 library.
[2020-06-13 02:34:43.738] [jointLog] [info] Loading pufferfish index
[2020-06-13 02:34:43.738] [jointLog] [info] Loading dense pufferfish index.
[2020-06-13 02:34:45.327] [jointLog] [info] done
[2020-06-13 02:34:45.327] [jointLog] [info] Index contained 117,135 targets
[2020-06-13 02:34:45.346] [jointLog] [info] Number of decoys : 0
[2020-06-13 02:35:35.911] [jointLog] [info] Automatically detected most likely library type as IU
[2020-06-13 06:56:12.646] [fileLog] [info] 
At end of round 0
==================
Observed 28512328 total fragments (28512328 in most recent round)

[2020-06-13 06:56:12.645] [jointLog] [info] Computed 348,024 rich equivalence classes for further processing
[2020-06-13 06:56:12.645] [jointLog] [info] Counted 12,990,838 total reads in the equivalence classes 
[2020-06-13 06:56:12.989] [jointLog] [warning] 0.0736383% of fragments were shorter than the k used to build the index.
If this fraction is too large, consider re-building the index with a smaller k.
The minimum read size found was 1.


[2020-06-13 06:56:12.989] [jointLog] [info] Number of mappings discarded because of alignment score : 19,645,245,772
[2020-06-13 06:56:12.989] [jointLog] [info] Number of fragments entirely discarded because of alignment score : 2,436,564
[2020-06-13 06:56:12.989] [jointLog] [info] Number of fragments discarded because they are best-mapped to decoys : 0
[2020-06-13 06:56:12.989] [jointLog] [info] Number of fragments discarded because they have only dovetail (discordant) mappings to valid targets : 1,448,149
[2020-06-13 06:56:12.989] [jointLog] [info] Mapping rate = 45.5622%

[2020-06-13 06:56:12.989] [jointLog] [info] finished quantifyLibrary()
[2020-06-13 06:56:12.991] [jointLog] [info] Starting optimizer
[2020-06-13 06:56:13.091] [jointLog] [info] Marked 0 weighted equivalence classes as degenerate
[2020-06-13 06:56:13.106] [jointLog] [info] iteration = 0 | max rel diff. = 8178.65
[2020-06-13 06:56:14.511] [jointLog] [info] iteration = 100 | max rel diff. = 17.6849
[2020-06-13 06:56:16.028] [jointLog] [info] iteration = 200 | max rel diff. = 6.46204
[2020-06-13 06:56:17.541] [jointLog] [info] iteration = 300 | max rel diff. = 1.8111
[2020-06-13 06:56:19.027] [jointLog] [info] iteration = 400 | max rel diff. = 12.2108
[2020-06-13 06:56:20.501] [jointLog] [info] iteration = 500 | max rel diff. = 0.616929
[2020-06-13 06:56:21.954] [jointLog] [info] iteration = 600 | max rel diff. = 0.218435
[2020-06-13 06:56:23.439] [jointLog] [info] iteration = 700 | max rel diff. = 0.068711
[2020-06-13 06:56:24.945] [jointLog] [info] iteration = 800 | max rel diff. = 0.044637
[2020-06-13 06:56:26.398] [jointLog] [info] iteration = 900 | max rel diff. = 0.0340291
[2020-06-13 06:56:27.811] [jointLog] [info] iteration = 1,000 | max rel diff. = 0.237808
[2020-06-13 06:56:29.235] [jointLog] [info] iteration = 1,100 | max rel diff. = 0.0764161
[2020-06-13 06:56:30.717] [jointLog] [info] iteration = 1,200 | max rel diff. = 0.0683725
[2020-06-13 06:56:32.253] [jointLog] [info] iteration = 1,300 | max rel diff. = 0.0990377
[2020-06-13 06:56:33.509] [jointLog] [info] iteration = 1,389 | max rel diff. = 0.00998936
[2020-06-13 06:56:33.511] [jointLog] [info] Finished optimizer
[2020-06-13 06:56:33.511] [jointLog] [info] writing output 
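
To put numbers on where the time goes, here is a minimal Python sketch that works only from values copied out of the log above. The constant and function names are my own, and the interpretation is tentative: it only shows that nearly all of the wall-clock time falls between library-type detection and the end of round 0, not why.

```python
from datetime import datetime

# Values copied verbatim from the log above; the names are illustrative only.
LIB_TYPE_DETECTED = "2020-06-13 02:35:35.911"
END_OF_ROUND_0 = "2020-06-13 06:56:12.646"
OPTIMIZER_START = "2020-06-13 06:56:12.991"
OPTIMIZER_END = "2020-06-13 06:56:33.511"
TOTAL_FRAGMENTS = 28_512_328
MAPPINGS_DISCARDED = 19_645_245_772


def parse(timestamp):
    """Parse a timestamp in the format used by the log lines."""
    return datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S.%f")


def seconds_between(start, end):
    """Elapsed wall-clock seconds between two log timestamps."""
    return (parse(end) - parse(start)).total_seconds()


mapping_seconds = seconds_between(LIB_TYPE_DETECTED, END_OF_ROUND_0)
optimizer_seconds = seconds_between(OPTIMIZER_START, OPTIMIZER_END)
fragments_per_second = TOTAL_FRAGMENTS / mapping_seconds
discarded_per_fragment = MAPPINGS_DISCARDED / TOTAL_FRAGMENTS

print(f"mapping phase:   {mapping_seconds / 3600:.2f} h")
print(f"optimizer phase: {optimizer_seconds:.2f} s")
print(f"throughput:      {fragments_per_second:.0f} fragments/s")
print(f"discarded mappings per fragment: {discarded_per_fragment:.1f}")
```

By this arithmetic the mapping phase took about 4 h 20 min at roughly 1,800 fragments per second, the optimizer took about 20 seconds, and on average about 689 mappings were discarded on alignment score per observed fragment.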

Desktop (please complete the following information):

  • OS: Mac OS X
  • Version (output of sw_vers):
    ProductName: Mac OS X
    ProductVersion: 10.15.5
    BuildVersion: 19F101

Additional context
Add any other context about the problem here.
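
One piece of context that may be worth gathering: the log warns that the minimum read size found was 1, so the trimmed files contain some very short reads. The following is a rough Python sketch for summarizing read lengths in a gzipped FASTQ file. The function names are hypothetical, it assumes plain four-line FASTQ records, and k=31 is an assumption (the k used to build mouse_index1 is not stated in this report). Whether short reads have anything to do with the slowdown is not established here.

```python
import gzip
from collections import Counter


def sequence_lengths(lines):
    """Yield the length of each sequence line, assuming 4-line FASTQ records."""
    for i, line in enumerate(lines):
        if i % 4 == 1:  # second line of every record is the sequence
            yield len(line.rstrip("\r\n"))


def short_read_summary(lengths, k=31):
    """Count reads and report how many are shorter than k (k=31 is assumed)."""
    counts = Counter(lengths)
    total = sum(counts.values())
    short = sum(n for length, n in counts.items() if length < k)
    return {
        "total": total,
        "shorter_than_k": short,
        "min_length": min(counts) if counts else None,
        "fraction_short": short / total if total else 0.0,
    }


def summarize_fastq_gz(path, k=31):
    """Summarize read lengths for one gzipped FASTQ file."""
    with gzip.open(path, "rt") as handle:
        return short_read_summary(sequence_lengths(handle), k)


# Example call, using one of the files named in the command info above:
# summarize_fastq_gz(
#     "/Volumes/FIle_backup/Genewiz_Fastq_June_2020/qc/13_1.trimmed.fastq.gz"
# )
```

The line-based helpers take any iterable of lines, so they can be tried on a few records pasted into a list before running against the full files.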
