Hi,
I would like to quantify guide RNAs (based on 5'-tagged scRNA-seq with 10x feature barcoding) using Alevin. Read 1 is 26 bp long (16 bp CB + 10 bp UMI) and Read 2 is 58 bp long (19 bp constant region + 21 bp guide sequence). Now, when I use the following settings
salmon alevin -l ISR --barcodeLength 16 --umiLength 10 --end 5 --featureStart 19 --featureLength 21
I get this error: `Transcript to Gene Map File not provided`.
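To make the read layout explicit, here is a minimal Python sketch of how I expect the reads to be sliced. The sequences are made up, the helper names are hypothetical, and it assumes `--featureStart` is a 0-based offset into Read 2. It only illustrates the offsets and is not how Alevin itself parses reads.

```python
# Read geometry described above (lengths in bp).
CB_LEN, UMI_LEN = 16, 10             # Read 1: cell barcode, then UMI
FEATURE_START, FEATURE_LEN = 19, 21  # Read 2: guide follows the 19 bp constant region


def split_read1(seq):
    """Return (cell_barcode, umi) taken from the start of Read 1."""
    if len(seq) < CB_LEN + UMI_LEN:
        raise ValueError("Read 1 is shorter than CB + UMI")
    return seq[:CB_LEN], seq[CB_LEN:CB_LEN + UMI_LEN]


def extract_guide(seq):
    """Return the guide sequence from Read 2 (assumes a 0-based start offset)."""
    end = FEATURE_START + FEATURE_LEN
    if len(seq) < end:
        raise ValueError("Read 2 is shorter than featureStart + featureLength")
    return seq[FEATURE_START:end]


# Made-up reads with the same lengths as mine (26 bp and 58 bp).
read1 = "A" * 16 + "C" * 10
read2 = "G" * 19 + "T" * 21 + "A" * 18

cell_barcode, umi = split_read1(read1)
guide = extract_guide(read2)
print(len(cell_barcode), len(umi), len(guide))
```

If the guide actually starts at a different offset, or is read in the opposite orientation, the slice would not match the indexed guide sequences.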
However, when I use the following instead
salmon alevin -l ISR --citeseq --featureStart 19 --featureLength 21
It runs, but since `--citeseq` assumes `--umiLength=12`, I get the following output
`[2020-06-03 13:53:30.298] [alevinLog] [info] set CITE-seq minScoreFraction parameter to : 0.797619
[2020-06-03 13:53:30.298] [alevinLog] [info] Found 64 transcripts(+0 decoys, +0 short and +0 duplicate names in the index)
[2020-06-03 13:53:30.298] [alevinLog] [info] Filled with 64 txp to gene entries
[2020-06-03 13:53:30.298] [alevinLog] [info] Found all transcripts to gene mappings
[2020-06-03 13:53:30.304] [alevinLog] [info] Processing barcodes files (if Present)processed 52 Million barcodes
[2020-06-03 13:54:43.733] [alevinLog] [info] Done barcode density calculation.
[2020-06-03 13:54:43.733] [alevinLog] [info] # Barcodes Used: 52200250 / 52200250.
[2020-06-03 13:54:43.826] [alevinLog] [info] Forcing to use 100000 cells
[2020-06-03 13:54:43.964] [alevinLog] [info] Throwing 49909 barcodes with < 10 reads
[2020-06-03 13:54:43.984] [alevinLog] [info] Total 50092(has 201 low confidence) barcodes
[2020-06-03 13:54:44.191] [alevinLog] [info] Done True Barcode Sampling
[2020-06-03 13:54:44.285] [alevinLog] [info] Total 1.70493% reads will be thrown away because of noisy Cellular barcodes.
[2020-06-03 13:54:45.790] [alevinLog] [info] Done populating Z matrix
[2020-06-03 13:54:45.790] [alevinLog] [info] Total 0 CB got sequence corrected
[2020-06-03 13:54:45.790] [alevinLog] [info] Done indexing Barcodes
[2020-06-03 13:54:45.790] [alevinLog] [info] Total Unique barcodes found: 604589
[2020-06-03 13:54:45.790] [alevinLog] [info] Used Barcodes except Whitelist: 0
[2020-06-03 13:54:46.493] [jointLog] [info] There is 1 library.
[2020-06-03 13:54:46.551] [jointLog] [info] Loading pufferfish index
[2020-06-03 13:54:46.551] [jointLog] [info] Loading dense pufferfish index.
[2020-06-03 13:54:46.552] [jointLog] [info] done
[2020-06-03 13:54:46.552] [jointLog] [info] Index contained 64 targets
[2020-06-03 13:54:46.552] [jointLog] [info] Number of decoys : 0[2020-06-03 13:54:46.493] [alevinLog] [info] Done with Barcode Processing; Moving to Quantify
processed 52 Million fragmentsvinLog] [info] parsing read library format
hits: 0, hits per frag: 0[2020-06-03 13:55:42.905] [alevinLog] [info] Starting optimizer
[2020-06-03 13:55:42.931] [alevinLog] [warning] mrna file not provided; using is 1 less feature for whitelisting
[2020-06-03 13:55:42.931] [alevinLog] [warning] rrna file not provided; using is 1 less feature for whitelisting
[2020-06-03 13:55:42.933] [alevinLog] [info] Total 0.00 UMI after deduplicating.
[2020-06-03 13:55:42.933] [alevinLog] [info] Total 0 BiDirected Edges.
[2020-06-03 13:55:42.933] [alevinLog] [info] Total 0 UniDirected Edges.
[2020-06-03 13:55:42.933] [alevinLog] [warning] Skipped 50091 barcodes due to No mapped read
[2020-06-03 13:55:42.934] [alevinLog] [info] Clearing EqMap; Might take some time.
[2020-06-03 13:55:42.940] [alevinLog] [warning] Num Low confidence barcodes too less 1 < 200.Can't performing whitelisting; Skipping
[2020-06-03 13:55:42.940] [alevinLog] [info] Finished optimizer
`
I also tried
salmon alevin -l ISR --chromium --featureStart 19 --featureLength 21 --tgMap guide_to_gene.tsv
But I get the following output
`
[2020-06-03 13:47:17.330] [alevinLog] [info] Found 64 transcripts(+0 decoys, +0 short and +0 duplicate names in the index)
[2020-06-03 13:47:17.330] [alevinLog] [info] Filled with 64 txp to gene entries
[2020-06-03 13:47:17.330] [alevinLog] [info] Found all transcripts to gene mappings
[2020-06-03 13:47:17.336] [alevinLog] [info] Processing barcodes files (if Present)processed 52 Million barcodes
[2020-06-03 13:48:30.047] [alevinLog] [info] Done barcode density calculation.
[2020-06-03 13:48:30.047] [alevinLog] [info] # Barcodes Used: 52200250 / 52200250.
[2020-06-03 13:48:33.285] [alevinLog] [info] Knee found left boundary at 1174
[2020-06-03 13:48:34.501] [alevinLog] [info] Gauss Corrected Boundary at 148
[2020-06-03 13:48:34.501] [alevinLog] [info] Learned InvCov: 985.935 normfactor: 763.254
[2020-06-03 13:48:34.501] [alevinLog] [info] Total 349(has 201 low confidence) barcodes
[2020-06-03 13:48:35.369] [alevinLog] [info] Done True Barcode Sampling
[2020-06-03 13:48:35.441] [alevinLog] [warning] Total 73.3629% reads will be thrown away because of noisy Cellular barcodes.
[2020-06-03 13:48:35.454] [alevinLog] [info] Done populating Z matrix
[2020-06-03 13:48:35.455] [alevinLog] [info] Total 4286 CB got sequence corrected
[2020-06-03 13:48:35.455] [alevinLog] [info] Done indexing Barcodes
[2020-06-03 13:48:35.455] [alevinLog] [info] Total Unique barcodes found: 604589
[2020-06-03 13:48:35.455] [alevinLog] [info] Used Barcodes except Whitelist: 4282
[2020-06-03 13:48:35.558] [alevinLog] [info] Done with Barcode Processing; Moving to Quantify
...
processed 52 Million fragments
hits: 0, hits per frag: 0[2020-06-03 13:49:37.892] [jointLog] [info] Computed 0 rich equivalence classes for further processing
[2020-06-03 13:49:37.892] [jointLog] [info] Counted 0 total reads in the equivalence classes
[2020-06-03 13:49:37.893] [jointLog] [info] Number of fragments discarded because they are best-mapped to decoys : 0
[2020-06-03 13:49:37.893] [jointLog] [warning] Found 370 reads with N in the UMI sequence and ignored the reads.
Please report on github if this number is too large
[2020-06-03 13:49:37.893] [jointLog] [info] Mapping rate = 0%[2020-06-03 13:49:37.893] [jointLog] [info] finished quantifyLibrary()
[2020-06-03 13:49:37.899] [alevinLog] [info] Starting optimizer[2020-06-03 13:49:38.613] [alevinLog] [warning] mrna file not provided; using is 1 less feature for whitelisting
[2020-06-03 13:49:38.613] [alevinLog] [warning] rrna file not provided; using is 1 less feature for whitelisting
[2020-06-03 13:49:38.614] [alevinLog] [info] Total 0.00 UMI after deduplicating.
[2020-06-03 13:49:38.614] [alevinLog] [info] Total 0 BiDirected Edges.
[2020-06-03 13:49:38.614] [alevinLog] [info] Total 0 UniDirected Edges.
[2020-06-03 13:49:38.614] [alevinLog] [warning] Skipped 348 barcodes due to No mapped read
[2020-06-03 13:49:38.614] [alevinLog] [info] Clearing EqMap; Might take some time.
[2020-06-03 13:49:38.620] [alevinLog] [warning] Num Low confidence barcodes too less 1 < 200.Can't performing whitelisting; Skipping
[2020-06-03 13:49:38.620] [alevinLog] [info] Finished optimizer
Floating point exception (core dumped)
`
Any suggestions on how to get this working are highly appreciated!
Thanks