I am interested in training DeepVariant for deep sequencing on a capture panel. We are interested in low-frequency variants, around 1%. Depth of coverage is roughly 1000x to 1700x for the data I am using. I have set the default height of the pileup tensors to 2000 via
https://github.com/google/deepvariant/blob/r0.5/deepvariant/make_examples.py#L177
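For scale, here is a rough back-of-the-envelope sketch of how many alt-supporting reads a 1% variant would have at these depths. It is plain Python and does not touch any DeepVariant code. It assumes alt reads are binomially distributed, which is a simplification, and the function names are hypothetical helpers.

```python
from math import comb


def expected_alt_reads(depth: int, allele_fraction: float) -> float:
    """Mean number of reads expected to carry the alt allele."""
    return depth * allele_fraction


def prob_at_least(depth: int, allele_fraction: float, min_alt_reads: int) -> float:
    """P(at least `min_alt_reads` alt reads) under a binomial model.

    Assumption: each read independently carries the alt allele with
    probability `allele_fraction`. Intended for small thresholds only.
    """
    p_below = sum(
        comb(depth, i) * allele_fraction**i * (1 - allele_fraction) ** (depth - i)
        for i in range(min_alt_reads)
    )
    return 1.0 - p_below


if __name__ == "__main__":
    allele_fraction = 0.01  # the ~1% variants mentioned above
    for depth in (1000, 1700):  # depth range mentioned above
        mean_alt = expected_alt_reads(depth, allele_fraction)
        p5 = prob_at_least(depth, allele_fraction, 5)
        print(f"depth={depth}: mean alt reads={mean_alt:.1f}, P(>=5 alt reads)={p5:.4f}")
```

The point of the sketch is only that a 1% variant is supported by a small handful of reads out of more than a thousand, so any count-based or fraction-based filtering applied before examples are written matters a lot at this frequency.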
In a set with 497 confirmed 'true' variants, I'm getting a much smaller number of variants out of make_examples:
I0404 17:02:18.420840 140137671104256 make_examples.py:1032] Found 487 candidate variants
I0404 17:02:18.421224 140137671104256 make_examples.py:620] ----- VariantCounts -----
I0404 17:02:18.421346 140137671104256 make_examples.py:624] All: 29/29 (100.00%)
I0404 17:02:18.421475 140137671104256 make_examples.py:624] SNPs: 27/29 (93.10%)
I0404 17:02:18.421593 140137671104256 make_examples.py:624] Indels: 2/29 (6.90%)
I0404 17:02:18.421717 140137671104256 make_examples.py:624] BiAllelic: 29/29 (100.00%)
I0404 17:02:18.421834 140137671104256 make_examples.py:624] MultiAllelic: 0/29 (0.00%)
I0404 17:02:18.421953 140137671104256 make_examples.py:624] HomRef: 28/29 (96.55%)
I0404 17:02:18.422069 140137671104256 make_examples.py:624] Het: 1/29 (3.45%)
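To compare these counts across runs, a minimal sketch of a log parser follows. It assumes the log lines look exactly like the ones pasted above; the regular expressions and the function name are my own and are not part of DeepVariant.

```python
import re

# Matches e.g. "... make_examples.py:624] SNPs: 27/29 (93.10%)"
COUNT_RE = re.compile(r"\]\s*(?P<label>\w+):\s*(?P<n>\d+)/(?P<total>\d+)\s*\(")
# Matches e.g. "... make_examples.py:1032] Found 487 candidate variants"
CANDIDATES_RE = re.compile(r"Found (?P<n>\d+) candidate variants")

# Lines copied from the log output above.
SAMPLE_LOG = """\
I0404 17:02:18.420840 140137671104256 make_examples.py:1032] Found 487 candidate variants
I0404 17:02:18.421224 140137671104256 make_examples.py:620] ----- VariantCounts -----
I0404 17:02:18.421346 140137671104256 make_examples.py:624] All: 29/29 (100.00%)
I0404 17:02:18.421475 140137671104256 make_examples.py:624] SNPs: 27/29 (93.10%)
I0404 17:02:18.421593 140137671104256 make_examples.py:624] Indels: 2/29 (6.90%)
I0404 17:02:18.421717 140137671104256 make_examples.py:624] BiAllelic: 29/29 (100.00%)
I0404 17:02:18.421834 140137671104256 make_examples.py:624] MultiAllelic: 0/29 (0.00%)
I0404 17:02:18.421953 140137671104256 make_examples.py:624] HomRef: 28/29 (96.55%)
I0404 17:02:18.422069 140137671104256 make_examples.py:624] Het: 1/29 (3.45%)
"""


def parse_make_examples_log(text: str) -> dict:
    """Extract the candidate count and the per-category counts from log text."""
    summary = {"candidates": None, "counts": {}}
    for line in text.splitlines():
        match = CANDIDATES_RE.search(line)
        if match:
            summary["candidates"] = int(match.group("n"))
            continue
        match = COUNT_RE.search(line)
        if match:
            # Store (count, total) keyed by the category label, e.g. "SNPs".
            summary["counts"][match.group("label")] = (
                int(match.group("n")),
                int(match.group("total")),
            )
    return summary


if __name__ == "__main__":
    parsed = parse_make_examples_log(SAMPLE_LOG)
    print("candidates:", parsed["candidates"])
    for label, (n, total) in parsed["counts"].items():
        print(f"{label}: {n}/{total}")
```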
What, besides setting the pileup height to match my data, should I be looking at?