Skip to content

Deep sequencing #62

@KBT59

Description

@KBT59

I am interested in training DeepVariant for deep sequencing on a capture panel. We are interested in lower frequency variants - say 1%. Depth of coverage is on the order of 1000 to 1700 for the data I am using. I have set the default height of the pileup tensors to 2000 via
https://github.com/google/deepvariant/blob/r0.5/deepvariant/make_examples.py#L177

In a set with 497 confirmed 'true' variants I'm getting a much smaller number of variants out of make_examples:

I0404 17:02:18.420840 140137671104256 make_examples.py:1032] Found 487 candidate variants
I0404 17:02:18.421224 140137671104256 make_examples.py:620] ----- VariantCounts -----
I0404 17:02:18.421346 140137671104256 make_examples.py:624] All: 29/29 (100.00%)
I0404 17:02:18.421475 140137671104256 make_examples.py:624] SNPs: 27/29 (93.10%)
I0404 17:02:18.421593 140137671104256 make_examples.py:624] Indels: 2/29 (6.90%)
I0404 17:02:18.421717 140137671104256 make_examples.py:624] BiAllelic: 29/29 (100.00%)
I0404 17:02:18.421834 140137671104256 make_examples.py:624] MultiAllelic: 0/29 (0.00%)
I0404 17:02:18.421953 140137671104256 make_examples.py:624] HomRef: 28/29 (96.55%)
I0404 17:02:18.422069 140137671104256 make_examples.py:624] Het: 1/29 (3.45%)

What, besides setting the pileup height to match my data, should I be looking at?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions