-
Notifications
You must be signed in to change notification settings - Fork 453
Closed
Description
One of those "all day and late into the night" bugs 😎
If you
- Fill out a
bcf1_t
record 'from scratch' (through a bunch ofbcf_update_
function calls, rather than reading an existing VCF/BCF file) bcf_dup()
the record- Make further
bcf_update_
changes to the record - Write the record out to a file.
The updates made in step 3 are not reflected in the output. I think the reason is that the bcf_dup()
in step 2 has the side effect of leaving the record in a 'packed and never unpacked' state, since it packs the fields in order to duplicate them, but nothing was ever unpacked since the record was created de novo. Later, in step 4, bcf1_sync goes to pack the new updates of step 3, but noticing that nothing was ever unpacked, first unpacks the stale data and overwrites the updates!
Here's a dumb repro program,
#include "vcf.h"
int main() {
/* prelude -- create a toy bcf1_t record from scratch */
bcf_hdr_t* hdr = bcf_hdr_init("w1");
bcf1_t* rec = bcf_init();
bcf_hdr_append(hdr, "##FORMAT=<ID=GT,Number=1,Type=String,Description=\"Genotype\">");
bcf_hdr_append(hdr, "##contig=<ID=chr1,length=1000>");
bcf_hdr_add_sample(hdr, "Alice");
bcf_hdr_sync(hdr);
rec->rid = 0;
rec->pos = 99;
rec->rlen = 1;
const char *alleles[] = { "A", "G" };
bcf_update_alleles(hdr, rec, alleles, 2);
const int32_t gt[] = { bcf_gt_unphased(0), bcf_gt_unphased(0) };
bcf_update_genotypes(hdr, rec, gt, 2);
/* here is the bug -- first, perform a no-op bcf_dup*/
bcf_dup(rec);
/* then set the record's ID */
bcf_update_id(hdr, rec, "HelloWorld");
/* write the header & record out to standard output */
htsFile *stdout = vcf_open("-", "w0");
bcf_hdr_write(stdout, hdr);
bcf_write(stdout, hdr, rec);
bcf_close(stdout);
}
If the bcf_dup()
is there, we get the record with no ID:
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##contig=<ID=chr1,length=1000>
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Alice
chr1 100 . A G 0 . . GT 0/0
Comment out the bcf_dup()
and:
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##contig=<ID=chr1,length=1000>
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Alice
chr1 100 HelloWorld A G 0 . . GT 0/0
Metadata
Metadata
Assignees
Labels
No labels