Hi, thanks to a report by @apcamargo at wwood/galah#7, I came across an issue with --minFraction on these fragmented genomes. They seem to align well:
$ fastANI -q 1.fna -r 2.fna -o /dev/stdout --minFraction 0.2 2>/dev/null
1.fna 2.fna 97.4762 228 629
$ fastANI -r 1.fna -q 2.fna -o /dev/stdout 2>/dev/null
2.fna 1.fna 98.351 232 255
But when --minFraction is set to 0.5, the hit goes away, even though 232/255 > 0.5:
$ fastANI -q 1.fna -r 2.fna -o /dev/stdout --minFraction 0.5 2>/dev/null
$ fastANI -r 1.fna -q 2.fna -o /dev/stdout --minFraction 0.5 2>/dev/null
$ fastANI -q 1.fna -r 2.fna -o /dev/stdout --minFraction 0.5 --fragLen 1000 2>/dev/null
1.fna 2.fna 98.2643 1113 2276
(galah-dev) ben@u2:~/git/galah$ fastANI --version
version 1.31
(galah-dev) ben@u2:~/git/galah/antonio$ seqstat 1.fna
seqstat - show some simple statistics on a sequence file
SQUID 1.9g (January 2003)
Copyright (C) 1992-2003 HHMI/Washington University School of Medicine
Freely distributed under the GNU General Public License (GPL)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Format: FASTA
Type (of 1st seq): DNA
Number of sequences: 369
Total # residues: 2456354
Smallest: 2028
Largest: 32450
Average length: 6656.8
(galah-dev) ben@u2:~/git/galah/antonio$ seqstat 2.fna
seqstat - show some simple statistics on a sequence file
SQUID 1.9g (January 2003)
Copyright (C) 1992-2003 HHMI/Washington University School of Medicine
Freely distributed under the GNU General Public License (GPL)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Format: FASTA
Type (of 1st seq): DNA
Number of sequences: 409
Total # residues: 1468919
Smallest: 2005
Largest: 16940
Average length: 3591.5
Filtering out this alignment based on --minFraction seems incorrect to me. What is the actual definition of minFraction? Is it the fraction of the total genome length, the fraction of the genome that is long enough to be included as a fragment, or something along those lines?
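To make the question concrete, here is a rough Python sketch comparing two possible readings of minFraction, using only the numbers reported above. Both readings are guesses, not something confirmed from the fastANI source: interpretation A is matched fragments over total query fragments, and interpretation B is matched fragments times fragment length over the length of the shorter genome. The function names are made up for illustration, and the fragment length of 3000 for the runs without --fragLen is assumed to be the fastANI default.

```python
# Figures copied from the fastANI and seqstat output in this report.
GENOME_LENGTH = {"1.fna": 2456354, "2.fna": 1468919}

# Assumption: 3000 is the fragment length when --fragLen is not given.
ASSUMED_DEFAULT_FRAG_LEN = 3000

# (label, query, reference, matched fragments, total query fragments, fragLen)
RUNS = [
    ("q=1 r=2", "1.fna", "2.fna", 228, 629, ASSUMED_DEFAULT_FRAG_LEN),
    ("q=2 r=1", "2.fna", "1.fna", 232, 255, ASSUMED_DEFAULT_FRAG_LEN),
    ("q=1 r=2 fragLen=1000", "1.fna", "2.fna", 1113, 2276, 1000),
]


def fragment_fraction(matched, total):
    """Interpretation A (hypothetical): matched / total query fragments."""
    return matched / total


def length_fraction(matched, frag_len, query, reference):
    """Interpretation B (hypothetical): aligned length / shorter genome length."""
    shorter = min(GENOME_LENGTH[query], GENOME_LENGTH[reference])
    return matched * frag_len / shorter


def summarize(runs=RUNS):
    """Return (label, fraction A, fraction B) for each reported run."""
    return [
        (label,
         fragment_fraction(matched, total),
         length_fraction(matched, frag_len, query, reference))
        for label, query, reference, matched, total, frag_len in runs
    ]


if __name__ == "__main__":
    for label, frac_a, frac_b in summarize():
        print(f"{label}: A={frac_a:.3f}  B={frac_b:.3f}")
```

Under interpretation B the three runs come out at roughly 0.47, 0.47 and 0.76, which would match what I see at --minFraction 0.5 (the two default runs are filtered, the --fragLen 1000 run is kept). Interpretation A does not match, since 232/255 is about 0.91 yet the hit is dropped. This is only consistent with the observations; it does not show what fastANI actually computes.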
Thanks, ben