Changes
- More custom VEP annotation types tedil (84)
- Small performance improvement for `keep_unmatched` tedil (83)
- Make `FILTER` a list tedil christopher-schroeder (82)
We also added function families for working with genotypes (`FORMAT["GT"]`): `count_hom`, `count_het`, `count_any_ref`, `count_any_var`, `count_hom_ref`, `count_hom_var` and `is_hom`, `is_het`, `is_hom_ref`, `is_hom_var`, `has_ref`, `has_var`. These are similar in semantics to the `countHom` etc. functions found in SnpSift, with one exception: unknown genotype information (i.e. `.` in VCF notation) is **not** treated as `REF` (i.e. `0` in VCF notation) but as unknown. Consider the following VCF snippet with 27 triploid samples covering all possible combinations of the `0` (ref) allele, the `1` (first alternative) allele and `.` (missing information):
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S1 S2 S3 S4 S5 S6 S7 S8 S9 T1 T2 T3 T4 T5 T6 T7 T8 T9 U1 U2 U3 U4 U5 U6 U7 U8 U9
chr18 27963423 . G A 276 PASS GT ././. 1/1/. 0/0/. 0/1/. 1/0/. ./0/. ./1/. 1/./. 0/./. ././0 1/1/0 0/0/0 0/1/0 1/0/0 ./0/0 ./1/0 1/./0 0/./0 ././1 1/1/1 0/0/1 0/1/1 1/0/1 ./0/1 ./1/1 1/./1 0/./1
vembrane reports the following counts:
- `count_hom() == 2` (only `0/0/0` and `1/1/1` are considered homozygous)
- `count_het() == 12`
- `count_any_ref() == 19`
- `count_any_var() == 19`
- `count_hom_ref() == 1`
- `count_hom_var() == 1`
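The counts above can be reproduced with a short standalone Python sketch. It is not vembrane's implementation: the helper functions below only borrow the names of the new functions, and their bodies are assumed rules (a genotype is homozygous only if every allele is known and identical; it is heterozygous if it carries at least two distinct known alleles) chosen because they yield the numbers listed above.

```python
from itertools import product

# All 27 triploid genotypes over "0" (ref), "1" (first alt) and "." (missing).
GENOTYPES = list(product(".01", repeat=3))


def known(gt):
    """Alleles that carry information, i.e. everything except '.'."""
    return [allele for allele in gt if allele != "."]


def is_hom(gt):
    # Assumption: every allele must be known and all must be identical.
    return "." not in gt and len(set(gt)) == 1


def is_het(gt):
    # Assumption: at least two distinct known alleles are present.
    return len(set(known(gt))) > 1


def has_ref(gt):
    # At least one explicit reference allele; '.' does not count as ref.
    return "0" in gt


def has_var(gt):
    # At least one known non-reference allele.
    return any(allele != "0" for allele in known(gt))


def is_hom_ref(gt):
    return is_hom(gt) and has_ref(gt)


def is_hom_var(gt):
    return is_hom(gt) and has_var(gt)


counts = {
    "count_hom": sum(map(is_hom, GENOTYPES)),
    "count_het": sum(map(is_het, GENOTYPES)),
    "count_any_ref": sum(map(has_ref, GENOTYPES)),
    "count_any_var": sum(map(has_var, GENOTYPES)),
    "count_hom_ref": sum(map(is_hom_ref, GENOTYPES)),
    "count_hom_var": sum(map(is_hom_var, GENOTYPES)),
}
print(counts)
```

Under these rules a genotype such as `0/0/.` is neither homozygous nor heterozygous, because the missing allele could be either.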
SnpSift reports the following counts:
- `countHom() == 3` (`././.` is considered homozygous in addition to `0/0/0` and `1/1/1`)
- `countHet() == 24`
- `countRef() == 8` (since `.` is considered `0`)
- `countVariant() == 19`
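For comparison, here is one set of rules that reproduces the SnpSift numbers on the same 27 genotypes. This is a guess worked backwards from the reported counts, not SnpSift's actual code: the Python function names are hypothetical, and the rules (compare allele symbols literally for hom/het, treat `.` as `0` when counting reference-only samples) are assumptions.

```python
from itertools import product

# All 27 triploid genotypes over "0" (ref), "1" (first alt) and "." (missing).
GENOTYPES = list(product(".01", repeat=3))


def snpsift_like_hom(gt):
    # Assumption: all allele symbols are identical, so "././." counts too.
    return len(set(gt)) == 1


def snpsift_like_het(gt):
    # Assumption: any genotype whose allele symbols are not all identical.
    return len(set(gt)) > 1


def snpsift_like_ref(gt):
    # Assumption: '.' is read as '0', so a sample counts as ref if no '1' occurs.
    return all(allele in ("0", ".") for allele in gt)


def snpsift_like_variant(gt):
    # At least one alternative allele is present.
    return "1" in gt


snpsift_like_counts = {
    "countHom": sum(map(snpsift_like_hom, GENOTYPES)),
    "countHet": sum(map(snpsift_like_het, GENOTYPES)),
    "countRef": sum(map(snpsift_like_ref, GENOTYPES)),
    "countVariant": sum(map(snpsift_like_variant, GENOTYPES)),
}
print(snpsift_like_counts)
```

The key difference from the vembrane semantics is that missing alleles are counted as if they carried information, which is why a genotype like `0/./.` ends up both heterozygous and reference-only here.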