How to extract all variant alleles that do not match "./." from the GT column of a VCF file?


For the two VCF files linked below, I cannot find any entry in the GT column other than "./.". Is it possible to confirm whether the GT column of these VCF files has been annotated (i.e. whether any variants are listed as "./1", "1/." or "1/1")?

VCF files: https://zenodo.org/records/6558593 (LUD.TH179.PASS.dbsnp_cosmic.vcf.gz and LUD.TH238.PASS.dbsnp_cosmic.vcf.gz).

Here are a few terminal commands I have tried. Each of them returns 0 results, with no variant records under the column headers of the output. I assume this means the GT column was not annotated, but I am not sure.

  1. remove homozygous ref genotype from multi-sample vcf
  • bcftools view -i 'GT[*]="alt"' LUD.TH179.PASS.dbsnp_cosmic.vcf.gz | less -SN
  2. extracting heterozygous snp from a vcf file
  • vcftools --gzvcf LUD.TH179.PASS.dbsnp_cosmic.vcf.gz --extract-FORMAT-info GT | grep "0/1"
  3. ID heterozygous variants in VCF file using vcftools
  • vcftools --gzvcf LUD.TH179.PASS.dbsnp_cosmic.vcf.gz --het
  • output (cat out.het)
    • INDV O(HOM) E(HOM) N_SITES F (EMPTY)
0
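
One way to check directly whether anything other than "./." appears in the GT column is to tally every genotype string in the file. The following is a minimal sketch, not a verified result for the two files above. `tally_gt` is a hypothetical helper name. It assumes a tab-separated VCF on standard input, with the FORMAT keys in column 9 and the sample columns starting at column 10.

```shell
# Tally every genotype string found in the GT field of a VCF read from stdin.
# Assumption: tab-separated VCF, FORMAT in column 9, samples from column 10 on.
tally_gt() {
  awk -F'\t' '
    /^#/    { next }                    # skip meta-information and header lines
    NF < 10 { next }                    # no sample columns, nothing to tally
    {
      n = split($9, keys, ":")          # FORMAT keys, e.g. GT:AD:DP
      gt = 0
      for (i = 1; i <= n; i++) if (keys[i] == "GT") gt = i
      if (gt == 0) next                 # this record carries no GT key
      for (s = 10; s <= NF; s++) {
        split($s, vals, ":")            # per-sample values, same order as keys
        count[vals[gt]]++
      }
    }
    END { for (g in count) print g "\t" count[g] }
  ' | LC_ALL=C sort                     # stable, locale-independent ordering
}

# Usage (file name taken from the question):
#   gzip -dc LUD.TH179.PASS.dbsnp_cosmic.vcf.gz | tally_gt
```

If the only line printed is the one for "./.", every sample genotype in the file is missing. If nothing is printed at all, the records have no GT key or no sample columns. With bcftools installed, `bcftools query -f '[%GT\n]' LUD.TH179.PASS.dbsnp_cosmic.vcf.gz | sort | uniq -c` should give an equivalent tally.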
