I want to grep a VCF file for multiple positions. The following works:
grep -f template_gb37 file.vcf > gb37_result
My template_gb37 has 10000 lines and it looks like this:
1 1156131 rs2887286 C T
1 1211292 rs6685064 T C
1 2283896 rs2840528 A G
When the VCF has the rs ID, it works perfectly.
The problem is that the VCF I am going to grep may not have the rs ID and may have "." instead:
File.vcf
#CHROM POS ID REF ALT ....
1 1156131 . C T ....
1 1211292 . T C ....
1 1211292 . T C ....
Is there a way to search for my multiple patterns so that each one matches either the rs ID or just "."?
Thanks in advance
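I think you mean the third field (the ID column) in your file could be either "." or "rsNNNNNN", and you want to allow either. So, I think you need an "alternation", which you write with a "|" in an extended regular expression. Each line of your pattern file "template_gb37" then needs to accept either the rs ID or a literal dot in that field, and you need to search with extended regular expressions enabled (grep's -E option) in addition to -f.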
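A minimal sketch of what that could look like. The file names starting with "demo" and the sample VCF lines are made up for illustration, and the patterns assume fields separated by single spaces as shown in the question (real VCF files are usually tab-separated, in which case the separators in the patterns would need to be tabs as well).

```shell
# Small made-up sample in the style of the question (demo file names are hypothetical).
cat > demo.vcf <<'EOF'
#CHROM POS ID REF ALT
1 1156131 . C T
1 1211292 rs6685064 T C
1 9999999 . A G
EOF

# Pattern file: the third field accepts either the rs ID or a literal dot.
cat > demo_template <<'EOF'
1 1156131 (rs2887286|\.) C T
1 1211292 (rs6685064|\.) T C
1 2283896 (rs2840528|\.) A G
EOF

# -E turns on extended regular expressions, so "|" and "( )" act as operators;
# -f reads one pattern per line from the pattern file.
grep -E -f demo_template demo.vcf > demo_result
cat demo_result
```

Because these patterns are not anchored, a pattern beginning with "1 1156131" would also match a line for chromosome 11 or 21 at the same position; prefixing each pattern with "^" avoids that.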
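If you don't want to change your pattern file, you can edit it "on-the-fly" each time you use it. So, if "template" currently holds plain lines such as "1 1156131 rs2887286 C T", a one-line awk command can rewrite the third field of each line into the alternation form "(rs2887286|\.)", and its output can be passed directly to grep as the pattern list, which means the whole answer fits in a single command and the file on disk stays unchanged.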
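One way this on-the-fly rewrite could be written, as a sketch only. It assumes space-separated fields and a grep that reads patterns from standard input when given "-f -" (GNU grep does); the "demo" file names and the sample VCF lines are hypothetical.

```shell
# Unmodified template in the format from the question (demo file names are hypothetical).
cat > demo_template_plain <<'EOF'
1 1156131 rs2887286 C T
1 1211292 rs6685064 T C
1 2283896 rs2840528 A G
EOF

cat > demo2.vcf <<'EOF'
#CHROM POS ID REF ALT
1 1156131 . C T
1 1211292 . T C
1 5555555 . G A
EOF

# Rewrite field 3 as "(rsID|\.)" and print every line; the template on disk is not changed.
# Assigning to $3 makes awk rebuild the line with single spaces between fields.
awk '{ $3 = "(" $3 "|\\.)" } 1' demo_template_plain > demo_patterns
cat demo_patterns

# Same rewrite, piped straight into grep: "-f -" takes the patterns from standard input.
awk '{ $3 = "(" $3 "|\\.)" } 1' demo_template_plain | grep -E -f - demo2.vcf > demo2_result
cat demo2_result
```

In bash, process substitution is an equivalent way to hand over the patterns: grep -E -f <(awk '{ $3 = "(" $3 "|\\.)" } 1' template) file.vcf. For tab-separated files, adding -F'\t' -v OFS='\t' to the awk call keeps the tabs in the rewritten patterns.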