AWK: how to count patterns in the first column?


I am trying to get the total number of "??", " M", "A" and "D" entries in the first column of this:

?? this is a sentence
 M this is another one
A  more text here
D  more and more text

I have this sample line of code, but it doesn't work:

 awk -v pattern="\?\?" '{$1 == pattern} END{print " "FNR}'

There is 1 solution below

$ awk '{ print $1 }' file | sort | uniq -c 
1 ??
1 A
1 D
1 M

If for some reason you want an awk-only solution:

awk '{ ++cnt[$1] } END { for (i in cnt) print cnt[i], i }' file

but I think that's needlessly complicated compared to using the built-in unix tools that already do most of the work.
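
For anyone who wants to try the awk-only version end to end, here is a minimal sketch. It assumes the four sample lines from the question are written to a temporary file (the `sample` and `counts` variable names are just for illustration). Since the order of a `for (i in cnt)` loop is unspecified in awk, the result is piped through `sort` to make it stable.

```shell
# Recreate the sample input from the question in a temporary file
# (the variable names here are illustrative, not from the original post).
sample=$(mktemp)
printf '%s\n' \
  '?? this is a sentence' \
  ' M this is another one' \
  'A  more text here' \
  'D  more and more text' > "$sample"

# Count each distinct first field. Default field splitting ignores the
# leading blank in " M", so that line is counted under "M".
counts=$(awk '{ ++cnt[$1] } END { for (i in cnt) print cnt[i], i }' "$sample" | LC_ALL=C sort)
printf '%s\n' "$counts"

rm -f "$sample"
```

Each output line is a count followed by the first-column value, the same shape that `uniq -c` produces (minus the padding).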

If you just want to count one particular value:

awk -v value='??' '$1 == value' file | wc -l
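
If you would rather have awk print the number itself instead of piping to `wc -l`, a counter plus an `END` block should do it. This is only a sketch under the same assumptions as the question's sample data (the `sample` and `count` names are made up for the example); `n + 0` forces a numeric `0` when nothing matched, rather than an empty line.

```shell
# Recreate the sample input from the question (illustrative file and names).
sample=$(mktemp)
printf '%s\n' \
  '?? this is a sentence' \
  ' M this is another one' \
  'A  more text here' \
  'D  more and more text' > "$sample"

# Increment n for every line whose first field equals the value exactly,
# then print the total once all input has been read.
count=$(awk -v value='??' '$1 == value { n++ } END { print n + 0 }' "$sample")
echo "$count"

rm -f "$sample"
```

This is a plain string comparison, so the `?` characters need no escaping here.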

If you want to count only a subset of values, you can use a regex:

$ awk -v pattern='A|D|(\\?\\?)' '$1 ~ pattern { print $1 }' file | sort | uniq -c
1 ??
1 A
1 D

Here the ?s do need a \ in front of them so that they are treated as literal characters within the regular expression. And because \ is itself a special character within a string passed to awk with -v, you need to escape it first (hence the double backslash).
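
One caveat with the regex approach: `$1 ~ pattern` matches anywhere inside the field, so a first column such as `ADD` would also be picked up by `A|D`. A hedged sketch of an anchored variant follows; the extra `ADD` line is hypothetical and was added only to show the difference, and the `sample` and `subset` names are illustrative.

```shell
# Sample input from the question plus one hypothetical "ADD" line that an
# unanchored A|D pattern would also match.
sample=$(mktemp)
printf '%s\n' \
  '?? this is a sentence' \
  ' M this is another one' \
  'A  more text here' \
  'D  more and more text' \
  'ADD hypothetical extra line' > "$sample"

# ^( ... )$ anchors the alternation so the whole first field must match.
# The doubled backslashes survive -v escape processing as \? in the regex.
subset=$(awk -v pattern='^(A|D|\\?\\?)$' '$1 ~ pattern { print $1 }' "$sample" | LC_ALL=C sort | uniq -c | awk '{ print $1, $2 }')
printf '%s\n' "$subset"

rm -f "$sample"
```

If the double backslashes feel error-prone, writing the literal question marks as bracket expressions (`[?][?]`) avoids the escaping altogether.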