grep: search words from files in subfolders but exclude given word from total count of matches


I have a folder named folder. Under folder I have two subfolders subfolder1 and subfolder2.

Both of these subfolders contain the same text file, file.txt. That text file has the following lines:

text
text
line
line
text text
text text

What I am trying to do with grep is to get the total count of text words but exclude text text words from the count.

If I run grep -ro "text" folder/ | wc -l | xargs echo "total matches :" I get a count of 12, but the result I am looking for is 4: each of the two files has only two standalone text words, for a total of 4.

I have tried to run grep -ro "text" -v "text text" folder/ | wc -l | xargs echo "total matches :" and many other syntaxes with -v to exclude text text from the count with no success.


There are 2 best solutions below


It is easier to achieve this with awk. In short, you want to count the lines where "text" appears exactly once:

  • use "text" as the field separator (-F "text")
  • print the lines that have exactly 2 fields when "text" is the field separator, and count them with wc -l

awk -F "text" 'NF==2 { print }' folder/subfolder*/* | wc -l | xargs echo "total matches :"
total matches : 4
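For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the example layout in a temporary directory and lets awk do the counting itself instead of piping to wc -l. The temporary directory and the variable names (tmp, total) are my own assumptions, not part of the question.

```shell
# Recreate the example layout from the question in a scratch directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/folder/subfolder1" "$tmp/folder/subfolder2"
for d in subfolder1 subfolder2; do
  printf '%s\n' 'text' 'text' 'line' 'line' 'text text' 'text text' \
    > "$tmp/folder/$d/file.txt"
done

# With "text" as the field separator, a line that contains it exactly once
# splits into two fields, so NF==2 selects those lines. The counter n is
# incremented per selected line and printed once at the end (n+0 prints 0
# instead of an empty string when nothing matched).
total=$(awk -F 'text' 'NF==2 { n++ } END { print n+0 }' "$tmp"/folder/subfolder*/*)
echo "total matches : $total"   # prints: total matches : 4

rm -rf "$tmp"
```

Counting inside awk avoids the extra wc and xargs processes, at the cost of a slightly longer awk program.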


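If you have grep -P, you can use negative lookarounds:

grep -Pro '(?<!text )text(?! text)' folder

The lookbehind (?<!text ) and the lookahead (?! text) reject any text that has another text right next to it.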

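If your example data is representative (at most one match per line), you can replace the -o and the pipe to wc -l with grep -c. Note that with -r, grep -c prints one count per file rather than a single total.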

If you don't have grep -P,

grep -r 'text' folder | grep -vc 'text.*text'

(thanks to @thanasisp) or maybe switch to Perl, or sed:

find folder -type f -exec sed -n 's/text text//;/text/p' {} +
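As a rough sketch of how the two non-PCRE variants can be turned into a single total, the snippet below runs both against a scratch copy of the example data. A few things here are my own additions rather than part of the commands above: the -h flag (it drops the file-name prefix so a path containing "text" cannot confuse the second grep), the g flag on the sed substitution, the trailing wc -l, and the variable names.

```shell
# Recreate the example layout from the question in a scratch directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/folder/subfolder1" "$tmp/folder/subfolder2"
for d in subfolder1 subfolder2; do
  printf '%s\n' 'text' 'text' 'line' 'line' 'text text' 'text text' \
    > "$tmp/folder/$d/file.txt"
done

# Two-stage grep: keep lines containing "text", then count those that do
# not contain it twice. -h suppresses the "path:" prefix on each line.
grep_total=$(grep -rh 'text' "$tmp/folder" | grep -vc 'text.*text')

# sed variant: delete every "text text", print lines that still contain
# "text", and count the printed lines. tr strips the padding some wc
# implementations add around the number.
sed_total=$(find "$tmp/folder" -type f \
  -exec sed -n 's/text text//g;/text/p' {} + | wc -l | tr -d '[:space:]')

echo "grep total : $grep_total"   # prints: grep total : 4
echo "sed total : $sed_total"     # prints: sed total : 4

rm -rf "$tmp"
```

Both variants count lines, not individual words, so they agree with the desired result only when a line holds at most one standalone text, as in the example data.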

For what it's worth, the -v option of grep selects inverted behavior for all the patterns you specify. So

grep -e foo -v -e bar files...

is equivalent to

grep -v -e foo -e bar files...

i.e.

grep -Ev 'foo|bar' files...
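A quick way to convince yourself of this is to run the three forms on a small file. The sample file and its contents below are made up for illustration.

```shell
# Hypothetical sample data: one line per case.
tmp=$(mktemp -d)
printf '%s\n' 'foo' 'bar' 'baz' 'foo bar' > "$tmp/sample.txt"

# -v applies to every pattern regardless of where it appears on the
# command line, so all three commands select the same lines.
a=$(grep -e foo -v -e bar "$tmp/sample.txt")
b=$(grep -v -e foo -e bar "$tmp/sample.txt")
c=$(grep -Ev 'foo|bar' "$tmp/sample.txt")

printf '%s\n' "$a" "$b" "$c"   # prints "baz" three times, once per command

rm -rf "$tmp"
```

Only baz survives in each case, because every other line matches foo or bar and -v discards any line that matches at least one of the patterns.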