Command line to find common values in only one specific column


I am looking for a simple command line to help me with the following task.

I have two files and I would like to print the values in Col2 that they have in common.

For instance File1 is similar to the following 3-column tab separated example

File1

cat	big	24
cat	small	13
cat	red	63

File2

dog	big	34
chicken	plays	39
fish	red	294

desired output

big
red

I have tried commands using the comm syntax: comm /path/to/file1/ /path/to/file2 However, it does not output anything, because the values in Col1 and Col3 will very rarely be in common. Does anyone have a suggestion as to how this can be solved? Maybe awk is a better solution?


If you read the man page for comm, you will see that it expects sorted input and compares whole lines, so it cannot match on a single column by itself. awk is more flexible and lets you choose which column to compare:

 awk 'NR==FNR{a[$2]=1;next}a[$2]{print $2}' file1 file2
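
A slightly more defensive variant of the same idea, as a sketch rather than the answer's exact method: it sets the field separator to a tab explicitly (the question says the files are tab separated) and prints each shared value only once. The temporary directory, the sample files, and the array names seen and printed are illustrative assumptions.

```shell
# Hypothetical sample data, copied from the question (tab separated).
tmpdir=$(mktemp -d)
printf 'cat\tbig\t24\ncat\tsmall\t13\ncat\tred\t63\n' > "$tmpdir/file1"
printf 'dog\tbig\t34\nchicken\tplays\t39\nfish\tred\t294\n' > "$tmpdir/file2"

# While reading the first file (NR==FNR), remember every column-2 value.
# While reading the second file, print a column-2 value the first time
# it turns out to be one of the remembered values.
common=$(awk -F'\t' 'NR==FNR { seen[$2] = 1; next }
                     ($2 in seen) && !printed[$2]++ { print $2 }' \
             "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$common"
```

Using ($2 in seen) instead of a[$2] avoids creating an empty array entry for every value that appears only in the second file.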

You could do it in a single pass with paste and awk, provided the matching values always sit on the same row in both files (as they do in the example):

paste file1 file2 | awk '$2 == $5 { print $2 }'

Output:

big
red
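
If the matching values are not guaranteed to be on the same row, or you would rather stay with comm as in the question, one possible sketch is to extract column 2 from each file, sort it, and let comm -12 print only the lines common to both. The temporary directory and file names are illustrative assumptions, and the rows of the second sample file are deliberately reordered to show that row position does not matter here.

```shell
# Hypothetical sample data based on the question (tab separated);
# the rows of file2 are reordered on purpose.
tmpdir=$(mktemp -d)
printf 'cat\tbig\t24\ncat\tsmall\t13\ncat\tred\t63\n' > "$tmpdir/file1"
printf 'fish\tred\t294\nchicken\tplays\t39\ndog\tbig\t34\n' > "$tmpdir/file2"

# cut -f2 takes the second tab-separated field; sort -u sorts it and
# drops duplicates, which gives comm the sorted input it expects.
cut -f2 "$tmpdir/file1" | sort -u > "$tmpdir/col2_a"
cut -f2 "$tmpdir/file2" | sort -u > "$tmpdir/col2_b"

# comm -12 suppresses the lines unique to either file, leaving only
# the lines present in both.
shared=$(comm -12 "$tmpdir/col2_a" "$tmpdir/col2_b")
printf '%s\n' "$shared"
```

The output comes back in sorted order rather than in the order of either input file.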