How to search for text in specific files on Unix

I am on an Ubuntu machine and tried the commands below to search for a text string.

1) This command checks recursively whether the word is present under a given directory. Here <hello> is the word I am searching for, and the search covers all files starting from the current directory. It works fine:

grep -r "<hello>" .

2) Now I want to restrict the search to specific files only, say XML files:

grep --include=\*.{java} -rnw '/home/myfolder/' -e "<hello>"

This time the command takes longer and in the end returns no results, although my files do contain the text.

I wrote the second command based on this question: "How do I find all files containing specific text on Linux?"

Is there an issue with my second command? Also, is there an alternative command that runs faster?

There are 3 best solutions below

6
On BEST ANSWER

It might be better to use find, since grep's include/exclude can get a bit confusing:

find -type f -name "*.xml" -exec grep -l 'hello' {} +

This looks for files whose names end in .xml and runs grep 'hello' on them. With -l (lowercase L), grep prints only the names of the matching files, not the matched lines.

Explanation

  • find -type f finds regular files in the directory tree, starting from the current directory when no path is given (GNU find).
  • -name "*.xml" selects those files whose names end in .xml.
  • -exec runs a command on the results of the find command.
  • -exec grep -l 'hello' {} + runs grep -l 'hello' on the files found. {} stands for the matched file names, and + makes find pass as many names as possible to each grep invocation instead of running grep once per file (it is like running grep 'hello' file1 file2 ... with the names provided by find). Also, grep -l (lowercase L) prints the file name, not the match itself.
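
As a minimal sketch of the command above, assuming GNU find and grep, the following builds a throwaway directory and runs the search on it. The file names (a.xml, sub/b.xml, sub/C.java) and their contents are made up for illustration, and an explicit . is passed to find as the starting directory.

```shell
# Build a temporary tree with hypothetical files to search.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf '<hello>world</hello>\n' > "$demo/a.xml"
printf '<bye/>\n'               > "$demo/sub/b.xml"
printf '// hello\n'             > "$demo/sub/C.java"

# List only the .xml files that contain "hello".
# "{} +" hands the collected file names to a single grep call where possible.
matches=$(cd "$demo" && find . -type f -name '*.xml' -exec grep -l 'hello' {} +)
echo "$matches"   # prints ./a.xml

# Clean up the temporary tree.
rm -rf "$demo"
```

C.java also contains "hello" but is not listed, because -name "*.xml" filters it out before grep ever reads it.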
0
On

OK, so the problem is that XML is not plain text, however similar it looks. It is therefore not really suitable for 'conventional' grepping.

Can I suggest having a look at xml_grep, a utility that comes with the XML::Twig package for this purpose?

Or if you're able to give more specific examples of what your source content and desired outputs are, we can give more specific answers.

Other than that, I wouldn't do a recursive grep but rather a find -exec. find lets you filter files first and is quite efficient, but there is no getting around the fact that every matching file has to be read in order to check it.

0
On

This works for me, searching *.xml and *.java files with GNU grep:

grep --include=\*.{xml,java} -rl '/path' -e 'hello'

In your question you also passed the -w flag, which makes grep match whole words only.
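
One possible reason the command in the question finds nothing, offered as a hedged sketch rather than a confirmed diagnosis: bash expands braces only when they contain a comma-separated list or a range, so \*.{java} reaches grep literally while \*.{xml,java} becomes two --include options. The example below assumes bash and GNU grep are installed; the file names are hypothetical.

```shell
# Show what bash actually passes to grep in each case.
single=$(bash -c 'echo --include=\*.{java}')      # braces stay literal
multi=$(bash -c 'echo --include=\*.{xml,java}')   # expands to two options
echo "$single"   # prints --include=*.{java}
echo "$multi"    # prints --include=*.xml --include=*.java

# Hypothetical files to search.
demo=$(mktemp -d)
printf '<hello>world</hello>\n'  > "$demo/a.xml"
printf 'String s = "<hello>";\n' > "$demo/B.java"

# A plain glob for a single extension matches as intended.
by_ext=$(grep -rl --include='*.java' -e '<hello>' "$demo")
# The literal pattern *.{java} matches no file name, so nothing is searched.
literal=$(grep -rl --include='*.{java}' -e '<hello>' "$demo" || true)

echo "plain glob:    $by_ext"
echo "literal brace: $literal"
```

For a single extension, writing --include='*.java' without braces avoids the problem.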