I have some 5 million text files under a directory, all of the same format (nothing special, just plain text files with some integers on each line). I would like to compute the maximum and minimum line counts among all these files, along with the corresponding filenames (one for the max and another for the min).
I started out by trying to write out all the line counts like so (and then work out how to find the min and max from this list):
wc -l `find /some/data/dir/with/text/files/ -type f` > report.txt
but this throws me an error:
bash: /usr/bin/wc: Argument list too long
Perhaps there is a better way to go about this?
There is a limit on the length of the argument list. Since you are passing several million files to wc, the command certainly exceeded this limit.
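If you are curious how large that limit is on your machine, a quick check along these lines should work. This is only a sketch: it assumes a POSIX system where getconf knows ARG_MAX, and the variable name arg_max is just for illustration.

```shell
#!/bin/sh
# ARG_MAX is the maximum combined size, in bytes, of the arguments and
# environment that can be passed to a newly executed program.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX is $arg_max bytes"
```

Millions of full path names easily add up to more than this value, which is why the shell cannot start wc at all.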
It is better to invoke find -exec COMMAND {} + instead.
With this form, each file that find locates is appended to the argument list of the command following -exec, in place of {}. Before the argument length limit is reached, the command is run, and the remaining files are processed in further runs of the command in the same way, until the whole list is done.
See the man page of find for more details.
Thanks to Charles Duffy for the improvements to this answer.
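As a rough sketch of how the batching behaviour described above could answer the original question, find can hand the files to wc -l in batches and awk can track the smallest and largest counts as the output streams by. The function name line_count_extremes and the demo directory are made up for illustration, and the sketch assumes that no filename contains a newline character.

```shell
#!/bin/sh
# line_count_extremes DIR   (hypothetical helper name)
# Prints the file with the fewest lines and the file with the most lines
# under DIR. Assumes no filename contains a newline character.
line_count_extremes() {
    find "$1" -type f -exec wc -l {} + |
    awk '
        # wc prints a "total" summary line for each batch of several files.
        # Paths printed by find start with the directory given, so a bare
        # "total" in the second field is treated as a summary and skipped.
        NF == 2 && $2 == "total" { next }
        {
            count = $1 + 0                        # force a numeric comparison
            name = $0
            sub(/^[ \t]*[0-9]+[ \t]+/, "", name)  # drop the count, keep the path (spaces allowed)
            if (!seen || count < min) { min = count; minfile = name }
            if (!seen || count > max) { max = count; maxfile = name }
            seen = 1
        }
        END {
            if (seen) {
                print "min", min, minfile
                print "max", max, maxfile
            }
        }'
}

# Small demonstration on a throwaway directory.
demo=$(mktemp -d)
printf '1\n2\n3\n' > "$demo/a.txt"
printf '1\n' > "$demo/b.txt"
printf '1\n2\n' > "$demo/c d.txt"
line_count_extremes "$demo"
rm -rf "$demo"
```

For the real data you would call line_count_extremes /some/data/dir/with/text/files/ and redirect the output wherever you like. Because awk only keeps the current minimum and maximum, nothing close to the full list of 5 million counts has to be held in memory or written to report.txt first.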