Linux disk usage of files that match expression

I have a mounted hard drive shared by several users, laid out like this:

/HDD1/user1
/HDD1/user2
/HDD1/user3

I'd like to look in each user's folder, find all the files that match an expression (say, "*.txt"), sum the space used by those files, and report it on a per-user basis:

user1: x bytes
user2: y bytes
user3: z bytes

I found the directories of all the files with:

find /HDD1/ -name "*.txt" | rev | cut -d"/" -f2- | rev | sort -u > txtfiles.dat

I thought I'd use a loop to go through each line in txtfiles.dat calculating the disk usage in each folder, but this seems very cumbersome. Is there a neater way to do this? Something like a du that looks in each user's folder but only counting the files that match an expression?

find /HDD1/ -name "*.txt" -print0 | du -ch --files0-from -

Explanation: the find command takes care of filtering the file set, and specifying -print0 makes the output NUL-terminated.

The find output is piped to du. Specifying --files0-from - tells du to read NUL-terminated file paths from standard input. Finally, -c adds a grand total line and -h prints the sizes in human-readable form.

If all you want is the total, you can pipe the result to tail -1:

find /HDD1/ -name "*.txt" -print0 | du -ch --files0-from - | tail -1
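
Since the question asks for bytes per user, the same pipeline can be pointed at one user's folder at a time. Here is a minimal sketch, assuming GNU find and GNU du (the -b option, which reports apparent size in bytes, is a GNU extension). The function name txt_bytes is made up for illustration:

```shell
# Hypothetical helper: total apparent size, in bytes, of the *.txt files
# under the directory given as $1.
# Assumes GNU find and GNU du (-b and --files0-from are GNU extensions).
txt_bytes() {
  find "$1" -type f -name '*.txt' -print0 |
    du -cb --files0-from=- |   # -c: grand total, -b: apparent size in bytes
    tail -n 1 |                # keep only the "total" line
    cut -f1                    # keep only the number before the tab
}
```

Calling txt_bytes /HDD1/user1 would then print a single number for that user.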

du accepts a list of files and reports the size of each one; the -a option additionally lists every file inside any directories in that list, and -c appends a grand total.

In bash, $(cmd) is replaced by the output of running cmd.

So, putting all that together, to get the sizes of all the .txt files, one could run:

du -ac $(find . -name '*.txt')
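
Command substitution splits the file list on whitespace, so names containing spaces get miscounted, and a very long list can exceed the argument-length limit. A hedged alternative is to let find hand the names to du directly. The wrapper name txt_du is hypothetical, and with a very large number of files find may run du more than once, each run printing its own total line:

```shell
# Hypothetical wrapper: size every *.txt file under $1 (default: current dir).
# -exec ... {} + passes file names to du without word splitting and batches
# as many names as fit on one command line.
txt_du() {
  find "${1:-.}" -type f -name '*.txt' -exec du -c {} +
}
```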

You can loop through each user and sum the sizes like so:

for userdir in /HDD1/*/; do
  # quote the pattern so the shell does not expand it before find sees it
  find "$userdir" -type f -iname '*.txt' -printf '%s\n' | awk '{s+=$1} END {print s+0}'
done

The outer loop is for the users, then the find does the searching for files and prints their sizes out, and finally awk sums up all the sizes and prints out the result.

GNU find (the default on Linux) can print the size of a file in bytes with -printf '%s\n'. You can then total these sizes, e.g. with awk.

for userdir in /HDD1/*
do find "$userdir" -type f -name '*.txt' -printf '%s\n' |
   awk -v u="$(basename "$userdir")" '
        BEGIN{tot=0}
        {tot+=$1}
        END{print u ": " tot " bytes"}
   '
done

This produces the output format you asked for (user1: x bytes). If the byte counts are unwieldy, divide by 1000 or 1024 to get kilobytes or kibibytes (in the awk script, print tot/1000 instead of tot).
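
To make the unit conversion concrete, here is a small sketch that sums byte counts from standard input and prints the total in bytes, KiB and MiB. The function name sum_human and the output layout are assumptions for illustration, not part of the answer above:

```shell
# Hypothetical helper: read one byte count per line on stdin and print the
# sum in bytes, KiB (1024 bytes) and MiB (1024*1024 bytes).
sum_human() {
  awk '{ tot += $1 }
       END { printf "%.0f bytes = %.1f KiB = %.2f MiB\n", tot, tot/1024, tot/1048576 }'
}
```

It can take the place of the awk stage, e.g. find "$userdir" -type f -name '*.txt' -printf '%s\n' | sum_human.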

I pulled together bits and pieces from all your answers into this:

for n in /HDD1/*
do
  uname=$(basename "$n")
  space=$(find "$n" -name "*.txt" -print0 | du -ch --files0-from - | tail -1)
  echo "$uname: $space"
done

This seems to work fine. Thanks for all your input.
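
For comparison, the per-user totals can probably also be produced with a single find pass instead of one find per user. This is only a sketch, assuming GNU find (for -printf and %P) and user folder names that contain no "/" or newline characters. The function name txt_bytes_per_user is hypothetical:

```shell
# Hypothetical helper: one find pass over the whole mount, grouped by the
# top-level (per-user) folder.
# Assumes GNU find; file names containing newlines would confuse awk.
txt_bytes_per_user() {
  root=${1%/}                      # strip a trailing slash: /HDD1/ -> /HDD1
  # %s = size in bytes, %P = path relative to $root (e.g. user1/docs/a.txt)
  find "$root" -mindepth 2 -type f -name '*.txt' -printf '%s/%P\n' |
    awk -F/ '{ tot[$2] += $1 }     # $1 = size, $2 = user folder name
             END { for (u in tot) printf "%s: %.0f bytes\n", u, tot[u] }' |
    sort
}
```

Running txt_bytes_per_user /HDD1 prints one "user: N bytes" line per folder; users with no matching files are simply not listed.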