Requirement: get the count of directories under the input directory that match the following criteria:
- the directories can have any name except "DIR1", "DIR2", "DIR3", etc.
- the directories inside "DIR1", "DIR2", "DIR3", etc. must not be counted
- only directories are counted, not files
use strict;
use File::Find;
my ($inputdir) = @ARGV;
my (@branches, $branch, $directory, @directories);
my $count = 0;
find(\&wanted, $inputdir);
while ( defined($directory = shift @directories) ) {
    if (-d $directory){
        next if ($directory =~ "DIR1" || $directory =~ "DIR2" || $directory =~ "DIR3");
        push @branches, $directory;
        $count++;
    }
}
print "Total number of directories: $count \n";
sub wanted{
    push @directories, $File::Find::name;
    return @directories;
}
This piece of code is giving the required output but it's taking quite a lot of time.
Please suggest ways to reduce the time taken by improving this code.
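File::Find::Rule can skip whole branches altogether. Your script walks everything, including the contents of "DIR1", "DIR2", "DIR3", and only filters the paths afterwards; pruning means those branches are never descended into at all.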
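A minimal sketch of that pruning, assuming File::Find::Rule is installed from CPAN (it is not a core module). The names find_dirs and @exclude_dirs are made up for illustration. Note that name() here matches the directory's own name exactly, whereas the regex in the question matches "DIR1" anywhere in the full path.

```perl
use strict;
use warnings;
use File::Find::Rule;

# Hypothetical exclusion list; extend it to cover the "etc." in the requirement.
my @exclude_dirs = qw(DIR1 DIR2 DIR3);

# Return the directories under $inputdir, never descending into excluded ones.
sub find_dirs {
    my ($inputdir) = @_;

    my $rule = File::Find::Rule->new;
    $rule->or(
        # Excluded names: prune (do not descend) and discard (do not report).
        $rule->new->directory->name(@exclude_dirs)->prune->discard,
        # Any other directory is kept.
        $rule->new->directory,
    );
    return $rule->in($inputdir);
}

if (@ARGV) {
    my @dirs = find_dirs($ARGV[0]);
    print "Total number of directories: ", scalar(@dirs), "\n";
}
```

Files are never pushed onto a list here, so the separate -d pass over every path in the question's script goes away as well.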
This will still take some time on a large filesystem, but it should be much faster.
If all you need from this is a quick count, it fits in a one-liner, with some of the code from the script consolidated.
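One possible consolidated form, again assuming File::Find::Rule from CPAN and the three literal names from the question; count_dirs_quick is a made-up name. The shell form is given in the leading comment, and the function below it is the same expression wrapped so it can be called from a script.

```perl
# Shell form, typed on a single line:
#   perl -MFile::Find::Rule -wE 'my $r = File::Find::Rule->new;
#     my @d = $r->or($r->new->directory->name(qw(DIR1 DIR2 DIR3))->prune->discard,
#                    $r->new->directory)->in(shift);
#     say scalar @d' /path/to/inputdir
use strict;
use warnings;
use File::Find::Rule;

# Count directories under $inputdir, skipping the excluded branches entirely.
sub count_dirs_quick {
    my ($inputdir) = @_;
    my $r = File::Find::Rule->new;
    my @dirs = $r->or(
        $r->new->directory->name(qw(DIR1 DIR2 DIR3))->prune->discard,
        $r->new->directory,
    )->in($inputdir);
    return scalar @dirs;
}
```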
The next step would be parallel execution (I'd use fork here). Group subdirectories so that they are roughly balanced in their sub-counts and run something like the above in parallel over those groups. The gain will depend on your hardware but there should be a good speedup factor.
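A rough sketch of the fork approach, under several assumptions: it uses core File::Find with $File::Find::prune instead of File::Find::Rule so that it has no CPAN dependency, it groups the top-level subdirectories round-robin (a stand-in for real balancing by sub-count), and the names count_dirs, parallel_count and %exclude are invented for illustration. Each child writes its count to a pipe and the parent adds them up.

```perl
use strict;
use warnings;
use File::Find;
use POSIX ();

# Hypothetical exclusion list, matched against the directory's own name.
my %exclude = map { $_ => 1 } qw(DIR1 DIR2 DIR3);

# Count directories under the given start directories, pruning excluded ones.
sub count_dirs {
    my @tops = @_;
    return 0 unless @tops;
    my $count = 0;
    find(sub {
        return unless -d $_;
        if ($exclude{$_}) { $File::Find::prune = 1; return }  # do not descend
        $count++;
    }, @tops);
    return $count;
}

# Split the top-level subdirectories into $workers groups, count each group
# in its own child process, and sum the results in the parent.
sub parallel_count {
    my ($inputdir, $workers) = @_;
    $workers ||= 4;

    opendir my $dh, $inputdir or die "Cannot open $inputdir: $!";
    my @subdirs = map  { "$inputdir/$_" }
                  grep { !/^\.\.?\z/ && !$exclude{$_} && -d "$inputdir/$_" }
                  readdir $dh;
    closedir $dh;

    # Round-robin grouping; a real run would balance groups by expected size.
    my @groups;
    push @{ $groups[ $_ % $workers ] }, $subdirs[$_] for 0 .. $#subdirs;

    my @readers;
    for my $group (@groups) {
        pipe(my $r, my $w) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                       # child: count and report
            close $r;
            print {$w} count_dirs(@$group), "\n";
            close $w;
            POSIX::_exit(0);
        }
        close $w;                              # parent keeps the read end only
        push @readers, $r;
    }

    my $total = 1;   # the input directory itself, which find() also visits
    for my $r (@readers) {
        my $n = <$r>;
        close $r;
        chomp $n if defined $n;
        $total += $n || 0;
    }
    1 while wait() != -1;                      # reap all children
    return $total;
}

if (@ARGV) {
    my ($inputdir, $workers) = @ARGV;
    print "Total number of directories: ", parallel_count($inputdir, $workers), "\n";
}
```

If the tree is dominated by one huge subdirectory, round-robin grouping leaves one child doing most of the work, so that large subdirectory would need to be split one level further down.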