Perl: combine MD5 / SHA2 sums from multiple files into a single MD5 / SHA2 sum

Below is the code that recursively generates the MD5 / SHA2 sum of each individual file under a directory and its subdirectories.

#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use Digest::SHA;

find({ wanted => \&process_file, no_chdir => 1 }, @ARGV);

# Print the SHA-256 digest of every regular file found under the given directories.
# For MD5 instead, add "use Digest::MD5;" and replace Digest::SHA->new(256) with Digest::MD5->new.
sub process_file {
    return unless -f $_;
    # Three-argument open with a lexical handle; binmode so the bytes are hashed unchanged.
    open(my $fh, '<', $_) or die "Cannot open $_: $!";
    binmode($fh);
    my $sha2sum = Digest::SHA->new(256)->addfile($fh)->hexdigest;
    close $fh;
    print "$sha2sum  $_\n";
}

The output of the above code is given below.

~$ perl list.pl src
f21e1caa364eaad195d968d28187d5cf1a58c0b7b1f21a8ebcb9ca2539dde175  src/test1.pl
4b3277ec41ba0ff8ed6f9f2593c42e08c2f4e9b66df0d63de7c91559ff7e86fa  src/random.py
076231fcbe5887a163278b757f99fb05b27163775ec4706cb2365de3be0906ac  src/test.pl
8806c9f58fc91b2e1d6453a7af7e4f9f8b94e2d0f67a84a89b35bfbf517399be  src/size.pl
5a1b2080ecc53ced45ed3aa13e47118a9ca2f8505b1e89485b6b681d8e1d264c  src/test2.py
5f7c1ff9c7b3dd32f75558dd30324ec085c45a0d0c62190b9a96f211cdf216ea  src/java/test3.class
3728ee1a86443fffe9eafd84db82ce68c9640a0a984958f579b0da1a74283d7c  src/java/test4.wav
d7169ffbb231e93f47d1c54fddf2144b459bba228de48c30b4bc5a4d297be6fb  src/java/test5.java

Updated code to support sha256sum generation.

Now I want to generate a single combined MD5 / SHA2 sum, using these per-file MD5 / SHA2 sums as input.

There are 2 best solutions below


Try:

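use strict;
use warnings;
use File::Find 'find';
use Digest::SHA 'sha256_hex';

my @allsums;

# Collect one "digest path" line per regular file ('b' reads the file in binary mode).
sub process_file {
  push @allsums, Digest::SHA->new(256)->addfile($_, 'b')->hexdigest . " $_" if -f $_;
}

find({ wanted => \&process_file, no_chdir => 1 }, @ARGV);

# Sort so the result does not depend on traversal order, then hash the joined lines.
print sha256_hex(join ':', sort @allsums), "\n";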

  1. Digest::MD5 was first released as a core module with perl v5.7.3 (March 2002) [1]. The oldest version of perl still in wide use today is v5.8.8, so any perl you are likely to encounter will have this module available.

  2. The oldest version of Digest::MD5 which I could find (v1.99.59-TRIAL from 1998) already has the add and addfile methods. So whatever version of that module you encounter, you will have the add method available.

You can therefore safely rely on that functionality, instead of having to use some ugly and unportable hack like calling a command line tool.

Make sure that you traverse each directory in a specific order so that the checksum is reproducible.
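
As a rough sketch of that idea (one digest object, fed every file in a fixed order), something like the following might work. It uses Digest::SHA rather than Digest::MD5. The helper name combined_digest is hypothetical, and mixing each path into the digest is an assumption about what should count as a change; drop that line if only the file contents matter.

```perl
use strict;
use warnings;
use File::Find;
use Digest::SHA;

# Hypothetical helper: one SHA-256 digest over all regular files under @dirs.
sub combined_digest {
    my @dirs = @_;
    my @files;
    # Collect the paths first, so the order can be fixed afterwards.
    find({ wanted => sub { push @files, $_ if -f $_ }, no_chdir => 1 }, @dirs);

    my $sha = Digest::SHA->new(256);
    for my $file (sort @files) {      # sorted order keeps the result reproducible
        $sha->add("$file\0");         # assumption: the path is part of the identity
        $sha->addfile($file, 'b');    # 'b' reads the file in binary mode
    }
    return $sha->hexdigest;
}

print combined_digest(@ARGV), "\n" if @ARGV;
```

Because the paths are hashed as given, the result depends on how the directory is named on the command line (src versus ./src, for example). With Digest::MD5, addfile expects an open file handle rather than a file name, so the loop would need an explicit open and binmode.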

Note that MD5 is an effectively broken algorithm, which shouldn't be used except to interface with legacy systems. The SHA-2 family of hash functions is preferable for most tasks where a fast hash is required.


[1] Use the corelist command-line tool from Module::CoreList to query the core modules of different perl versions.
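
The same lookup can also be done from within Perl. A minimal sketch, assuming Module::CoreList is installed; the choice of modules to query is just for illustration.

```perl
use strict;
use warnings;
use Module::CoreList;

# Print the first perl release that bundled each module.
for my $module (qw(Digest::MD5 Digest::SHA)) {
    my $release = Module::CoreList->first_release($module);
    printf "%-12s first shipped with perl %s\n", $module, $release // 'unknown';
}
```

On the command line, running corelist Digest::MD5 reports the same information.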