Algorithms for visual representations of checksums (e.g. SHA)

439 Views Asked by At

I want to generate something like VisualHostKey for a SHA checksum. But it should work with any hexadecimal checksum.

The generated artifact could be an ASCII art, a 2D colour palette, or just some random garbage in a PNG. Personally I like the VisualHostKey approach but I am open for suggestions.

The idea is to be able to quickly identify that two checksums are the same using just the human eye. And when faced with a bunch of sums, quickly find the one you are looking for.

2

There are 2 best solutions below

0
On

You could just use the actual OpenSSH VisualHostKey code, which is found in the key_fingerprint_randomart() function in the key.c file in the OpenSSH source code. The algorithm is fairly simple, and can take any array of bytes as input. In OpenSSH, the input is a cryptographic hash of the key; you could do the same.

(As defined in the OpenSSH source code, the function also takes a pointer to the key structure itself, but that's only used to print the type and size of the key at the top of the picture.)

In fact, since the code is freely licensed, let me just include a copy here. This is extracted from OpenSSH 6.1, $OpenBSD: key.c,v 1.99 2012/05/23 03:28:28 djm Exp $:

/*
 * Copyright (c) 2000, 2001 Markus Friedl.  All rights reserved.
 * Copyright (c) 2008 Alexander von Gernler.  All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
 * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
 * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
 * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

/*
 * Draw an ASCII-Art representing the fingerprint so human brain can
 * profit from its built-in pattern recognition ability.
 * This technique is called "random art" and can be found in some
 * scientific publications like this original paper:
 *
 * "Hash Visualization: a New Technique to improve Real-World Security",
 * Perrig A. and Song D., 1999, International Workshop on Cryptographic
 * Techniques and E-Commerce (CrypTEC '99)
 * sparrow.ece.cmu.edu/~adrian/projects/validation/validation.pdf
 *
 * The subject came up in a talk by Dan Kaminsky, too.
 *
 * If you see the picture is different, the key is different.
 * If the picture looks the same, you still know nothing.
 *
 * The algorithm used here is a worm crawling over a discrete plane,
 * leaving a trace (augmenting the field) everywhere it goes.
 * Movement is taken from dgst_raw 2bit-wise.  Bumping into walls
 * makes the respective movement vector be ignored for this turn.
 * Graphs are not unambiguous, because circles in graphs can be
 * walked in either direction.
 */

/*
 * Field sizes for the random art.  Have to be odd, so the starting point
 * can be in the exact middle of the picture, and FLDBASE should be >=8 .
 * Else pictures would be too dense, and drawing the frame would
 * fail, too, because the key type would not fit in anymore.
 */
#define FLDBASE         8
#define FLDSIZE_Y       (FLDBASE + 1)
#define FLDSIZE_X       (FLDBASE * 2 + 1)
static char *
key_fingerprint_randomart(u_char *dgst_raw, u_int dgst_raw_len, const Key *k)
{
        /*
         * Chars to be used after each other every time the worm
         * intersects with itself.  Matter of taste.
         */
        char    *augmentation_string = " .o+=*BOX@%&#/^SE";
        char    *retval, *p;
        u_char   field[FLDSIZE_X][FLDSIZE_Y];
        u_int    i, b;
        int      x, y;
        size_t   len = strlen(augmentation_string) - 1;

        retval = xcalloc(1, (FLDSIZE_X + 3) * (FLDSIZE_Y + 2));

        /* initialize field */
        memset(field, 0, FLDSIZE_X * FLDSIZE_Y * sizeof(char));
        x = FLDSIZE_X / 2;
        y = FLDSIZE_Y / 2;

        /* process raw key */
        for (i = 0; i < dgst_raw_len; i++) {
                int input;
                /* each byte conveys four 2-bit move commands */
                input = dgst_raw[i];
                for (b = 0; b < 4; b++) {
                        /* evaluate 2 bit, rest is shifted later */
                        x += (input & 0x1) ? 1 : -1;
                        y += (input & 0x2) ? 1 : -1;

                        /* assure we are still in bounds */
                        x = MAX(x, 0);
                        y = MAX(y, 0);
                        x = MIN(x, FLDSIZE_X - 1);
                        y = MIN(y, FLDSIZE_Y - 1);

                        /* augment the field */
                        if (field[x][y] < len - 2)
                                field[x][y]++;
                        input = input >> 2;
                }
        }

        /* mark starting point and end point*/
        field[FLDSIZE_X / 2][FLDSIZE_Y / 2] = len - 1;
        field[x][y] = len;

        /* fill in retval */
        snprintf(retval, FLDSIZE_X, "+--[%4s %4u]", key_type(k), key_size(k));
        p = strchr(retval, '\0');

        /* output upper border */
        for (i = p - retval - 1; i < FLDSIZE_X; i++)
                *p++ = '-';
        *p++ = '+';
        *p++ = '\n';

        /* output content */
        for (y = 0; y < FLDSIZE_Y; y++) {
                *p++ = '|';
                for (x = 0; x < FLDSIZE_X; x++)
                        *p++ = augmentation_string[MIN(field[x][y], len)];
                *p++ = '|';
                *p++ = '\n';
        }

        /* output lower border */
        *p++ = '+';
        for (i = 0; i < FLDSIZE_X; i++)
                *p++ = '-';
        *p++ = '+';

        return retval;
}

It doesn't seem to have an significant dependencies on the rest of the OpenSSH code, aside from the const Key *k parameter, which is only used on one line as an argument to the key_type() and key_size() functions (or macros?). The nonstandard types u_char and u_int appear to be just aliases for unsigned char and unsigned int respectively, and the xcalloc() function seems to be just a replacement or wrapper for the standard calloc().

2
On

Generally this is accomplished by making an image generation function of some kind which accepts a seed. You then hash some data and then seed your image generator with the result. This will prevent it from making random garbage in a PNG and will give you something distinguishable.