How to create an efficient encode/decode for a unique ID in PHP


I am trying to find a way to encode a database ID into a short URL, e.g. 1 should become "Ys47R". Then I would like to decode it back from "Ys47R" to 1 so I can run a database search using the INT value. It needs to be unique, based on the database ID. The sequence should not be easily guessable, such as 1 = "Ys47R", 2 = "Ys47S". It should be something along the lines of YouTube's or bitly's URLs. I have read up on hundreds of different sources using md5, base32, base64 and bcpow but have come up empty.

This blog post looked promising, but once I added padding and a passkey, short IDs such as 1 became "SDDDG", 2 became "SDDDH" and 3 became "SDDDI". It is not very random.

base32 used only a-z and 0-9, and base64 had characters such as == on the end.

I then tried this:

function getRandomString($db, $length = 7) {

    $validCharacters = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    $validCharNumber = strlen($validCharacters);
    $result = "";

    // Pick $length characters at random from the allowed set.
    for ($i = 0; $i < $length; $i++) {
        $index = mt_rand(0, $validCharNumber - 1);
        $result .= $validCharacters[$index];
    }

    return $result;
}

This worked, but it meant I had to run a database query every time to make sure the generated string did not already exist in the database.

Is there a way I can create short IDs that are 4 characters minimum, with a charset of [a-z][A-Z][0-9], that can be encoded and decoded back, using an auto-increment ID in a database where each number is unique? I can't get my head around advanced techniques using base32 or base64.

Or am I looking into this too much and there is an easier way to do it? Would it be best to do the random string function above and query the database to check for uniqueness all the time?


There are 2 best solutions below


You could use the convBase() function from the comments on the PHP manual page for base_convert(): http://php.net/manual/en/function.base-convert.php#106546

$initial = '11111111';
$dic = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
var_dump($converted = convBase($initial, '0123456789', $dic)); 
// string(4) "KCvt"
var_dump(convBase($converted, $dic, '0123456789')); 
// string(8) "11111111"

function convBase($numberInput, $fromBaseInput, $toBaseInput)
{
    // Identical alphabets: nothing to convert.
    if ($fromBaseInput === $toBaseInput) return $numberInput;
    $fromBase = str_split($fromBaseInput, 1);
    $toBase = str_split($toBaseInput, 1);
    $number = str_split($numberInput, 1);
    $fromLen = strlen($fromBaseInput);
    $toLen = strlen($toBaseInput);
    $numberLen = strlen($numberInput);
    $retval = '';
    // Target is decimal: sum digit * base^position with bcmath (arbitrary precision).
    if ($toBaseInput === '0123456789')
    {
        $retval = '0';
        for ($i = 1; $i <= $numberLen; $i++)
            $retval = bcadd($retval, bcmul((string) array_search($number[$i-1], $fromBase), bcpow((string) $fromLen, (string) ($numberLen - $i))));
        return $retval;
    }
    // Otherwise go through decimal first, then emit digits of the target alphabet.
    if ($fromBaseInput !== '0123456789')
        $base10 = convBase($numberInput, $fromBaseInput, '0123456789');
    else
        $base10 = $numberInput;
    if ($base10 < strlen($toBaseInput))
        return $toBase[(int) $base10];
    while ($base10 != '0')
    {
        // Prepend the least significant digit, then divide it away.
        $retval = $toBase[(int) bcmod($base10, (string) $toLen)] . $retval;
        $base10 = bcdiv($base10, (string) $toLen, 0);
    }
    return $retval;
}
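
For IDs that fit in a native PHP integer, the same alphabet conversion can be sketched without bcmath. The names `encodeId`, `decodeId` and `ALPHABET` are made up for this illustration, and the code assumes non-negative integers. Like `convBase`, it only changes the representation; consecutive IDs still produce consecutive-looking strings.

```php
<?php
// Same digit order as $dic above: values 0-61.
const ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Hypothetical helper: encode a non-negative integer in base 62.
function encodeId(int $id): string
{
    if ($id < 0) {
        throw new InvalidArgumentException('ID must be non-negative');
    }
    $alphabet = ALPHABET;
    $base = strlen($alphabet);
    $out = '';
    do {
        $out = $alphabet[$id % $base] . $out; // prepend the least significant digit
        $id = intdiv($id, $base);
    } while ($id > 0);
    return $out;
}

// Hypothetical helper: decode a base-62 string back to the integer.
function decodeId(string $code): int
{
    if ($code === '') {
        throw new InvalidArgumentException('Code must not be empty');
    }
    $alphabet = ALPHABET;
    $base = strlen($alphabet);
    $id = 0;
    foreach (str_split($code) as $char) {
        $pos = strpos($alphabet, $char);
        if ($pos === false) {
            throw new InvalidArgumentException("Invalid character: $char");
        }
        $id = $id * $base + $pos; // shift previous digits up by one position
    }
    return $id;
}

var_dump(encodeId(11111111));           // string(4) "KCvt"
var_dump(decodeId(encodeId(11111111))); // int(11111111)
```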

If you want some symmetric obfuscation, then base_convert() is often sufficient.

base_convert($id, 10, 36);

This returns strings like 1i0g, and base_convert($str, 36, 10) converts them back.
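
A minimal round trip might look like the sketch below. The value 70000 is just an example input, and the PHP manual warns that base_convert() can lose precision on large numbers, so this assumes IDs that stay well within integer range.

```php
<?php
$id = 70000;

// Decimal to base 36 (digits 0-9, then a-z).
$code = base_convert((string) $id, 10, 36);

// Base 36 back to decimal; base_convert() returns a string, so cast it.
$back = (int) base_convert($code, 36, 10);

var_dump($code); // string(4) "1i0g"
var_dump($back); // int(70000)
```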

Before and after that base conversion you can add:

  • To get a minimum string length, I'd suggest just adding 70000 to your $id. And on the receiving end just subtract that again.

  • A minor multiplication $id *= 3 would add some "holes" in the generated alphanumeric ID range, yet not exhaust the available string space.

  • For some appearance of arbitrariness, a bit of nibble moving:

    $id = ($id & 0xF0F0F0F) << 4    
        | ($id & 0x0F0F0F0) >> 4;
    

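    This works both for generating your obfuscated ID strings and for getting back the original ones, provided the value fits in 24 bits (with these masks, higher bits do not survive the round trip).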

    Just to be crystal clear: this is no encryption of any sort. It just shifts numeric jumps between consecutive numbers, and looks slightly more arbitrary.
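
The three tweaks above could be combined with base_convert() roughly as follows. This is only a sketch: the names `obfuscateId`, `revealId`, `swapNibbles`, `OFFSET` and `FACTOR` are invented here, the order of the steps is one possible choice, and the range check reflects the fact that the nibble swap with these masks only reverses itself for values below 2^24. Adding the offset last keeps every result at 4 base-36 characters or more.

```php
<?php
const OFFSET = 70000; // pushes every value to at least 4 base-36 digits
const FACTOR = 3;     // leaves "holes" between consecutive IDs

// Swap the two nibbles of each byte, using the masks from the snippet above.
// With these masks the swap only undoes itself for values below 2^24.
function swapNibbles(int $n): int
{
    return ($n & 0xF0F0F0F) << 4
         | ($n & 0x0F0F0F0) >> 4;
}

// Hypothetical helper: multiply, swap nibbles, add the offset, convert to base 36.
function obfuscateId(int $id): string
{
    $n = $id * FACTOR;
    if ($id < 0 || $n >= (1 << 24)) {
        throw new RangeException('ID out of range for this sketch');
    }
    return base_convert((string) (swapNibbles($n) + OFFSET), 10, 36);
}

// Hypothetical helper: undo the steps of obfuscateId() in reverse order.
function revealId(string $code): int
{
    $n = (int) base_convert($code, 36, 10) - OFFSET;
    return intdiv(swapNibbles($n), FACTOR);
}

foreach ([1, 2, 3] as $id) {
    $code = obfuscateId($id);
    echo $id, ' => ', $code, ' => ', revealId($code), PHP_EOL;
}
// 1 => 1i1s => 1
// 2 => 1i34 => 2
// 3 => 1i4g => 3
```

The output shows the limits of the approach: neighbouring IDs no longer differ by a single trailing character, but the pattern is still easy to reverse-engineer.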

You still may not like the answer, but generating random IDs in your database is the only approach that really hinders ID guessing.
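
A rough sketch of that random approach is shown below. It uses random_int() instead of mt_rand(), and the `$taken` array is only an in-memory stand-in for a lookup against a uniquely indexed database column. The helper names `randomCode` and `uniqueCode` are hypothetical.

```php
<?php
// Hypothetical helper: draw a random code from [0-9a-zA-Z] using random_int().
function randomCode(int $length = 7): string
{
    $chars = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $max = strlen($chars) - 1;
    $code = '';
    for ($i = 0; $i < $length; $i++) {
        $code .= $chars[random_int(0, $max)];
    }
    return $code;
}

// Hypothetical helper: retry until the code is unused, then record it.
// $taken stands in for a check against a unique column in the database.
function uniqueCode(array &$taken, int $length = 7): string
{
    do {
        $code = randomCode($length);
    } while (isset($taken[$code]));
    $taken[$code] = true;
    return $code;
}

$taken = [];
$first = uniqueCode($taken);
$second = uniqueCode($taken);
var_dump($first !== $second); // bool(true)
```

With 7 characters from a 62-character set, collisions are rare, so the retry loop would seldom run more than once. In a real table, a unique index on the code column could enforce the same guarantee.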