Transforming sequential integers to non-sequential 4-character tokens is fairly easy. If you use a reversible algorithm, then you can also easily transform these tokens back into sequential integers that could be used to retrieve URLs from a database.
Note: If you're planning to open up a public URL shortening service, then tokens of just 4 alphanumeric characters (62^4 = 14,776,336 possible values) could be exhausted rather quickly. But for a personal or company website, they should be more than adequate. The method described below will also work for longer tokens, but if you really need a system that can store billions (or trillions) of URLs then you'll need to think more carefully about how you're going to organize all this data.
A linear congruential generator (LCG) is a good way of obfuscating numbers. If you're working with the range from 0 to 62^4−1, then your modulus m will be 62^4. For the generator to visit every value in that range exactly once, a−1 must be divisible by every prime factor of m, and also by 4 if m is divisible by 4 (the Hull-Dobell theorem). Since the prime factors of m are 2 and 31, and m is divisible by 4, the multiplier a has to be one more than a multiple of 124. The value of c can be any non-zero value that is relatively prime to m. For example:
function lcg($n) {
    # (10345073 - 1) % 124 == 0
    $m = 14776336; # = 62**4
    $a = 10345073;
    $c = 8912423;
    # $n * $a can reach about 1.5e14, so this needs 64-bit integers
    $n = ($n * $a + $c) % $m;
    return $n;
}
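If you want to pick your own values for a and c, it may help to check them against the conditions above before using them. The following is only a sketch: the helper names gcd(), prime_factors() and has_full_period() are made up for this example, and the trial-division factoring is only meant for moduli of roughly this size.

```php
<?php
// Greatest common divisor via the Euclidean algorithm.
function gcd(int $x, int $y): int {
    while ($y !== 0) {
        [$x, $y] = [$y, $x % $y];
    }
    return $x;
}

// Distinct prime factors of $m by trial division (fine for small $m).
function prime_factors(int $m): array {
    $factors = [];
    for ($p = 2; $p * $p <= $m; $p++) {
        if ($m % $p === 0) {
            $factors[] = $p;
            while ($m % $p === 0) {
                $m = intdiv($m, $p);
            }
        }
    }
    if ($m > 1) {
        $factors[] = $m;
    }
    return $factors;
}

// True if n -> (a*n + c) % m visits every value in 0..m-1 exactly once.
function has_full_period(int $a, int $c, int $m): bool {
    if (gcd($c, $m) !== 1) {
        return false;               // c must be relatively prime to m
    }
    foreach (prime_factors($m) as $p) {
        if (($a - 1) % $p !== 0) {
            return false;           // a-1 must be divisible by every prime factor of m
        }
    }
    if ($m % 4 === 0 && ($a - 1) % 4 !== 0) {
        return false;               // ...and by 4 when m is divisible by 4
    }
    return true;
}

var_dump(has_full_period(10345073, 8912423, 62 ** 4)); // bool(true)
```

A full period is what makes the mapping one-to-one: no two IDs can end up with the same token.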
The inverse function is fairly similar. Instead of a, it uses its modular multiplicative inverse (mod m), and instead of c, it uses m−c:
function lcg_inv($n) {
    # (10345073 * 5661345) % (62**4) == 1
    $m = 14776336; # = 62**4
    $a_ = 5661345;
    $c_ = 5863913; # = $m-8912423
    $n = (($n + $c_) * $a_) % $m;
    return $n;
}
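If you change the multiplier, you'll need its modular inverse as well. One way to get it is the extended Euclidean algorithm. This is a minimal sketch, assuming 64-bit integers; mod_inverse() is a name invented for this example, not a PHP built-in.

```php
<?php
// Modular multiplicative inverse of $a mod $m (extended Euclidean algorithm).
// Returns -1 if no inverse exists, i.e. if gcd($a, $m) != 1.
function mod_inverse(int $a, int $m): int {
    [$old_r, $r] = [$a, $m];
    [$old_s, $s] = [1, 0];
    while ($r !== 0) {
        $q = intdiv($old_r, $r);
        [$old_r, $r] = [$r, $old_r - $q * $r];
        [$old_s, $s] = [$s, $old_s - $q * $s];
    }
    if ($old_r !== 1) {
        return -1;
    }
    return (($old_s % $m) + $m) % $m;   // normalise into 0..m-1
}

$m = 62 ** 4;
$a = 10345073;
$c = 8912423;
$a_inv = mod_inverse($a, $m);   // reproduces the 5661345 used in lcg_inv()
$c_neg = $m - $c;               // reproduces the 5863913 used in lcg_inv()

// Round trip for a few sample values: forward step, then inverse step.
foreach ([0, 1, 2, 12345, $m - 1] as $n) {
    $y = ($n * $a + $c) % $m;
    $back = (($y + $c_neg) * $a_inv) % $m;
    echo "$n -> $y -> $back\n";
}
```

Each line printed should end with the number it started with.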
Since LCGs are quite easy to predict from just a few output values, you can add another layer of obfuscation by randomizing the order of the symbols used to represent these numbers in base 62 (e.g., W3qVL... instead of abcde...):
function int_2_token($n) {
    $alf = 'W3qVLpEKDxn8vzG0SQPfIX2yO51JsHBYCRbouTatZ4hMdlmF67UcNiAgwke9jr';
    $tok = '';
    if ($n < 0 || $n >= 62**4) return ''; # Value out of range
    $n = lcg($n);
    for ($i=0; $i<4; $i++) {
        $r = $n % 62;
        $tok .= $alf[$r];
        $n = ($n - $r) / 62;
    }
    return $tok;
}
function token_2_int($tok) {
    $t = [ '0'=>15, '1'=>26, '2'=>22, '3'=>1,  '4'=>41, '5'=>25, '6'=>48, '7'=>49,
           '8'=>11, '9'=>59, 'A'=>54, 'B'=>30, 'C'=>32, 'D'=>8,  'E'=>6,  'F'=>47,
           'G'=>14, 'H'=>29, 'I'=>20, 'J'=>27, 'K'=>7,  'L'=>4,  'M'=>43, 'N'=>52,
           'O'=>24, 'P'=>18, 'Q'=>17, 'R'=>33, 'S'=>16, 'T'=>37, 'U'=>50, 'V'=>3,
           'W'=>0,  'X'=>21, 'Y'=>31, 'Z'=>40, 'a'=>38, 'b'=>34, 'c'=>51, 'd'=>44,
           'e'=>58, 'f'=>19, 'g'=>55, 'h'=>42, 'i'=>53, 'j'=>60, 'k'=>57, 'l'=>45,
           'm'=>46, 'n'=>10, 'o'=>35, 'p'=>5,  'q'=>2,  'r'=>61, 's'=>28, 't'=>39,
           'u'=>36, 'v'=>12, 'w'=>56, 'x'=>9,  'y'=>23, 'z'=>13 ];
    $n = 0;
    if (!preg_match('/^[a-z0-9]{4}$/i', $tok)) return -1; # Invalid token
    for ($i=3; $i>=0; --$i) {
        $n = $n * 62 + $t[$tok[$i]];
    }
    return lcg_inv($n);
}
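If you use your own shuffled alphabet, you don't have to write the reverse lookup table by hand; it can be derived from the alphabet string. Here is a rough sketch of that idea. The names $lookup and token_digits_to_int() are invented for this example, and the function only undoes the base-62 step (its result would still have to go through lcg_inv()).

```php
<?php
$alf = 'W3qVLpEKDxn8vzG0SQPfIX2yO51JsHBYCRbouTatZ4hMdlmF67UcNiAgwke9jr';

// Map each symbol to its position in the alphabet: 'W' => 0, '3' => 1, ...
$lookup = array_flip(str_split($alf));

// Sanity check: 62 distinct symbols, otherwise two digits would share a symbol.
if (count($lookup) !== 62) {
    throw new RuntimeException('Alphabet must contain 62 distinct symbols');
}

// Decode a token into its base-62 value (least significant symbol first).
function token_digits_to_int(string $tok, array $lookup): int {
    $n = 0;
    for ($i = strlen($tok) - 1; $i >= 0; --$i) {
        $n = $n * 62 + $lookup[$tok[$i]];
    }
    return $n;
}

echo token_digits_to_int('W3WW', $lookup), "\n"; // 62: 'W' is digit 0, '3' is digit 1
```

Whichever alphabet you choose, generate it once and keep it fixed; changing it later would make existing tokens decode to different IDs.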
So when you get a new URL to shorten, insert it into your database with an auto-incrementing ID value, and pass this ID value to int_2_token() to obtain a four-character token to use in the shortened URL. When this shortened URL is requested, pass the token to token_2_int() to recover this ID so you can fetch the original URL.
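The flow just described could look roughly like this. It's only an illustration: UrlStore is a made-up in-memory stand-in for your database table (its IDs start at 0, whereas a real auto-increment column usually starts at 1), and shorten() and resolve() are hypothetical names. The encoder and decoder are passed in as callables, so you can plug in int_2_token and token_2_int. The sketch assumes PHP 7.4 or later.

```php
<?php
// In-memory stand-in for a database table with an auto-incrementing ID.
class UrlStore {
    private array $rows = [];

    // Store a URL and return its new ID.
    public function insert(string $url): int {
        $this->rows[] = $url;
        return count($this->rows) - 1;
    }

    // Fetch a URL by ID, or null if there is no such row.
    public function find(int $id): ?string {
        return $this->rows[$id] ?? null;
    }
}

// Store the URL, then turn its ID into a token.
function shorten(UrlStore $db, callable $encode, string $url): string {
    return $encode($db->insert($url));
}

// Turn a token back into an ID and look up the URL.
// A negative ID from the decoder is treated as "invalid token".
function resolve(UrlStore $db, callable $decode, string $token): ?string {
    $id = $decode($token);
    return $id < 0 ? null : $db->find($id);
}
```

With the functions from this answer, you'd call it as shorten($db, 'int_2_token', $url) and resolve($db, 'token_2_int', $token).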
Note: Don't forget that the set of all four-character tokens includes the entire set of all four-letter words. You will probably want to make sure that your URL shortener doesn't output anything vulgar or offensive.
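One simple way to do that is to check each new token against a blocklist before handing it out. This is a minimal sketch; token_is_acceptable() is a name invented here, and the two words in $blocklist are placeholders for whatever list you decide to use.

```php
<?php
// Hypothetical blocklist; replace the placeholders with the words you want to avoid.
$blocklist = ['darn', 'heck'];

// True if the token contains none of the blocked words (case-insensitive).
function token_is_acceptable(string $tok, array $blocklist): bool {
    foreach ($blocklist as $word) {
        if (stripos($tok, $word) !== false) {
            return false;
        }
    }
    return true;
}

var_dump(token_is_acceptable('DaRn', $blocklist)); // bool(false)
var_dump(token_is_acceptable('W3qV', $blocklist)); // bool(true)
```

If a token is rejected, one option is to leave that ID unused and move on to the next one, so that the mapping between IDs and tokens stays reversible.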