Random Algorithm with adjustable probability

542 Views Asked by At

I'm searching for an algorithm (no matter what programming language, maybe Pseudo-code?) where you get a random number with different probability's.

For example:

A random Generator, which simulates a dice where the chance for a '6' is 50% and for the other 5 numbers it's 10%.

The algorithm should be scalable, because this is my exact problem:

I have a array (or database) of elements, from which i want to select 1 random element. But each element should have a different probability to be selected. So my idea is that every element get a number. And this number divided by the sum of all numbers results the chance for the number to be randomly selected.

Anybody know a good programming language (or library) for this problem? The best solution would be a good SQL Query which delivers 1 random entry. But i would also be happy with every hint or attempt in an other programming language.

2

There are 2 best solutions below

0
On BEST ANSWER

A simple algorithm to achieve it is:

  1. Create an auexillary array where sum[i] = p1 + p2 + ... + pi. This is done only once.
  2. When you draw a number, draw a number r with uniform distribution over [0,sum[n]), and binary search for the first number higher than the uniformly distributed random number. It can be done using binary search efficiently.

It is easy to see that indeed the probability for r to lay in a certain range [sum[i-1],sum[i]), is indeed sum[i]-sum[i-1] = pi
(In the above, we regard sum[-1]=0, for completeness)


For your cube example:

You have:

p1=p2=....=p5 = 0.1
p6 = 0.5

First, calculate sum array:

sum[1] = 0.1
sum[2] = 0.2
sum[3] = 0.3
sum[4] = 0.4
sum[5] = 0.5
sum[6] = 1

Then, each time you need to draw a number: Draw a random number r in [0,1), and choose the number closest to it, for example:

r1 = 0.45 -> element = 4
r2 = 0.8 -> element = 6
r3 = 0.1 -> element = 2
r4 = 0.09 -> element = 1
1
On

An alternative answer. Your example was in percentages, so set up an array with 100 slots. A 6 is 50%, so put 6 in 50 of the slots. 1 to 5 are at 10% each, so put 1 in 10 slots, 2 in 10 slots etc. until you have filled all 100 slots in the array. Now pick one of the slots at random using a uniform distribution in [0, 99] or [1, 100] depending on the language you are using.

The contents of the selected array slot will give you the distribution you want.

ETA: On second thoughts, you don't actually need the array, just use cumulative probabilities to emulate the array:

r = rand(100) // In range 0 -> 99 inclusive.

if (r < 50) return 6;  // Up to 50% returns a 6.
if (r < 60) return 1;  // Between 50% and 60% returns a 1.
if (r < 70) return 2;  // Between 60% and 70% returns a 2.
etc.

You already know what numbers are in what slots, so just use cumulative probabilities to pick a virtual slot: 50; 50 + 10; 50 + 10 + 10; ...

Be careful of edge cases and whether your RNG is 0 -> 99 or 1 -> 100.