I found the following question today that piqued my interest:
Given an array of items where each item has a name and a weight (integer value), select a random item from the array based on the weight.
What this means is that items in the array should be randomly chosen such that the larger the weight, the more likely the item will be picked.
To approach this problem, I will create an array where each element gives us the probability the ith item would be chosen. To build this array, I will first find the total amount of weight amongst all items and divided each item’s weight by this total value. Once I have this array, I will be able to pick a random number between 0.0 (Inclusive) and 1.0 (Exclusive) to determine which item to pick.
My language of choice for this example will be Python. Let’s say I am given the data as an ordered list of Tuples where the first element is the name of the item and the second element is the weight of that item.
items = [('Item1', 4), ('Item2', 3), ('Item3', 9), ('Item4', 7)]
With this array of items, I will first grab the weights and store it in an array such that the ith element in the this array corresponds to the ith element in the items array.
weights = map(lambda item: item, items) # [4, 3, 9, 2]
Once I have this array of weights, it’s trivial to find the total weight for all items.
total = float(sum(weights)) # 23
Note that I’m casting the sum to a float. I will need this total as a float to keep precision for when I divide each weight by this value. If we left the total as an integer, the divide operation for each weight will return 0 since an integer divided by an integer is an integer.
probabilities = map(lambda weight: weight / total, weights) # [0.1739, 0.1304, 0.3913, 0.304] - Values shortened for brevity
Since these are probabilities, these values added together should equal 1.0. Visually, think each of each entry as a slice of a pie. The higher the probability value, the larger the piece of pie is. We now need to determine which slice of pie we get, if we randomly select a piece.
To do this, we can use the Python function random.random(), which will return a value between 0.0 and 1.0. To pick the correct value, I will iterate through this list of normalized weights and add the current item to a total. If the random value is less than the total, then we have our random item.
random_val = random.random() total_weight = 0 for i, weight in enumerate(probabilities): total_weight += weight if random_val < total_weight: return items[i]
Let’s step through our example above to get a better understanding of how this all works. Let’s say our random value is 0.6195. During the first iteration, total_weight is set to to 0.1739. Our random value is greater than this value, so we continue onto the next iteration. During the second iteration total_weight is set to 0.3043. Again, the random value is greater than this value, so we continue onto the 3rd iteration where we set total_weight to 0.6956. Finally, the random value is less than total_weight, so the third item is our random value.
Overall this algorithm runs in O(n) time. If we know that the values are already sorted, we can instead use a binary search to get this down to O(lgn) time. However, sorting will take O(nlgn) operations, so an O(n) solution will suffice for now.