Statistics
A statistics utility module was written for this project to provide sampling and weighted-choice mechanisms. It can be found in the utilitis/statistics folder.
Weighted Sampler
There are two weighted samplers implemented in this work: weightedSamplerReplacement and weightedSamplerNoReplacement. As the names suggest, the first may sample an item in the list more than once, while in the second an item is no longer available for sampling once it has been sampled.
The weighted samplers take:
dec array of weights
int sampleSize
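The array of weights should be index-aligned with the list you are sampling from, so that the weight at a given index belongs to the item at the same index. The weighted samplers return a list of indexes of the sampled items rather than the items themselves. This allows them to work with any kind of array.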
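The index-based interface described above can be sketched in Python as follows. This is only an illustration, not the module's actual code: the function names, the rng parameter, and the cumulative-weight approach used for the no-replacement case are all assumptions made for the example.

```python
import random


def weighted_sampler_replacement(weights, sample_size, rng=random):
    """Return sample_size indexes; the same index may appear more than once."""
    # random.choices draws independently each time, proportional to weights.
    return rng.choices(range(len(weights)), weights=weights, k=sample_size)


def weighted_sampler_no_replacement(weights, sample_size, rng=random):
    """Return sample_size distinct indexes, drawn proportional to weights."""
    available = sum(1 for w in weights if w > 0)
    if sample_size > available:
        raise ValueError("sample_size exceeds the number of items with positive weight")

    remaining = list(weights)  # copy so the caller's weights are untouched
    picked = []
    for _ in range(sample_size):
        threshold = rng.random() * sum(remaining)
        cumulative = 0.0
        chosen = None
        for index, weight in enumerate(remaining):
            cumulative += weight
            if weight > 0 and threshold < cumulative:
                chosen = index
                break
        if chosen is None:
            # Floating-point rounding fallback: take the last positive weight.
            chosen = max(i for i, w in enumerate(remaining) if w > 0)
        picked.append(chosen)
        remaining[chosen] = 0.0  # a sampled item can no longer be chosen
    return picked


# Hypothetical usage: the returned indexes are mapped back onto any array.
items = ["apple", "pear", "plum", "fig"]
weights = [0.1, 0.2, 0.3, 0.4]
indexes = weighted_sampler_no_replacement(weights, 2)
sampled_items = [items[i] for i in indexes]
```

Because only indexes are returned, the same call works whether items holds strings, records, or any other type.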
Uniform Sampler
There are two uniform samplers implemented in this work: uniformSamplerReplacement and uniformSamplerNoReplacement. As the names suggest, the first may sample an item in the list more than once, while in the second an item is no longer available for sampling once it has been sampled.
The uniform samplers take:
int length
int sampleSize
The samplers randomly select indexes within the length of the array and return a list of the sampled indexes. This allows them to work with any array type.
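A minimal Python sketch of the two uniform samplers is given below for illustration. The function names and the rng parameter are assumptions, and zero-based indexes are assumed; the module's own implementation may differ.

```python
import random


def uniform_sampler_replacement(length, sample_size, rng=random):
    """Return sample_size indexes in [0, length); repeats are allowed."""
    return [rng.randrange(length) for _ in range(sample_size)]


def uniform_sampler_no_replacement(length, sample_size, rng=random):
    """Return sample_size distinct indexes in [0, length)."""
    if sample_size > length:
        raise ValueError("cannot draw more distinct indexes than the array holds")
    # random.sample never returns the same element twice.
    return rng.sample(range(length), sample_size)


# Hypothetical usage: sample three distinct entries from any array.
data = [10.5, 3.2, 7.7, 1.0, 9.9]
chosen = [data[i] for i in uniform_sampler_no_replacement(len(data), 3)]
```

Without replacement, sampleSize cannot exceed length, so the sketch raises an error in that case. How the module itself handles this is not stated here.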
Sum
A simple function which takes a list of decimals and returns the sum of the list.
Distributions
Statistics also contains the Distribution data type. A Distribution has a list of values of any data type, a list of decimal proportions (one for each value), and an int length giving the number of unique values in the distribution.
A Distribution is created with the makeDistribution function, which takes a hashTable of counts of each value and returns a Distribution.
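To illustrate how counts could be turned into proportions, here is a short Python sketch. It assumes that each proportion is the value's count divided by the total count, and it uses a plain dict in place of the hashTable. The class layout and the name make_distribution are stand-ins for illustration, not the module's actual definitions.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Distribution:
    values: List[Any]         # the unique values
    proportions: List[float]  # proportions[i] belongs to values[i]
    length: int               # number of unique values


def make_distribution(counts: Dict[Any, int]) -> Distribution:
    """Build a Distribution from a mapping of value -> count."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must contain at least one positive count")
    values = list(counts.keys())
    # Assumed rule: proportion = count / total count.
    proportions = [counts[value] / total for value in values]
    return Distribution(values=values, proportions=proportions, length=len(values))


# Hypothetical usage: 2 of 8 observations are "red", 6 of 8 are "blue".
colours = make_distribution({"red": 2, "blue": 6})
# colours.proportions is [0.25, 0.75] and colours.length is 2
```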