Content
Randomization
Randomization
Random number generation
Randomize a series
Randomization of intervention-control pairs
Random allocation to two independent groups
Block randomization
Preference allocation
Random numbers
Copyright © 1990-2006 StatsDirect Limited, all rights reserved
Download a free 10 day StatsDirect trial
Menu location: Analysis_Miscellaneous_Randomization.
This section provides random allocations for randomized study designs:
· Series x to y
· Intervention-control pairs
· Two independent groups
· Block randomization to k treatments
· Preference allocation (menu item only shows with workbook open)
A good quality pseudo-random number generator is used to randomize series of numbers for each of the types of allocation. The random number generator is reseeded each time it is used; therefore, there is extremely little risk of using the same (pseudo-)random number series for different randomizations unless you specify the same seed for the random number generator. For technical information on the random number generator used here please see random number generation.
Another section of StatsDirect generates random deviates from different probability distributions (uniform, normal, gamma etc.), see random numbers.
Random number generation
Random number generation by well researched algorithms should be able to provide extremely long series of numbers for which there is an infinitesimally small probability of finding a repeating pattern. StatsDirect uses such an algorithm (see below).
If you want to get down to basics you might ask, what is random? A lecture theatre filled with mathematicians, philosophers and elemental physicists would love to debate this; enough said. Rather than getting stuck in debate over what is random, a practical approach is to look for evidence of non-randomness such as repeated patterns. Various methods have been employed to look for non-randomness from "random" number generators since they began to emerge in the 1960s. Most "quick and dirty" random number generators, including those supplied with computer language compilers, use overly simple methods which produce sequences of numbers with repeating patterns; they are unacceptable for statistical use.
Technical validation
StatsDirect uses the Mersenne Twister algorithm of Matsumoto and Nishimura (1998), with updates to its initialisation suggested by the authors via their website in February 2002. Prior to version 2.2.0, StatsDirect used the RANROT type W and Mother-of-All algorithm described by Agner Fog (2000). Both algorithms pass all of the DIEHARD tests (Marsaglia, 1997) and perform well in the theoretical spectral tests (Knuth, 1998). The Mersenne Twister, however, has a stronger theoretical basis for the uniformity of its output, and is well studied. The Mersenne Twister generator has a resolution of 32 bits and a period of 2^19937 - 1. For more information on the strengths and weaknesses of random number generation by computer see Marsaglia (1993, 1996).
Seeds
Most random number generators require a seed number. If the generator is given the same seed each time it is called then it will produce the same series of numbers. This is not acceptable for many purposes; therefore, StatsDirect seeds the random number generator with a number taken from the computer's clock (the number of hundredths of a second which have elapsed since midnight). It is highly improbable that StatsDirect will produce the same "random" sequence more than once; the time is stamped on randomization output so that you can validate this. The random number generation section of the data menu enables you to specify seeds.
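The seeding behaviour described above can be sketched in Python, whose standard random module is also based on the Mersenne Twister. This is an illustration only, not StatsDirect's code: the function names clock_seed and make_generator are hypothetical, and the exact way StatsDirect turns the clock reading into a seed is assumed here.

```python
import random
from datetime import datetime


def clock_seed(now=None):
    """Hundredths of a second elapsed since midnight, as an integer.

    Assumption: this mirrors the clock-based seed described in the text.
    """
    now = now or datetime.now()
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = now - midnight
    # Integer arithmetic avoids floating point rounding of the clock reading.
    return elapsed.seconds * 100 + elapsed.microseconds // 10000


def make_generator(seed=None):
    """Return (seed, generator); the seed is kept so a run can be repeated."""
    if seed is None:
        seed = clock_seed()
    return seed, random.Random(seed)


seed, rng = make_generator()
print("Randomized with seed:", seed)
```

Passing the printed seed back to make_generator reproduces the same series, which is the reason for recording it on the output.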
Randomize a series
This function randomizes a series of integers for which you define the beginning and end points of the series.
For example, randomizing numbers from 6 to 10 is like shuffling 5 cards labelled 6 to 10. To randomize a series of 49 possible lottery numbers enter x as 1 and y as 49:
Random allocation of numbers in a series
Randomized with seed: 10
1 | 42 |
2 | 49 |
3 | 16 |
4 | 13 |
5 | 41 |
6 | 1 |
7 | 19 |
8 | 7 |
9 | 15 |
10 | 40 |
11 | 11 |
12 | 6 |
13 | 47 |
14 | 46 |
15 | 45 |
16 | 2 |
17 | 23 |
18 | 36 |
19 | 5 |
20 | 27 |
21 | 21 |
22 | 43 |
23 | 37 |
24 | 18 |
25 | 25 |
26 | 34 |
27 | 9 |
28 | 14 |
29 | 17 |
30 | 22 |
31 | 35 |
32 | 38 |
33 | 32 |
34 | 29 |
35 | 33 |
36 | 31 |
37 | 26 |
38 | 4 |
39 | 28 |
40 | 30 |
41 | 10 |
42 | 44 |
43 | 8 |
44 | 39 |
45 | 12 |
46 | 48 |
47 | 3 |
48 | 24 |
49 | 20 |
You could then take the first, say six, as your random selection.
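A minimal sketch of the same idea in Python, assuming a simple shuffle of the integers x to y. The function name randomize_series is hypothetical, and the output for a given seed will not match StatsDirect's because the seeding and shuffling details differ.

```python
import random


def randomize_series(x, y, seed=None):
    """Return the integers x..y (inclusive) in random order."""
    rng = random.Random(seed)
    series = list(range(x, y + 1))
    rng.shuffle(series)  # like shuffling cards labelled x to y
    return series


lottery = randomize_series(1, 49, seed=10)
first_six = lottery[:6]  # take the first six as the random selection
print(first_six)
```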
Technical validation
Robust (pseudo-)random number generation is used, see random number generation.
Randomization of intervention-control pairs
This function provides random allocation into intervention or control arms of a trial for a paired/matched experimental design. Each subject will experience both intervention and control treatment at some stage during the trial; the paired randomization determines the treatment order.
Randomization reduces opportunities for bias and confounding in experimental designs, and leads to treatment groups which are random samples of the population sampled, thus helping to meet assumptions of subsequent statistical analysis (Bland, 2000). A particular bias in this design might be the intervention having some carry-over effect on the control treatment.
Example: randomization of 50 patients into treatment (intervention) and placebo (control) arms of a trial of a new drug. This would give 50 pairs of INTERVENTION - CONTROL or CONTROL - INTERVENTION. If this was a randomized crossover study then you would give the drug first if the order was INTERVENTION - CONTROL and you would give placebo first if the order was CONTROL - INTERVENTION:
Randomized intervention-control pairs
Randomized with seed: 10, balanced allocation
1 | Intervention - Control |
2 | Control - Intervention |
3 | Intervention - Control |
4 | Intervention - Control |
5 | Control - Intervention |
6 | Control - Intervention |
7 | Intervention - Control |
8 | Control - Intervention |
9 | Intervention - Control |
10 | Control - Intervention |
11 | Intervention - Control |
12 | Control - Intervention |
13 | Control - Intervention |
14 | Intervention - Control |
15 | Intervention - Control |
16 | Intervention - Control |
17 | Control - Intervention |
18 | Control - Intervention |
19 | Intervention - Control |
20 | Control - Intervention |
21 | Intervention - Control |
22 | Control - Intervention |
23 | Control - Intervention |
24 | Intervention - Control |
25 | Control - Intervention |
26 | Control - Intervention |
27 | Control - Intervention |
28 | Intervention - Control |
29 | Intervention - Control |
30 | Intervention - Control |
31 | Intervention - Control |
32 | Intervention - Control |
33 | Intervention - Control |
34 | Control - Intervention |
35 | Control - Intervention |
36 | Intervention - Control |
37 | Intervention - Control |
38 | Control - Intervention |
39 | Control - Intervention |
40 | Control - Intervention |
41 | Control - Intervention |
42 | Intervention - Control |
43 | Control - Intervention |
44 | Control - Intervention |
45 | Intervention - Control |
46 | Intervention - Control |
47 | Control - Intervention |
48 | Control - Intervention |
49 | Intervention - Control |
50 | Intervention - Control |
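The treatment-order allocation can be sketched in Python as follows. This assumes that "balanced allocation" means equal numbers of each order (with any odd subject assigned at random); the source does not spell this out, and the function name randomize_pairs is hypothetical.

```python
import random

ORDERS = ("Intervention - Control", "Control - Intervention")


def randomize_pairs(n, seed=None, balanced=True):
    """Return a list of n treatment orders, one per subject."""
    rng = random.Random(seed)
    if not balanced:
        # Independent coin toss for every subject.
        return [rng.choice(ORDERS) for _ in range(n)]
    half = n // 2
    allocation = [ORDERS[0]] * half + [ORDERS[1]] * half
    if n % 2:
        # Odd n: the extra subject gets either order with equal probability.
        allocation.append(rng.choice(ORDERS))
    rng.shuffle(allocation)
    return allocation


for subject, order in enumerate(randomize_pairs(50, seed=10), start=1):
    print(subject, "|", order)
```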
Technical validation
Robust (pseudo-)random number generation is used, see random number generation.
Random allocation to two independent groups
This function allocates a given number of subjects at random to one of two independent groups.
Randomization reduces opportunities for bias and confounding in experimental designs, and leads to treatment groups which are random samples of the population sampled, thus helping to meet assumptions of subsequent statistical analysis (Bland, 2000).
Two independent groups might be intervention and control groups, for example to examine the effect of a new treatment. For a randomized controlled trial of a new treatment you would randomly allocate some subjects to receive the new treatment and the other subjects to receive the control treatment (e.g. placebo drug). For a total of 30 subjects in two groups of 15 you would enter 30 into this function:
Unpaired random allocation to intervention or control group
Randomized with seed: 10
Randomized with seed: 10
Intervention | 1 | Control | 2 |
Intervention | 3 | Control | 6 |
Intervention | 4 | Control | 8 |
Intervention | 5 | Control | 12 |
Intervention | 7 | Control | 13 |
Intervention | 9 | Control | 16 |
Intervention | 10 | Control | 17 |
Intervention | 11 | Control | 20 |
Intervention | 14 | Control | 21 |
Intervention | 15 | Control | 23 |
Intervention | 18 | Control | 24 |
Intervention | 19 | Control | 25 |
Intervention | 22 | Control | 27 |
Intervention | 26 | Control | 28 |
Intervention | 30 | Control | 29 |
- here the first subject would be allocated to the intervention group and the second to the control group etc.
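One way to produce an allocation of this kind is to shuffle the subject numbers and split them in half. The sketch below assumes that approach; the function name allocate_two_groups is hypothetical, and for an odd total the control group receives the extra subject purely as a convention of this example.

```python
import random


def allocate_two_groups(n, seed=None):
    """Split subjects 1..n at random into (intervention, control) lists."""
    rng = random.Random(seed)
    subjects = list(range(1, n + 1))
    rng.shuffle(subjects)
    half = n // 2
    # Sort within each group so the output reads like the table above.
    intervention = sorted(subjects[:half])
    control = sorted(subjects[half:])
    return intervention, control


intervention, control = allocate_two_groups(30, seed=10)
for i, c in zip(intervention, control):
    print("Intervention |", i, "| Control |", c)
```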
Technical validation
Robust (pseudo-)random number generation is used, see random number generation.
Block randomization
This function randomizes n individuals into k treatments, in blocks of size m.
Randomization reduces opportunities for bias and confounding in experimental designs, and leads to treatment groups which are random samples of the population sampled, thus helping to meet assumptions of subsequent statistical analysis (Bland, 2000).
Random allocation can be made in blocks in order to keep the sizes of treatment groups similar. In order to do this you must specify a sample size that is divisible by the block size you choose. In turn you must choose a block size that is divisible by the number of treatment groups you specify.
An advantage of small block sizes is that treatment group sizes are very similar. A disadvantage of small block sizes is that it is possible to guess some allocations, thus reducing blinding in the trial. An alternative to using large block sizes is to use random sequences of block sizes, which can be done in StatsDirect by specifying a block size of zero. The random block size option selects block sizes of 2, 3, 4 or 5 at random.
The randomization proceeds by allocating random permutations of treatments within each block.
Random allocation in blocks
Randomized with seed: 10
Subjects: 20
Block size: 4
Treatments: 2
Subject | Treatment |
1 | 2 |
2 | 1 |
3 | 1 |
4 | 2 |
5 | 1 |
6 | 2 |
7 | 2 |
8 | 1 |
9 | 1 |
10 | 2 |
11 | 2 |
12 | 1 |
13 | 2 |
14 | 1 |
15 | 1 |
16 | 2 |
17 | 1 |
18 | 2 |
19 | 2 |
20 | 1 |
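Block randomization with a fixed block size can be sketched as follows. This is an illustrative Python version, not StatsDirect's implementation: the function name block_randomize is hypothetical, and the random block size option (block size of zero) is deliberately left out because the source does not describe how it handles block sizes that are not multiples of k.

```python
import random


def block_randomize(n, k, m, seed=None):
    """Allocate n subjects to treatments 1..k in blocks of size m."""
    if m % k != 0:
        raise ValueError("block size must be divisible by the number of treatments")
    if n % m != 0:
        raise ValueError("sample size must be divisible by the block size")
    rng = random.Random(seed)
    allocation = []
    for _ in range(n // m):
        # Each block holds every treatment equally often, in random order.
        block = list(range(1, k + 1)) * (m // k)
        rng.shuffle(block)
        allocation.extend(block)
    return allocation


for subject, treatment in enumerate(block_randomize(20, 2, 4, seed=10), start=1):
    print(subject, "|", treatment)
```

Because every block is balanced, the treatment group sizes can never differ by more than m/k at any point during recruitment.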
Technical validation
Robust (pseudo-)random number generation is used, see random number generation.
Preference allocation
Menu location: Analysis_Miscellaneous_Randomization_Preference Allocation.
This function allocates subjects to groups according to their preferences. A uniform random allocation procedure is used to select subjects for inclusion in groups which are over-subscribed. The procedure is best explained by example:
Suppose ten students were asked to apply for a choice of five courses, each with a limited number of places. The students are asked to list their top three course preferences in order.
StatsDirect can allocate the students to a course based on their preferences and on a uniform random selection procedure for over-subscribed courses. There is no weighting procedure for any round of selections as this would encourage tactical preference choice, i.e. the probability that a student is allocated his/her first preference is not influenced by the subscription rates for his/her other preferences.
Say our ten students mark the following preferences for courses 1 to 5:
Student | Preference 1 | Preference 2 | Preference 3 |
1 | 3 | 5 | 1 |
2 | 3 | 4 | 2 |
3 | 5 | 1 | 3 |
4 | 3 | 1 | 4 |
5 | 5 | 3 | 1 |
6 | 5 | 4 | 3 |
7 | 2 | 3 | 5 |
8 | 4 | 1 | 5 |
9 | 3 | 5 | 1 |
10 | 1 | 3 | 5 |
The maximum capacity of each group is as follows:
Group | Capacity/Places |
1 | 2 |
2 | 1 |
3 | 2 |
4 | 3 |
5 | 5 |
To use StatsDirect to allocate the students to groups you must first enter the above columns of data into a workbook. Then select preference allocation from the randomization section of the analysis menu. When asked for columns of preferences you must select the columns in the correct order, i.e. preference 1, 2, 3. Then select the group capacity column; in this column the rows represent the allocation groups to which the preference data refer (i.e. if the entry in row 3 of the capacity column was 5 this would mean that group 3 can hold a maximum of 5 subjects). For this example the random allocation procedure yielded the results below. If you run this example more than once you are likely to get different results each time as the random number generator is re-seeded for each run.
An example output is:
Random allocation to groups by preference
Groups = 5
Total group capacity = 13
Subjects = 10
Randomized with seed: 10
Subject | Group |
1 | 3 |
2 | 4 |
3 | 5 |
4 | 3 |
5 | 5 |
6 | 5 |
7 | 2 |
8 | 4 |
9 | 5 |
10 | 1 |
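The source does not give the allocation algorithm in detail, so the following Python sketch rests on an assumption: allocation proceeds in rounds (first preferences, then second, then third), and in each round any group with more applicants than remaining places fills those places by uniform random selection. The function name preference_allocation is hypothetical. This assumed procedure is consistent with the example output above, but it should not be read as StatsDirect's actual method.

```python
import random


def preference_allocation(preferences, capacity, seed=None):
    """Allocate subjects to groups by ranked preference.

    preferences: dict of subject -> list of groups in preference order
    capacity:    dict of group -> number of places
    Returns a dict of subject -> group (None if no preference could be met).
    """
    rng = random.Random(seed)
    remaining = dict(capacity)
    allocation = {subject: None for subject in preferences}
    rounds = max(len(prefs) for prefs in preferences.values())
    for rnd in range(rounds):
        # Collect unallocated subjects applying to each group this round.
        applicants = {}
        for subject, prefs in preferences.items():
            if allocation[subject] is None and rnd < len(prefs):
                applicants.setdefault(prefs[rnd], []).append(subject)
        for group, subjects in applicants.items():
            places = remaining.get(group, 0)
            if len(subjects) <= places:
                chosen = subjects
            else:
                # Over-subscribed: uniform random selection, no weighting.
                chosen = rng.sample(subjects, places)
            for subject in chosen:
                allocation[subject] = group
            remaining[group] = places - len(chosen)
    return allocation


preferences = {
    1: [3, 5, 1], 2: [3, 4, 2], 3: [5, 1, 3], 4: [3, 1, 4], 5: [5, 3, 1],
    6: [5, 4, 3], 7: [2, 3, 5], 8: [4, 1, 5], 9: [3, 5, 1], 10: [1, 3, 5],
}
capacity = {1: 2, 2: 1, 3: 2, 4: 3, 5: 5}
result = preference_allocation(preferences, capacity, seed=10)
for subject in sorted(result):
    print(subject, "|", result[subject])
```

In this example only group 3 is over-subscribed in the first round (four applicants for two places), so the random selection affects subjects 1, 2, 4 and 9 only.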
Random numbers
Menu location: Data_Random Numbers.
This function enables you to create one or more series of random numbers from given distributions.
A robust generator of uniform (pseudo-)random numbers is used as the basis for generating deviates from the probability distributions described below. You are given the opportunity to enter your own seed number to be used by the random number generator but you should use the default seed (based upon your computer's clock) in most cases. Please note that each seed generates its own series and that series is the same if you use the seed again. You have very little chance of using the same seed twice if you select the seed that StatsDirect suggests.
These functions are intended for simulation work; they employ widely cited and debated algorithms (Gentle, 2003).
Uniform (continuous uniform, rectangular)
The continuous uniform distribution has a constant density function over the interval (a, b) and thus a rectangular shape from a to b:
$$f(x) = \frac{1}{b - a}, \quad a < x < b$$
- where a and b lie between minus and plus infinity.
The interval most commonly used is 0 to 1.
Normal (Gaussian)
The commonly used standard normal distribution (mean of 0 and standard deviation of 1) is one of a family of normal distributions defined by the density function:
$$f(x) = \frac{1}{s\sqrt{2\pi}} \exp\left(-\frac{(x - m)^2}{2s^2}\right)$$
- where mean m lies between minus and plus infinity and standard deviation s is greater than zero.
Algorithm: inversion of the cumulative distribution function (Wichura, 1988; Gentle 2003)
Lognormal
The density of a lognormal distribution is given by:
$$f(x) = \frac{1}{x s\sqrt{2\pi}} \exp\left(-\frac{(\ln x - m)^2}{2s^2}\right), \quad x > 0$$
- where mean m lies between minus and plus infinity and standard deviation s is greater than zero.
The mean of the lognormal distribution, as opposed to the mean of the underlying normal distribution, is equal to exp(m+s*s/2) and the variance is equal to exp(2m+2s*s)-exp(2m+s*s).
Algorithm: transformed inversion of the cumulative distribution function (Wichura, 1988; Gentle 2003)
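Inversion of the cumulative distribution function, and its transformed form for the lognormal, can be sketched in Python using the standard library's NormalDist.inv_cdf. This is an illustration under assumptions rather than StatsDirect's routine: the function names are hypothetical, and m and s are taken to be the mean and standard deviation of the underlying normal distribution, as in the text.

```python
import math
import random
from statistics import NormalDist


def uniform_open(rng):
    """Uniform deviate strictly inside (0, 1); inv_cdf rejects 0 and 1."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def normal_deviate(rng, m=0.0, s=1.0):
    """Normal deviate by inverting the cumulative distribution function."""
    return NormalDist(m, s).inv_cdf(uniform_open(rng))


def lognormal_deviate(rng, m=0.0, s=1.0):
    """Lognormal deviate: exponential of a normal deviate with mean m, sd s."""
    return math.exp(normal_deviate(rng, m, s))


rng = random.Random(10)
sample = [lognormal_deviate(rng, 0.0, 0.5) for _ in range(20000)]
print(sum(sample) / len(sample), math.exp(0.0 + 0.5 * 0.5 / 2))
```

The printed sample mean should lie close to exp(m+s*s/2), the lognormal mean given above.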
Exponential
The (negative) exponential distribution is a special case (shaping parameter of 1) of the gamma distribution. Its density is given by:
- where the parameter l must be greater than zero.
Algorithm: transformation (Ahrens & Dieter, 1972).
Gamma
The density function of the gamma distribution is given by:
- where the parameters l and r (shaping parameter) must be greater than zero.
G(*) is the gamma function:
$$G(x) = \int_0^\infty t^{x-1} e^{-t}\,dt, \quad x > 0$$
Please note that gamma deviates with a shaping parameter of 0.5 are half the square of normal deviates and gamma deviates with a shaping parameter of 1 are exponential deviates.
Algorithm: acceptance-rejection methods GD and GS (Ahrens & Dieter, 1974, 1982b).
Binomial
The density function of the binomial distribution is given by:
- where p lies between 0 and 1 in n ranges.
Algorithm: acceptance-rejection method BTPEC (Kachitvichyanukul & Schmeiser, 1988).
Poisson
The density function of the Poisson distribution is given by:
$$f(x) = \frac{l^x e^{-l}}{x!}, \quad x = 0, 1, 2, \ldots$$
- where parameter l is greater than zero.
Algorithm: acceptance-rejection (Ahrens & Dieter, 1982a).
Chi-square
The density function of the chi-square distribution is given by:
The deviates are calculated as gamma deviates with parameters n/2 and 2, where n is degrees of freedom.
Algorithm: Transformed gamma deviates (Ahrens & Dieter, 1974, 1982b; Gentle 2003).
F (variance ratio)
The density function of the F distribution is given by:
The deviates are calculated as (x/n)/(z/d) where n is the numerator degrees of freedom, d is the denominator degrees of freedom, x is a gamma deviate with parameters n/2 and 2 (chi-square with n degrees of freedom), and z is a gamma deviate with parameters d/2 and 2 (chi-square with d degrees of freedom).
Algorithm: Transformed gamma deviates (Ahrens & Dieter, 1974, 1982b; Gentle 2003).
Student's t
The density function of Student's t distribution is given by:
The deviates are calculated as a standard normal deviate multiplied by the square root of the ratio of the degrees of freedom (n) to a gamma deviate with parameters n/2 and 2 (a chi-square deviate with n degrees of freedom).
Algorithm: Transformed standard normal and gamma deviates (Ahrens & Dieter, 1974, 1982b; Gentle 2003).
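The chi-square, F and Student's t transformations can be sketched with Python's standard generators. Two assumptions are made: "parameters n/2 and 2" is read as a shape of n/2 and a scale of 2, which is how random.gammavariate(alpha, beta) is parameterized, and the F deviate is formed as the ratio of two chi-square deviates each divided by its degrees of freedom. The function names are hypothetical, and Python's own gamma and normal generators stand in for the Ahrens and Dieter methods cited.

```python
import math
import random


def chi_square_deviate(rng, n):
    """Chi-square deviate with n degrees of freedom: gamma(shape n/2, scale 2)."""
    return rng.gammavariate(n / 2.0, 2.0)


def f_deviate(rng, n, d):
    """F deviate with n numerator and d denominator degrees of freedom."""
    x = chi_square_deviate(rng, n)
    z = chi_square_deviate(rng, d)
    return (x / n) / (z / d)


def t_deviate(rng, n):
    """Student's t deviate with n degrees of freedom."""
    return rng.gauss(0.0, 1.0) * math.sqrt(n / chi_square_deviate(rng, n))


rng = random.Random(10)
print(chi_square_deviate(rng, 4), f_deviate(rng, 4, 10), t_deviate(rng, 10))
```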
Beta
The density function of the beta distribution is given by:
Algorithm: Acceptance-rejection methods BB and BC (Cheng 1978).
Logistic
The density function of the logistic distribution is given by:
Algorithm: Transformed uniform deviates (Gentle 2003).
Cauchy
Algorithm: Transformed uniform deviates (Gentle 2003).
Weibull
Algorithm: Transformed uniform deviates (Gentle 2003).
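For the logistic, Cauchy and Weibull distributions, "transformed uniform deviates" usually means applying the inverse cumulative distribution function to a uniform deviate. The Python sketch below assumes that reading; the location, scale and shape parameter names are hypothetical because the source does not state how StatsDirect parameterizes these distributions.

```python
import math
import random


def uniform_open(rng):
    """Uniform deviate strictly inside (0, 1)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def logistic_deviate(rng, location=0.0, scale=1.0):
    """Inverse CDF of the logistic distribution."""
    u = uniform_open(rng)
    return location + scale * math.log(u / (1.0 - u))


def cauchy_deviate(rng, location=0.0, scale=1.0):
    """Inverse CDF of the Cauchy distribution."""
    return location + scale * math.tan(math.pi * (uniform_open(rng) - 0.5))


def weibull_deviate(rng, shape, scale=1.0):
    """Inverse CDF of the Weibull distribution."""
    return scale * (-math.log(1.0 - uniform_open(rng))) ** (1.0 / shape)


rng = random.Random(10)
print(logistic_deviate(rng), cauchy_deviate(rng), weibull_deviate(rng, 1.5))
```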
Geometric
Algorithm: Transformed Poisson and exponential deviates (Devroye, 1986; Gentle 2003; Ahrens & Dieter, 1972, 1982a).
Negative binomial
Algorithm: Transformed Poisson and gamma deviates (Devroye, 1986; Gentle 2003; Ahrens & Dieter, 1974, 1982b, 1982a).
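One standard way to obtain geometric and negative binomial deviates from Poisson, exponential and gamma deviates is as a mixture: draw a Poisson mean from an exponential (geometric) or gamma (negative binomial) distribution, then draw a Poisson deviate with that mean. The Python sketch below assumes this construction and a parameterization that counts failures before the first (or r-th) success with success probability p; neither is confirmed by the source. The function names are hypothetical, and the simple multiplication method used for the Poisson step replaces the acceptance-rejection method cited above.

```python
import math
import random


def poisson_deviate(rng, lam):
    """Poisson deviate by the multiplication method (modest lam only)."""
    limit = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def negative_binomial_deviate(rng, r, p):
    """Failures before the r-th success: Poisson with a gamma-distributed mean."""
    lam = rng.gammavariate(r, (1.0 - p) / p)  # shape r, scale (1-p)/p
    return poisson_deviate(rng, lam)


def geometric_deviate(rng, p):
    """Failures before the first success: Poisson with an exponential mean."""
    lam = rng.expovariate(p / (1.0 - p))  # rate p/(1-p), i.e. mean (1-p)/p
    return poisson_deviate(rng, lam)


rng = random.Random(10)
print(geometric_deviate(rng, 0.25), negative_binomial_deviate(rng, 3, 0.5))
```

Under this parameterization the geometric mean is (1-p)/p and the negative binomial mean is r(1-p)/p, which gives a quick check on simulated samples.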