Medical Statistics - Electronic Textbook: Randomization

Source: Southern Medical University Quality Courses Network

Content

Randomization

Random number generation

Randomize a series

Randomization of intervention-control pairs

Random allocation to two independent groups

Block randomization

Preference allocation

Random number generation

Download a free 10 day StatsDirect trial

Randomization

Menu location: Analysis_Miscellaneous_Randomization.

This section provides random allocations for randomized study designs:

· Series x to y

· Intervention-control pairs

· Two independent groups

· Block randomization to k treatments

· Preference allocation (menu item only shows with workbook open)

A good quality pseudo-random number generator is used to randomize series of numbers for each of the types of allocation. The random number generator is reseeded each time it is used; there is therefore extremely little risk of using the same (pseudo-)random number series for different randomizations unless you specify the same seed for the random number generator. For technical information on the random number generator used here, please see random number generator.

Another section of StatsDirect generates random deviates from different probability distributions (uniform, normal, gamma etc.), see random numbers.

Random number generation

Random number generation by well-researched algorithms should be able to provide extremely long series of numbers for which there is an infinitesimally small probability of finding a repeating pattern. StatsDirect uses such an algorithm (see below).

If you want to get down to basics you might ask, what is random? A lecture theatre filled with mathematicians, philosophers and elemental physicists would love to debate this; enough said. Rather than getting stuck in debate over what is random, a practical approach is to look for evidence of non-randomness such as repeated patterns. Various methods have been employed to look for non-randomness in "random" number generators since they began to emerge in the 1960s. Most "quick and dirty" random number generators, including those supplied with computer language compilers, use over-simple methods that produce sequences of numbers with repeating patterns; they are unacceptable for statistical use.

Technical validation

StatsDirect uses the Mersenne Twister algorithm of Matsumoto and Nishimura (1998), with updates to its initialisation suggested by the authors via their website in February 2002. Prior to version 2.2.0, StatsDirect used the RANROT type W and Mother-of-All algorithm described by Agner Fog (2000). Both algorithms pass all of the DIEHARD tests (Marsaglia, 1997) and perform well in the theoretical spectral tests (Knuth, 1998). The Mersenne Twister, however, has a stronger theoretical basis for the uniformity of its output, and is well studied. The Mersenne Twister generator has a resolution of 32 bits and a period of 2^19937 - 1. For more information on the strengths and weaknesses of random number generation by computer see Marsaglia (1993, 1996).

Seeds

Most random number generators require a seed number. If the generator is given the same seed each time it is called then it will produce the same series of numbers. This is not acceptable for many purposes; therefore, StatsDirect seeds the random number generator with a number taken from the computer's clock (the number of hundredths of a second that have elapsed since midnight). It is highly improbable that StatsDirect will produce the same "random" sequence more than once; the time is stamped on randomization output so that you can validate this. The random number generation section of the data menu enables you to specify seeds.
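The seeding behaviour described above can be illustrated with a minimal Python sketch. It is not StatsDirect's code: the helper names clock_seed and first_draws are hypothetical, and the only link to the text is that Python's standard random module also uses a Mersenne Twister as its core generator. A given seed will not reproduce StatsDirect's output.

```python
import random
from datetime import datetime


def clock_seed(now=None):
    """Hundredths of a second elapsed since midnight (hypothetical helper)."""
    if now is None:
        now = datetime.now()
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return int((now - midnight).total_seconds() * 100)


def first_draws(seed, n=5):
    """First n uniform deviates from a generator initialised with `seed`."""
    rng = random.Random(seed)  # Python's core generator is a Mersenne Twister
    return [rng.random() for _ in range(n)]


seed = clock_seed()
print("Randomized with seed:", seed)
print(first_draws(seed))

# Re-using a seed reproduces the series exactly.
print(first_draws(10) == first_draws(10))
```

Recording the seed with the output, as the text describes, is what makes a randomization auditable: anyone holding the seed can regenerate the same series.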

Randomize a series

This function randomizes a series of integers for which you define the beginning and end points of the series.

For example, randomizing numbers from 6 to 10 is like shuffling 5 cards labelled 6 to 10. To randomize a series of 49 possible lottery numbers enter x as 1 and y as 49:

Random allocation of numbers in a series

Randomized with seed: 10
1	42
2	49
3	16
4	13
5	41
6	1
7	19
8	7
9	15
10	40
11	11
12	6
13	47
14	46
15	45
16	2
17	23
18	36
19	5
20	27
21	21
22	43
23	37
24	18
25	25
26	34
27	9
28	14
29	17
30	22
31	35
32	38
33	32
34	29
35	33
36	31
37	26
38	4
39	28
40	30
41	10
42	44
43	8
44	39
45	12
46	48
47	3
48	24
49	20

You could then take the first, say six, as your random selection.
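As a rough illustration of the card-shuffling idea, the following Python sketch shuffles the integers from x to y. The function name randomize_series is hypothetical, and because the generator and seeding differ from StatsDirect's, seed 10 will not reproduce the listing above.

```python
import random


def randomize_series(x, y, seed=None):
    """Return the integers x..y (inclusive) in random order."""
    if x > y:
        raise ValueError("x must not exceed y")
    rng = random.Random(seed)
    series = list(range(x, y + 1))
    rng.shuffle(series)  # uniform random permutation, like shuffling cards
    return series


lottery = randomize_series(1, 49, seed=10)
print(lottery[:6])  # take the first six as the random selection
```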

Technical validation

Robust (pseudo-)random number generation is used, see random number generation.

Randomization of intervention-control pairs

This function provides random allocation into intervention or control arms of a trial for a paired/matched experimental design. Each subject will experience both intervention and control treatment at some stage during the trial; the paired randomization determines the treatment order.

Randomization reduces opportunities for bias and confounding in experimental designs, and leads to treatment groups which are random samples of the population sampled, thus helping to meet assumptions of subsequent statistical analysis (Bland, 2000). A particular bias in this design might be the intervention having some carry-over effect on the control treatment.

Example: randomization of 50 patients into treatment (intervention) and placebo (control) arms of a trial of a new drug. This would give 50 pairs of INTERVENTION - CONTROL or CONTROL - INTERVENTION. If this were a randomized crossover study then you would give the drug first if the order was INTERVENTION - CONTROL and you would give placebo first if the order was CONTROL - INTERVENTION:

Randomized intervention-control pairs

Randomized with seed: 10, balanced allocation

1	Intervention - Control
2	Control - Intervention
3	Intervention - Control
4	Intervention - Control
5	Control - Intervention
6	Control - Intervention
7	Intervention - Control
8	Control - Intervention
9	Intervention - Control
10	Control - Intervention
11	Intervention - Control
12	Control - Intervention
13	Control - Intervention
14	Intervention - Control
15	Intervention - Control
16	Intervention - Control
17	Control - Intervention
18	Control - Intervention
19	Intervention - Control
20	Control - Intervention
21	Intervention - Control
22	Control - Intervention
23	Control - Intervention
24	Intervention - Control
25	Control - Intervention
26	Control - Intervention
27	Control - Intervention
28	Intervention - Control
29	Intervention - Control
30	Intervention - Control
31	Intervention - Control
32	Intervention - Control
33	Intervention - Control
34	Control - Intervention
35	Control - Intervention
36	Intervention - Control
37	Intervention - Control
38	Control - Intervention
39	Control - Intervention
40	Control - Intervention
41	Control - Intervention
42	Intervention - Control
43	Control - Intervention
44	Control - Intervention
45	Intervention - Control
46	Intervention - Control
47	Control - Intervention
48	Control - Intervention
49	Intervention - Control
50	Intervention - Control
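A balanced allocation like the one listed above (25 subjects in each treatment order) could be produced along the following lines. This is a sketch under assumptions, not StatsDirect's implementation: the function name randomize_pairs and the handling of an odd number of subjects (the extra order is picked at random) are illustrative choices.

```python
import random

ORDERS = ("Intervention - Control", "Control - Intervention")


def randomize_pairs(n_subjects, seed=None, balanced=True):
    """Random treatment order for each subject in a paired/crossover design."""
    rng = random.Random(seed)
    if balanced:
        half = n_subjects // 2
        orders = [ORDERS[0]] * half + [ORDERS[1]] * half
        if n_subjects % 2:  # assumption: the odd subject gets a random order
            orders.append(rng.choice(ORDERS))
        rng.shuffle(orders)
    else:
        # Unbalanced: an independent coin toss for every subject.
        orders = [rng.choice(ORDERS) for _ in range(n_subjects)]
    return orders


for subject, order in enumerate(randomize_pairs(50, seed=10), start=1):
    print(subject, order, sep="\t")
```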

Technical validation

Robust (pseudo-)random number generation is used, see random number generation.

Random allocation to two independent groups

This function allocates a given number of subjects at random to one of two independent groups.

Two independent groups might be intervention and control groups, for example to examine the effect of a new treatment. For a randomized controlled trial of a new treatment you would randomly allocate some subjects to receive the new treatment and the other subjects to receive the control treatment (e.g. placebo drug). For a total of 30 subjects in two groups of 15 you would enter 30 into this function:

Unpaired random allocation to intervention or control group

Randomized with seed: 10

Intervention	1	Control	2
Intervention	3	Control	6
Intervention	4	Control	8
Intervention	5	Control	12
Intervention	7	Control	13
Intervention	9	Control	16
Intervention	10	Control	17
Intervention	11	Control	20
Intervention	14	Control	21
Intervention	15	Control	23
Intervention	18	Control	24
Intervention	19	Control	25
Intervention	22	Control	27
Intervention	26	Control	28
Intervention	30	Control	29

- here the first subject would be allocated to the intervention group and the second to the control group etc.
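One simple way to obtain such an allocation is to shuffle the subject numbers and split the shuffled list in half. The sketch below assumes that approach; the name allocate_two_groups is hypothetical, and giving the extra subject to the control group when the total is odd is an arbitrary choice made for illustration.

```python
import random


def allocate_two_groups(n_subjects, seed=None):
    """Split subjects 1..n at random into intervention and control groups."""
    rng = random.Random(seed)
    subjects = list(range(1, n_subjects + 1))
    rng.shuffle(subjects)
    half = n_subjects // 2  # assumption: control takes the extra subject if n is odd
    intervention = sorted(subjects[:half])
    control = sorted(subjects[half:])
    return intervention, control


intervention, control = allocate_two_groups(30, seed=10)
for i, c in zip(intervention, control):
    print("Intervention", i, "Control", c, sep="\t")
```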

Technical validation

Robust (pseudo-)random number generation is used, see random number generation.

Block randomization

This function randomizes n individuals into k treatments, in blocks of size m.

Random allocation can be made in blocks in order to keep the sizes of treatment groups similar. In order to do this you must specify a sample size that is divisible by the block size you choose. In turn you must choose a block size that is divisible by the number of treatment groups you specify.

An advantage of small block sizes is that treatment group sizes are very similar. A disadvantage of small block sizes is that it is possible to guess some allocations, thus reducing blinding in the trial. An alternative to using large block sizes is to use random sequences of block sizes, which can be done in StatsDirect by specifying a block size of zero. The random block size option selects block sizes of 2, 3, 4 or 5 at random.

The randomization proceeds by allocating random permutations of treatments within each block.

Random allocation in blocks

Randomized with seed: 10

Subjects: 20

Block size: 4

Treatments: 2

Subject	Treatment
1	2
2	1
3	1
4	2
5	1
6	2
7	2
8	1
9	1
10	2
11	2
12	1
13	2
14	1
15	1
16	2
17	1
18	2
19	2
20	1
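The fixed-block procedure (a random permutation of treatments within each block) can be sketched as follows. The function name block_randomize is hypothetical, and the random block size option is deliberately left out because the text does not describe it in enough detail to reproduce.

```python
import random


def block_randomize(n_subjects, block_size, n_treatments, seed=None):
    """Allocate subjects to treatments 1..k in randomly permuted blocks."""
    if block_size % n_treatments:
        raise ValueError("block size must be divisible by the number of treatments")
    if n_subjects % block_size:
        raise ValueError("sample size must be divisible by the block size")
    rng = random.Random(seed)
    repeats = block_size // n_treatments
    # One block holds each treatment the same number of times.
    template = [t for t in range(1, n_treatments + 1) for _ in range(repeats)]
    allocation = []
    for _ in range(n_subjects // block_size):
        block = template[:]
        rng.shuffle(block)  # random permutation within the block
        allocation.extend(block)
    return allocation


print("Subject", "Treatment", sep="\t")
for subject, treatment in enumerate(block_randomize(20, 4, 2, seed=10), start=1):
    print(subject, treatment, sep="\t")
```

Every completed block is balanced, so group sizes never differ by more than half a block at any point during recruitment.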

Technical validation

Robust (pseudo-)random number generation is used, see random number generation.

Preference allocation

Menu location: Analysis_Miscellaneous_Randomization_PreferenceAllocation.

This function allocates subjects to groups according to their preferences. A uniform random allocation procedure is used to select subjects for inclusion in groups which are over-subscribed. The procedure is best explained by example:

Suppose ten students were asked to apply for a choice of five courses, each with a limited number of places (listed below). The students are asked to list their top three course preferences in order.

StatsDirect can allocate the students to a course based on their preferences and on a uniform random selection procedure for over-subscribed courses. There is no weighting procedure for any round of selections as this would encourage tactical preference choice, i.e. the probability that a student is allocated his/her first preference is not influenced by the subscription rates for his/her other preferences.

Say our ten students mark the following preferences for courses 1 to 5:

Student	Preference 1	Preference 2	Preference 3
1	3	5	1
2	3	4	2
3	5	1	3
4	3	1	4
5	5	3	1
6	5	4	3
7	2	3	5
8	4	1	5
9	3	5	1
10	1	3	5

The maximum capacity of each group is as follows:

Group	Capacity/Places
1	2
2	1
3	2
4	3
5	5

To use StatsDirect to allocate the students to groups you must first enter the above columns of data into a workbook. Then select preference allocation from the randomization section of the analysis menu. When asked for columns of preferences you must select the columns in the correct order, i.e. preference 1, 2, 3. Then select the group capacity column; in this column the rows represent the allocation groups to which the preference data refer (i.e. if the entry in row 3 of the capacity column was 5 this would mean that group 3 can hold a maximum of 5 subjects). For this example the random allocation procedure yielded the results below. If you run this example more than once you are likely to get different results each time as the random number generator is re-seeded for each run.

An example output is:

Random allocation to groups by preference

Groups = 5

Total group capacity = 13

Subjects = 10

Randomized with seed: 10

Subject	Group
1	3
2	4
3	5
4	3
5	5
6	5
7	2
8	4
9	5
10	1
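The text does not spell out the allocation procedure step by step, so the following Python sketch is one plausible reading rather than StatsDirect's algorithm. It assumes allocation proceeds in rounds (first preferences, then second, then third), that an over-subscribed group fills its remaining places by unweighted random selection, and that subjects who miss out move on to their next preference. The function name allocate_by_preference is hypothetical; the data are those of the example above.

```python
import random


def allocate_by_preference(preferences, capacity, seed=None):
    """preferences: {subject: [1st, 2nd, ...]}; capacity: {group: places}."""
    rng = random.Random(seed)
    places = dict(capacity)  # remaining places per group
    allocation = {subject: None for subject in preferences}
    n_rounds = max(len(p) for p in preferences.values())
    for rnd in range(n_rounds):
        # Collect the unallocated subjects applying to each group this round.
        applicants = {}
        for subject, prefs in preferences.items():
            if allocation[subject] is None and rnd < len(prefs):
                applicants.setdefault(prefs[rnd], []).append(subject)
        for group, subjects in applicants.items():
            free = places.get(group, 0)
            if len(subjects) <= free:
                chosen = subjects
            else:
                chosen = rng.sample(subjects, free)  # uniform, no weighting
            for subject in chosen:
                allocation[subject] = group
            places[group] = free - len(chosen)
    return allocation  # None marks a subject left without a place


preferences = {
    1: [3, 5, 1], 2: [3, 4, 2], 3: [5, 1, 3], 4: [3, 1, 4], 5: [5, 3, 1],
    6: [5, 4, 3], 7: [2, 3, 5], 8: [4, 1, 5], 9: [3, 5, 1], 10: [1, 3, 5],
}
capacity = {1: 2, 2: 1, 3: 2, 4: 3, 5: 5}

for subject, group in allocate_by_preference(preferences, capacity, seed=10).items():
    print(subject, group, sep="\t")
```

Under these assumptions only group 3 is over-subscribed in the first round (four applicants for two places), which agrees with the example output, where two of those four students received their second preference.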

Random number generation

Menu location: Data_Random Numbers.

This function enables you to create one or more series of random numbers from given distributions.

A robust generator of uniform (pseudo-)random numbers is used as the basis for generating deviates from the probability distributions described below. You are given the opportunity to enter your own seed number to be used by the random number generator but you should use the default seed (based upon your computer's clock) in most cases. Please note that each seed generates its own series and that series is the same if you use the seed again. You have very little chance of using the same seed twice if you select the seed that StatsDirect suggests.

These functions are intended for simulation work; they employ widely cited and debated algorithms (Gentle, 2003).

Uniform (continuous uniform, rectangular)

The continuous uniform distribution has a constant density function over the interval (a, b) and thus a rectangular shape from a to b:

$$f(x) = \frac{1}{b - a}, \quad a < x < b$$

- where a and b lie between minus and plus infinity.

The interval most commonly used is 0 to 1.

Normal (Gaussian)

The commonly used standard normal distribution (mean of 0 and standard deviation of 1) is one of a family of normal distributions defined by the density function:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

- where mean μ lies between minus and plus infinity and standard deviation σ is greater than zero.

Algorithm: inversion of the cumulative distribution function (Wichura, 1988; Gentle, 2003)
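Inversion means feeding a uniform deviate u from (0, 1) into the inverse of the normal cumulative distribution function. The sketch below shows the idea using the inverse CDF from Python's standard library; it is not StatsDirect's routine, and the function name normal_deviates is hypothetical.

```python
import random
from statistics import NormalDist, fmean, stdev


def normal_deviates(n, mean=0.0, sd=1.0, seed=None):
    """Normal deviates obtained by inverting the CDF at uniform deviates."""
    rng = random.Random(seed)
    dist = NormalDist(mean, sd)
    deviates = []
    while len(deviates) < n:
        u = rng.random()  # uniform on [0, 1)
        if u > 0.0:       # inv_cdf is defined only on the open interval (0, 1)
            deviates.append(dist.inv_cdf(u))
    return deviates


sample = normal_deviates(10_000, seed=10)
print(fmean(sample), stdev(sample))  # should be close to 0 and 1
```

One appeal of inversion is that each normal deviate uses exactly one uniform deviate, which keeps a seeded series easy to reproduce.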

Lognormal

The density of a lognormal distribution is given by:

$$f(x) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right), \quad x > 0$$

- where mean μ lies between minus and plus infinity and standard deviation σ is greater than zero.

The mean of the lognormal distribution, as opposed to the mean of the underlying normal distribution, is equal to exp(μ + σ²/2) and the variance is equal to exp(2μ + 2σ²) - exp(2μ + σ²).

Algorithm: transformed inversion of the cumulative distribution function (Wichura, 1988; Gentle, 2003)
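The mean and variance formulas can be checked by simulation. In this sketch, mu and sigma stand for the mean and standard deviation of the underlying normal distribution, deviates are produced by exponentiating normal deviates obtained by inversion, and the function names are hypothetical. It illustrates the relationship only and makes no claim about StatsDirect's code.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def lognormal_moments(mu, sigma):
    """Theoretical mean and variance of the lognormal distribution."""
    mean = math.exp(mu + sigma**2 / 2)
    var = math.exp(2 * mu + 2 * sigma**2) - math.exp(2 * mu + sigma**2)
    return mean, var


def lognormal_deviates(n, mu, sigma, seed=None):
    """exp() of normal deviates generated by inverting the normal CDF."""
    rng = random.Random(seed)
    normal = NormalDist(mu, sigma)
    deviates = []
    while len(deviates) < n:
        u = rng.random()
        if u > 0.0:  # inv_cdf needs 0 < u < 1
            deviates.append(math.exp(normal.inv_cdf(u)))
    return deviates


sample = lognormal_deviates(50_000, mu=0.0, sigma=0.5, seed=10)
print("theory:", lognormal_moments(0.0, 0.5))
print("sample:", (fmean(sample), variance(sample)))
```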

Exponential

The (negative) exponential distribution is a special case (shaping parameter of 1) of the gamma distribution. Its density is given by:

- where the parameter λ must be greater than zero.

Algorithm: transformation (Ahrens & Dieter, 1972).

Gamma

The density function of the gamma distribution is given by:

- where the parameters λ and r (shaping parameter) must be greater than zero.

Γ(*) is the gamma function:

$$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt, \quad x > 0$$

Please note that gamma deviates with a shaping parameter of 0.5 are half the square of normal deviates and gamma deviates with a shaping parameter of 1 are exponential deviates.

Algorithm: acceptance-rejection methods GD and GS (Ahrens & Dieter, 1974, 1982b).

Binomial

The density function of the binomial distribution is given by:

$$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n$$

- where p lies between 0 and 1 and n is the number of trials.

Algorithm: acceptance-rejection method BTPEC (Kachitvichyanukul & Schmeiser, 1988).

Poisson

The density function of the Poisson distribution is given by:

$$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$$

- where parameter λ is greater than zero.

Algorithm: acceptance-rejection (Ahrens & Dieter, 1982a).

Chi-square

The density function of the chi-square distribution is given by:

$$f(x) = \frac{x^{n/2 - 1} e^{-x/2}}{2^{n/2}\,\Gamma(n/2)}, \quad x > 0$$

The deviates are calculated as gamma deviates with parameters n/2 and 2, where n is degrees of freedom.

Algorithm: Transformed gamma deviates (Ahrens & Dieter, 1974, 1982b; Gentle, 2003).
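As an illustration, Python's random.gammavariate(alpha, beta) takes a shape alpha and a scale beta, so a chi-square deviate with n degrees of freedom corresponds to gammavariate(n / 2, 2). Reading the "parameters n/2 and 2" in the text as shape and scale is an assumption, and the function name chi_square_deviate is hypothetical.

```python
import random
from statistics import fmean


def chi_square_deviate(df, rng):
    """Chi-square deviate as a gamma deviate with shape df/2 and scale 2."""
    return rng.gammavariate(df / 2, 2)


rng = random.Random(10)
sample = [chi_square_deviate(4, rng) for _ in range(10_000)]
print(fmean(sample))  # the mean of a chi-square distribution equals its df
```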

F (variance ratio)

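The density function of the F distribution is given by:

$$f(x) = \frac{\Gamma\left(\frac{n + d}{2}\right)}{\Gamma\left(\frac{n}{2}\right)\Gamma\left(\frac{d}{2}\right)} \left(\frac{n}{d}\right)^{n/2} x^{n/2 - 1} \left(1 + \frac{n x}{d}\right)^{-(n + d)/2}, \quad x > 0$$

The deviates are calculated as (x/n)/(z/d), where n is the numerator degrees of freedom, d is the denominator degrees of freedom, x is a gamma deviate with parameters n/2 and 2 (chi-square with n degrees of freedom), and z is a gamma deviate with parameters d/2 and 2 (chi-square with d degrees of freedom).

Algorithm: Transformed gamma deviates (Ahrens & Dieter, 1974, 1982b; Gentle, 2003).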

Student's t

The density function of Student's t distribution is given by:

$$f(x) = \frac{\Gamma\left(\frac{n + 1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)} \left(1 + \frac{x^2}{n}\right)^{-(n + 1)/2}$$

The deviates are calculated as a standard normal deviate multiplied by the square root of the degrees of freedom (n) divided by a gamma deviate with parameters n/2 and 2 (a chi-square deviate with n degrees of freedom).

Algorithm: Transformed standard normal and gamma deviates (Ahrens & Dieter, 1974, 1982b; Gentle, 2003).
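The F and t transformations can be sketched in the same way. The F deviate below uses the standard definition, the ratio of two independent chi-square deviates each divided by its own degrees of freedom; the t deviate is a standard normal deviate times the square root of n over a chi-square deviate. Function names are hypothetical, and Python's gammavariate (shape, scale) and gauss stand in for whatever gamma and normal generators StatsDirect uses.

```python
import math
import random
from statistics import fmean


def f_deviate(n, d, rng):
    """F deviate with n numerator and d denominator degrees of freedom."""
    x = rng.gammavariate(n / 2, 2)  # chi-square with n degrees of freedom
    z = rng.gammavariate(d / 2, 2)  # chi-square with d degrees of freedom
    return (x / n) / (z / d)


def t_deviate(n, rng):
    """Student's t deviate with n degrees of freedom."""
    normal = rng.gauss(0.0, 1.0)
    chi_square = rng.gammavariate(n / 2, 2)
    return normal * math.sqrt(n / chi_square)


rng = random.Random(10)
f_sample = [f_deviate(5, 10, rng) for _ in range(10_000)]
t_sample = [t_deviate(10, rng) for _ in range(10_000)]
print(fmean(f_sample))  # theoretical mean is d / (d - 2) = 1.25 for d = 10
print(fmean(t_sample))  # theoretical mean is 0
```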

Beta

The density function of the beta distribution is given by:

Algorithm: Acceptance-rejection methods BB and BC (Cheng, 1978).

Logistic

The density function of the logistic distribution is given by:

Algorithm: Transformed uniform deviates (Gentle, 2003).

Cauchy

Algorithm: Transformed uniform deviates (Gentle, 2003).

Weibull

Algorithm: Transformed uniform deviates (Gentle, 2003).
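"Transformed uniform deviates" usually means applying the inverse of the cumulative distribution function to a uniform deviate u from (0, 1). The sketch below shows the textbook transforms for the logistic, Cauchy and Weibull distributions. The parameter names (location, scale, shape) are assumptions, since the text does not give the parameterizations StatsDirect uses, and all function names are hypothetical.

```python
import math
import random


def open_uniform(rng):
    """Uniform deviate strictly inside (0, 1)."""
    u = rng.random()
    while u == 0.0:  # random() can return 0.0 but never 1.0
        u = rng.random()
    return u


def logistic_deviate(rng, location=0.0, scale=1.0):
    u = open_uniform(rng)
    return location + scale * math.log(u / (1.0 - u))


def cauchy_deviate(rng, location=0.0, scale=1.0):
    u = open_uniform(rng)
    return location + scale * math.tan(math.pi * (u - 0.5))


def weibull_deviate(rng, shape, scale=1.0):
    u = open_uniform(rng)
    # 1 - u is also uniform on (0, 1), so -log(u) is a unit exponential deviate.
    return scale * (-math.log(u)) ** (1.0 / shape)


rng = random.Random(10)
print(logistic_deviate(rng), cauchy_deviate(rng), weibull_deviate(rng, shape=2.0))
```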

Geometric