Power and Sample Size Overview
Use Minitab's power and sample size capabilities to evaluate power
and sample size either before you design and run an experiment (a prospective
study) or after you have performed an experiment (a retrospective study).
· A prospective study is used before collecting data to consider design
sensitivity. You want to be sure that you have enough power to detect
differences (effects) that you have determined to be important. For example,
you can increase the design sensitivity by increasing the sample size or
by taking measures to decrease the error variance.
· A retrospective study is used after collecting data to help understand
the power of the tests that you have performed. For example, suppose you
conduct an experiment and the data analysis does not reveal any statistically
significant results. You can then calculate power based on the minimum
difference (effect) you wish to detect. If the power to detect this difference
is low, you may want to modify your experimental design to increase the
power and continue to evaluate the same problem. However, if the power
is high, you may want to conclude that there is no meaningful difference
(effect) and discontinue experimentation. (A quick way to perform such
a retrospective calculation is sketched after this list.)
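As an illustration, a retrospective power calculation like this can also
be reproduced outside Minitab. The following is a minimal sketch in Python
using statsmodels, assuming a two-sample t-test with 15 observations per
group, an estimated standard deviation of 8, and a minimum important
difference of 5; all of these values are illustrative, not taken from an
actual study.

    from statsmodels.stats.power import TTestIndPower

    # Illustrative inputs: 15 observations per group, pooled standard
    # deviation about 8, and 5 units as the smallest difference that matters.
    effect_size = 5 / 8  # Cohen's d: difference divided by standard deviation

    power = TTestIndPower().power(effect_size=effect_size, nobs1=15,
                                  alpha=0.05, alternative='two-sided')
    print(f"Retrospective power: {power:.2f}")  # roughly 0.4 for these inputs

A power this low would suggest collecting more data before concluding that
no meaningful effect exists.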
Minitab provides power, sample size, and difference (effect) calculations
(as well as the number of center points for factorial and Plackett-Burman
designs) for a variety of procedures.
You can also calculate sample size or margin of error for a parameter
estimate or a tolerance
interval.
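As one concrete case, the sample size for a desired margin of error on
a mean can be computed from the usual normal-approximation formula
n = (zσ/E)². The sketch below assumes σ is known and uses illustrative
numbers; the function name is ours, not a Minitab or library routine.

    import math
    from scipy.stats import norm

    def n_for_margin(sigma, margin, confidence=0.95):
        """Smallest n so a confidence interval for the mean has
        half-width at most `margin` (sigma treated as known)."""
        z = norm.isf((1 - confidence) / 2)  # upper-tail critical value
        return math.ceil((z * sigma / margin) ** 2)

    # e.g. sigma = 10 and a desired margin of error of 2 -> 97 observations
    print(n_for_margin(sigma=10, margin=2))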
What is Power?
Power is the likelihood that you will identify a significant difference
(effect) when one truly exists. There are four possible outcomes for a
hypothesis test.
The outcomes depend on whether the null hypothesis (H0)
is true or false and whether you decide to "reject" or "fail
to reject" H0. The power of
a test is the probability of correctly rejecting H0
when it is false.
The four possible outcomes are summarized below:
                        Null Hypothesis
  Decision              True                  False
  fail to reject H0     correct decision      Type II error
                        p = 1 - α             p = β
  reject H0             Type I error          correct decision
                        p = α                 p = 1 - β
When H0 is true and you reject it, you make a Type I error.
The probability (p) of making a Type I error is called alpha (α)
and is sometimes referred to as the level of significance for the test.
When H0 is false and you fail to reject it, you make a Type II error.
The probability (p) of making a Type II error is called beta (β).
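Both error rates can be checked by simulation. The sketch below repeatedly
draws two samples, runs a two-sample t-test, and counts rejections: when
H0 is true, the rejection rate should be close to alpha; when H0 is false,
it estimates the power. The distribution parameters are illustrative.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    alpha, n, trials = 0.05, 20, 10_000

    def rejection_rate(true_diff):
        """Fraction of simulated two-sample t-tests that reject H0 (equal means)."""
        rejections = 0
        for _ in range(trials):
            a = rng.normal(0.0, 1.0, n)
            b = rng.normal(true_diff, 1.0, n)
            if stats.ttest_ind(a, b).pvalue < alpha:
                rejections += 1
        return rejections / trials

    print(f"H0 true  (diff = 0): rejection rate = {rejection_rate(0.0):.3f}")  # near alpha
    print(f"H0 false (diff = 1): rejection rate = {rejection_rate(1.0):.3f}")  # the power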
Choosing probability levels
When you are determining the α and β values for your test, you should
consider the following:
· The severity of making an error. The more serious the error, the less
often you will be willing to allow it to occur. Therefore, you should
assign smaller probability values to more serious errors.
· The magnitude of the effect you want to detect. Power is the probability
(p = 1 - β) of correctly rejecting H0 when it is false. Ideally, you want
to have high power to detect a difference that you care about, and low
power for a meaningless difference.
For example, suppose you manufacture storage containers
and want to evaluate a new, potentially more heat-tolerant plastic. The
expense is worth considering if the new plastic increases the mean melting
point of your product by 20°
or more. Testing more samples increases the chance of detecting such a
difference, but testing too many samples increases time and expense and
can result in detecting unimportant differences. You could use Power
and Sample Size for 2-Sample t
to estimate how many samples are needed in order to detect a difference
of 20°
with sufficient power.
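Outside Minitab, the same sample-size calculation can be sketched with
statsmodels, assuming the standard deviation of melting points is about
25° and targeting 80% power; both values are illustrative and you would
substitute your own estimates.

    from math import ceil
    from statsmodels.stats.power import TTestIndPower

    sigma = 25.0                # assumed standard deviation (illustrative)
    effect_size = 20.0 / sigma  # the 20-degree difference, in sd units

    n = TTestIndPower().solve_power(effect_size=effect_size, alpha=0.05,
                                    power=0.80, alternative='two-sided')
    print(f"Samples needed per group: {ceil(n)}")  # about 26 for these inputs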
Factors that influence power
A number of factors influence power (each is illustrated in the sketch
after this list):
· α, the probability of a Type I error (also called the level of
significance). As α increases, the probability of a Type II error (β)
decreases. Hence, as α increases, power (which equals 1 - β) also increases.
· σ, the variability in the population (or experimental variability).
As σ decreases, power increases.
· The size of the effect. As the size of the effect increases, power
increases.
· Sample size. As sample size increases, power increases.
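Each of these factors can be seen numerically with the same statsmodels
power solver used above; the baseline values are illustrative. A smaller
σ enters through a larger standardized effect size, since Cohen's d is
the difference divided by σ.

    from statsmodels.stats.power import TTestIndPower

    solver = TTestIndPower()
    base = dict(effect_size=0.5, nobs1=30, alpha=0.05)

    print("baseline               :", round(solver.power(**base), 2))
    print("larger alpha (0.10)    :", round(solver.power(**{**base, 'alpha': 0.10}), 2))
    print("larger effect (d = 0.8):", round(solver.power(**{**base, 'effect_size': 0.8}), 2))
    print("larger sample (n = 60) :", round(solver.power(**{**base, 'nobs1': 60}), 2))

Each variation prints a higher power than the baseline.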