Power and Sample Size Overview

Use Minitab's power and sample size capabilities to evaluate power and sample size before you design and run an experiment (prospective) or after you perform an experiment (retrospective).

·    A prospective study is used before collecting data to consider design sensitivity. You want to be sure that you have enough power to detect differences (effects) that you have determined to be important. For example, you can increase the design sensitivity by increasing the sample size or by taking measures to decrease the error variance.

·    A retrospective study is used after collecting data to help understand the power of the tests that you have performed. For example, suppose you conduct an experiment and the data analysis does not reveal any statistically significant results. You can then calculate power based on the minimum difference (effect) you wish to detect. If the power to detect this difference is low, you may want to modify your experimental design to increase the power and continue to evaluate the same problem. However, if the power is high, you may want to conclude that there is no meaningful difference (effect) and discontinue experimentation.
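A retrospective power calculation like the one described above can be sketched in a few lines. The following is a minimal illustration, assuming a two-sided one-sample Z test with known standard deviation; the normal CDF and its inverse are built from the standard library, and the example numbers (a difference of 5, sigma of 10, n of 30) are hypothetical, not from the source.

```python
from math import erf, sqrt

def norm_cdf(x):
    # Standard normal cumulative distribution function via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def norm_ppf(p):
    # Inverse standard normal CDF by bisection (ample accuracy for this use)
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def z_test_power(diff, sigma, n, alpha=0.05):
    # Power of a two-sided one-sample Z test to detect a true mean shift
    # of `diff`, given population standard deviation `sigma`, sample size
    # `n`, and significance level `alpha`
    z_crit = norm_ppf(1.0 - alpha / 2.0)
    ncp = abs(diff) * sqrt(n) / sigma   # noncentrality: shift in SE units
    return norm_cdf(ncp - z_crit) + norm_cdf(-ncp - z_crit)
```

For example, `z_test_power(5, 10, 30)` is roughly 0.78: if the true difference were 5 with sigma 10 and n = 30, the test would detect it about 78% of the time. A low value here is the signal, described above, that a nonsignificant result may simply reflect an underpowered design.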

Minitab provides power, sample size, and difference (effect) calculations (also the number of center points for factorial and Plackett-Burman designs) for the following procedures:

·    one-sample Z

·    one-sample t

·    two-sample t

·    paired t

·    one variance test

·    two variances test

·    one-sample proportion

·    two-sample proportion

·    one-sample Poisson rate

·    two-sample Poisson rate

·    one-way analysis of variance

·    two-level factorial designs

·    Plackett-Burman designs

·    General Full Factorial designs

 

For equivalence tests, see Power for Equivalence Tests.

You can also calculate sample size or margin of error for a parameter estimate or a tolerance interval.

What is Power?

Power is the likelihood that you will identify a significant difference (effect) when one truly exists. There are four possible outcomes for a hypothesis test. The outcomes depend on whether the null hypothesis (H0) is true or false and whether you decide to "reject" or "fail to reject" H0. The power of a test is the probability of correctly rejecting H0 when it is false.

The four possible outcomes are summarized below:

 

                          Null Hypothesis
Decision                  True                      False
fail to reject H0         correct decision          Type II error
                          p = 1 - α                 p = β
reject H0                 Type I error              correct decision
                          p = α                     p = 1 - β

When H0 is true and you reject it, you make a Type I error. The probability (p) of making a Type I error is called alpha (α) and is sometimes referred to as the level of significance for the test.

When H0 is false and you fail to reject it, you make a Type II error. The probability (p) of making a Type II error is called beta (β).
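Both error rates can be seen directly by simulation. The sketch below repeatedly runs a two-sided one-sample Z test (with known sigma), first with H0 true to estimate α, then with H0 false to estimate power (1 - β). The scenario values (mean 50, sigma 10, shift of 5, n = 25) are hypothetical, chosen only for illustration.

```python
import random
from math import sqrt

def reject_h0(sample, mu0, sigma, z_crit=1.96):
    # Two-sided one-sample Z test with known sigma at alpha = 0.05
    n = len(sample)
    z = (sum(sample) / n - mu0) * sqrt(n) / sigma
    return abs(z) > z_crit

random.seed(1)
TRIALS, N = 20000, 25

# H0 true (true mean equals mu0 = 50): rejection rate estimates alpha
type1 = sum(reject_h0([random.gauss(50, 10) for _ in range(N)], 50, 10)
            for _ in range(TRIALS)) / TRIALS

# H0 false (true mean is 55, a shift of 5): rejection rate estimates power
power = sum(reject_h0([random.gauss(55, 10) for _ in range(N)], 50, 10)
            for _ in range(TRIALS)) / TRIALS
```

The simulated `type1` rate should sit near 0.05 (the α implied by the 1.96 cutoff), while `power` should land near the theoretical value of about 0.70 for this shift and sample size.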

Choosing probability levels

When you determine the α and β values for your test, consider the following:

·    The severity of making an error. The more serious the error, the less often you should be willing to allow it to occur. Therefore, assign smaller probability values to more serious errors.

·    The magnitude of the effect that you want to detect. Power is the probability (p = 1 - β) of correctly rejecting H0 when it is false. Ideally, you want high power to detect a difference that you care about, and low power for a meaningless difference.

For example, suppose you manufacture storage containers and want to evaluate a new, potentially more heat-tolerant plastic. The expense is worth considering if the new plastic increases the mean melting point of your product by 20° or more. Testing more samples increases the chance of detecting such a difference, but testing too many samples increases time and expense and can result in detecting unimportant differences. You could use Power and Sample Size for 2-Sample t to estimate how many samples are needed in order to detect a difference of 20° with sufficient power.
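The sample-size question in the storage-container example can be sketched with the usual normal approximation to the 2-sample t calculation (Minitab's own routine uses the noncentral t distribution, so its answer may differ slightly). The assumed standard deviation of 15° and target power of 0.9 are hypothetical values for illustration.

```python
from math import ceil, erf, sqrt

def norm_cdf(x):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def norm_ppf(p):
    # Inverse standard normal CDF by bisection
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def two_sample_n(delta, sigma, alpha=0.05, power=0.9):
    # Per-group sample size to detect a mean difference `delta` between
    # two groups with common sd `sigma` (normal approximation):
    #   n = 2 * ((z_{1-alpha/2} + z_{power}) * sigma / delta)^2
    z_a = norm_ppf(1.0 - alpha / 2.0)
    z_b = norm_ppf(power)
    return ceil(2.0 * ((z_a + z_b) * sigma / delta) ** 2)
```

With a 20° difference to detect and an assumed sigma of 15°, `two_sample_n(20, 15)` gives 12 specimens per group for 90% power, illustrating how the required n shrinks as the difference of interest grows relative to the variability.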

Factors that influence power

A number of factors influence power:

·    α, the probability of a Type I error (also called the level of significance). As α increases, the probability of a Type II error (β) decreases. Hence, as α increases, power (which equals 1 - β) also increases.

·    σ, the variability in the population (or experimental variability). As σ decreases, power increases.

·    the size of the effect. As the size of the effect increases, power increases.

·    sample size. As sample size increases, power increases.
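The sample-size factor above is easy to see numerically. This sketch uses the two-sided Z-test power formula under a normal approximation, holding the difference (5) and sigma (10) fixed while n grows; all values are hypothetical.

```python
from math import erf, sqrt

def norm_cdf(x):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def power_z(diff, sigma, n, z_crit=1.96):
    # Two-sided Z-test power at alpha = 0.05 (z_crit = 1.96)
    ncp = abs(diff) * sqrt(n) / sigma
    return norm_cdf(ncp - z_crit) + norm_cdf(-ncp - z_crit)

# With diff = 5 and sigma = 10 fixed, power climbs as n doubles:
powers = [power_z(5, 10, n) for n in (10, 20, 40, 80)]
```

The same function also shows the other factors: increasing `diff` or decreasing `sigma` raises the noncentrality term and therefore the power, matching the list above.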