This is the basic z-test, used to determine whether the population mean, μ, for which the sample mean provides the best estimate, has a true value a, when the population standard deviation, σ, is known (or where a good estimate, s, is available based on a sample size n>30). If the standard deviation is not known and the sample size is small, a t-test rather than a z-test should be used.

Assumptions: The sample is random and the population is Normally distributed.

Hypothesis: H0: μ=a; H1: μ≠a

Test: Compute the z-statistic, where x̄ denotes the sample mean:

$$z = \frac{\bar{x} - a}{\sigma/\sqrt{n}}$$

z is distributed approximately N(0,1). The denominator in these expressions is the standard deviation of the mean, or standard error. The larger the sample size, n, the smaller will be the standard error (for a given variance), which in turn means the z-statistic will be larger for a given difference between the sample mean and a. This observation can be used to help determine the sample size necessary to achieve specific levels of risk (see sample size, below).
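The effect of sample size on the z-statistic can be illustrated with a minimal sketch. The function name `z_statistic` and the numerical values are illustrative assumptions, not taken from the text; the formula is the standard one-sample form with known σ.

```python
import math

def z_statistic(sample_mean, a, sigma, n):
    """One-sample z-statistic: (sample_mean - a) / (sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)  # standard deviation of the mean
    return (sample_mean - a) / standard_error

# Same difference (0.5) and same sigma (2.0), with increasing sample size:
# the standard error shrinks, so z grows.
for n in (4, 16, 64):
    print(n, z_statistic(10.5, 10.0, 2.0, n))
```

Quadrupling n halves the standard error and so doubles z for the same observed difference.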

If the population is finite, of size N, the z-statistic should be amended to:

$$z = \frac{\bar{x} - a}{k\,\sigma/\sqrt{n}}, \qquad k = \sqrt{\frac{N-n}{N-1}}$$

This correction increases the magnitude of the standard z-statistic for all N>n>1, so smaller differences tend to be more significant in finite populations than in infinite populations. If n=1 both formulas reduce to

$$z = \frac{x - a}{\sigma}$$

so for a sample of 1 the z-statistic is simply the z-transform of the observed value.
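As a rough sketch of the finite-population case, the code below assumes the usual finite population correction factor, sqrt((N-n)/(N-1)), which is consistent with the statement that both formulas coincide when n=1. The function name and the example numbers are hypothetical.

```python
import math

def z_statistic_finite(sample_mean, a, sigma, n, N):
    """z-statistic using the finite population correction sqrt((N - n)/(N - 1)).

    The form of the correction factor is an assumption (the standard one).
    """
    if not 1 <= n < N:
        raise ValueError("requires 1 <= n < N")
    k = math.sqrt((N - n) / (N - 1))
    return (sample_mean - a) / (k * sigma / math.sqrt(n))

# Illustrative values: difference 0.5, sigma 2.0, n = 25 drawn from N = 100.
z_infinite = (10.5 - 10.0) / (2.0 / math.sqrt(25))
z_finite = z_statistic_finite(10.5, 10.0, 2.0, 25, 100)
print(z_infinite, z_finite)  # the corrected statistic is the larger of the two

# With n = 1 the correction factor is 1, leaving the plain z-transform.
print(z_statistic_finite(12.0, 10.0, 2.0, 1, 100))
```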

Examples: If the observed value obtained is z=+1.96, the cumulative Normal distribution gives the probability that a result no larger than 1.96 will occur at random. The expression in Excel is: =NORMDIST(1.96,0,1,1) which gives the result 0.975 or 97.5%. Thus 2.5% of samples will show a difference of this size or larger (one-sided test). Likewise, if z=-1.96 the result would be 0.025, so 2.5% of results would be this size or smaller. As a two-sided test the total probability of a difference of this magnitude (ignoring the sign) is 5%.
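The same probabilities can be reproduced outside Excel. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `tail_probabilities` is an illustrative assumption.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def tail_probabilities(z):
    """Return (lower-tail, upper-tail, two-sided) probabilities for a z value."""
    lower = std_normal.cdf(z)            # same quantity as NORMDIST(z,0,1,1)
    upper = 1.0 - lower                  # probability of a value this large or larger
    two_sided = 2.0 * min(lower, upper)  # magnitude only, ignoring the sign
    return lower, upper, two_sided

lower, upper, two_sided = tail_probabilities(1.96)
print(round(lower, 3), round(upper, 3), round(two_sided, 3))
```

Calling `tail_probabilities(-1.96)` swaps the lower and upper tails and leaves the two-sided probability unchanged.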

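Confidence intervals: The 100(1-α)% confidence interval for the population mean, based on the sample mean x̄ and with z_{α/2} denoting the upper α/2 critical value of N(0,1), is:

$$\bar{x} \pm z_{\alpha/2}\, k\, \frac{\sigma}{\sqrt{n}}$$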

where k=1 for an infinite population, or as defined above for a finite population. Thus the 95% confidence interval is the sample mean +/-1.96 times the standard error.
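A minimal sketch of the interval calculation follows. It assumes the finite population factor k = sqrt((N-n)/(N-1)); the function name, the optional `N` argument, and the example values are hypothetical.

```python
import math
from statistics import NormalDist

def confidence_interval(sample_mean, sigma, n, alpha=0.05, N=None):
    """100(1 - alpha)% interval: sample_mean +/- z * k * sigma / sqrt(n).

    k = 1 for an infinite population (N is None); otherwise the assumed
    finite population correction sqrt((N - n)/(N - 1)) is applied.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = 0.05
    k = 1.0 if N is None else math.sqrt((N - n) / (N - 1))
    half_width = z_crit * k * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Illustrative values: mean 10, sigma 2, n = 100.
low, high = confidence_interval(10.0, 2.0, 100)
print(round(low, 3), round(high, 3))

# The same sample drawn from a finite population of 500 gives a narrower interval.
low_f, high_f = confidence_interval(10.0, 2.0, 100, N=500)
print(round(low_f, 3), round(high_f, 3))
```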

Sample size:

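If we seek a confidence interval of a specific size, say +/-T, we can equate this to the confidence interval expression (for an infinite population) and solve for n:

$$T = z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \quad\Rightarrow\quad n = \left(\frac{z_{\alpha/2}\,\sigma}{T}\right)^{2}$$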

Thus for a 95% confidence interval, and a standard deviation of 1, say, we can specify T to determine n, and experiment with variations in T to see how this affects the sample size required. For example, with T=0.1, n≈384, whereas with T=0.5, n≈15 (values rounded to the nearest integer; rounding up, to be sure the interval is no wider than T, gives 385 and 16). Notice that it is the ratio of the standard deviation to the size of confidence interval sought that determines the sample size required.
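The sample size figures quoted here can be checked with a short sketch, assuming an infinite population and the relation n = (zσ/T)². The function name is an illustrative assumption.

```python
import math
from statistics import NormalDist

def sample_size_for_half_width(T, sigma, alpha=0.05):
    """Unrounded n such that z * sigma / sqrt(n) equals the half-width T."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return (z_crit * sigma / T) ** 2

for T in (0.1, 0.5):
    n = sample_size_for_half_width(T, sigma=1.0)
    # Print the unrounded value and the next whole sample size above it.
    print(T, round(n, 2), math.ceil(n))
```

Because only the ratio σ/T enters the formula, scaling σ and T by the same factor leaves n unchanged.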

Suppose the true value of the mean is b, not a. There is a risk that we will accept a as the true value (or rather, not reject a) when in fact b is the true value (a Type II error). The size of the difference between b and a can be described as some multiple, h, of the standard error:

$$b - a = h\,\frac{\sigma}{\sqrt{n}}$$

and we can solve this expression for n, the sample size:

$$n = \left(\frac{h\,\sigma}{b - a}\right)^{2}$$

Let us assume that a>b, and we have effectively two alternative hypotheses:

H0: μ=a - this is our main hypothesis, for which we are only prepared to accept a small risk, e.g. 5%, that we reject this hypothesis when it is in fact true (a Type I error) based on an estimated value c which is less than a; and

H1: μ=b - this is the alternative hypothesis, for which we are perhaps prepared to accept a larger risk, e.g. 20%, that the hypothesis is rejected when it is true, based on an estimated value c which is greater than b. We can then use the Normal distribution probabilities associated with one-sided tests of 5% and 20% respectively to give us:

$$c = a - 1.645\,\frac{\sigma}{\sqrt{n}}, \qquad c = b + 0.85\,\frac{\sigma}{\sqrt{n}}$$

Equating these two expressions we have

$$a - 1.645\,\frac{\sigma}{\sqrt{n}} = b + 0.85\,\frac{\sigma}{\sqrt{n}}$$

and solving for n we have the expression we obtained earlier, although in this instance the denominator is a-b, with h=1.645+0.85=2.495. For example, if a=1, b=0, and σ=1, we have n=6.2, so a sample size of 7 would be sufficient. Note that from this result we can also obtain a value for c, which in this case is approximately 0.32 (from c=b+0.85σ/√n with n=7), thus any sample mean value >0.32 would be sufficient for accepting (not rejecting) H0.
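The worked example can be reproduced with the sketch below. It is an illustration under stated assumptions: the function name is hypothetical, and it uses exact Normal quantiles (about 1.645 and 0.842) rather than the rounded 0.85 used in the text, so the unrounded n differs slightly in the second decimal place.

```python
import math
from statistics import NormalDist

def two_risk_sample_size(a, b, sigma, alpha=0.05, beta=0.20):
    """Sample size for one-sided Type I risk alpha and Type II risk beta (a > b).

    Returns the unrounded n, n rounded up, and the critical value c
    computed from each hypothesis using the rounded-up n.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha)  # about 1.645
    z_beta = std_normal.inv_cdf(1.0 - beta)    # about 0.842 (the text rounds to 0.85)
    n_exact = ((z_alpha + z_beta) * sigma / (a - b)) ** 2
    n = math.ceil(n_exact)
    standard_error = sigma / math.sqrt(n)
    c_from_h0 = a - z_alpha * standard_error
    c_from_h1 = b + z_beta * standard_error
    return n_exact, n, c_from_h0, c_from_h1

n_exact, n, c_from_h0, c_from_h1 = two_risk_sample_size(a=1.0, b=0.0, sigma=1.0)
print(round(n_exact, 2), n, round(c_from_h0, 2), round(c_from_h1, 2))
```

Because n is rounded up from its unrounded value, the two expressions for c no longer coincide exactly; the value of about 0.32 corresponds to the expression based on b, and any c between the two values keeps both risks within the stated limits.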

Special graphs, known as Operating Characteristic curves, provide plots of the relationship between sample size and the two main types of error (see Ferris et al., 1946, [FER1] for a number of such charts covering χ2, F, Normal or z-tests and t-tests).

References

[FER1] Ferris C D, Grubbs F E, Weaver C L (1946) Operating Characteristics for the Common Statistical Tests of Significance. Annals of Mathematical Statistics, 17(2), 178-197.