Large-Sample Confidence Interval for the Population Mean. We have assumed that the
population distribution is normal with unknown mean and known standard deviation $\sigma$. We now
present a large-sample CI for $\mu$ that does not require these assumptions. Let $X_1, X_2, \ldots, X_n$
be a random sample from a population with unknown mean $\mu$ and variance $\sigma^2$. If the sample
size $n$ is large, the central limit theorem implies that $\bar{X}$ has approximately a normal
distribution with mean $\mu$ and variance $\sigma^2/n$. Therefore,
$Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ has approximately a standard normal distribution. This ratio
could be used as a pivotal quantity and manipulated as above to produce an approximate CI for
$\mu$. However, the standard deviation $\sigma$ is unknown. It turns out that when $n$ is large,
replacing $\sigma$ by the sample standard deviation $S$ has little effect on the distribution of $Z$.
This leads to the following useful result.
Large-Sample Confidence Interval on the Mean: when $n$ is large, the quantity
$$\frac{\bar{X} - \mu}{S/\sqrt{n}}$$
has an approximate standard normal distribution. Consequently,
$$\bar{x} - z_{\alpha/2}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{s}{\sqrt{n}} \tag{2.26}$$
is a large-sample confidence interval for $\mu$, with confidence level of approximately
$100(1 - \alpha)\%$.
Equation (2.26) holds regardless of the shape of the population distribution. Generally $n$ should
be at least 40 to use this result reliably. The central limit theorem generally holds for $n \ge 30$, but
the larger sample size is recommended here because replacing $\sigma$ by $S$ in $Z$ results in additional
variability.
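As an illustration of Equation (2.26), the following is a minimal Python sketch using only the standard library. The function name `large_sample_mean_ci` and the simulated exponential data are assumptions made for this example and do not come from the text; the skewed data are chosen only to emphasize that normality of the population is not required.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def large_sample_mean_ci(data, alpha=0.05):
    """Approximate 100(1 - alpha)% CI for the mean, following Equation (2.26).

    Hypothetical helper for illustration; intended for large n (about 40 or more).
    """
    n = len(data)
    xbar = mean(data)                        # sample mean
    s = stdev(data)                          # sample standard deviation (n - 1 divisor)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    half_width = z * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


if __name__ == "__main__":
    # Simulated, clearly non-normal data (assumption: exponential with mean 2).
    random.seed(1)
    sample = [random.expovariate(0.5) for _ in range(60)]
    lower, upper = large_sample_mean_ci(sample, alpha=0.05)
    print(f"approximate 95% CI for the mean: ({lower:.3f}, {upper:.3f})")
```

The interval is centered at the sample mean, and its half-width shrinks in proportion to $1/\sqrt{n}$, exactly as in (2.26).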
Large-Sample Confidence Interval for a Parameter. The large-sample confidence interval for
$\mu$ in (2.26) is a special case of a more general result. Suppose that $\theta$ is a parameter of a
probability distribution, and let $\hat{\theta}$ be an estimator of $\theta$. If $\hat{\theta}$ (1) has an
approximate normal distribution, (2) is approximately unbiased for $\theta$, and (3) has standard
deviation $\sigma_{\hat{\theta}}$ that can be estimated from the sample data, then the quantity
$(\hat{\theta} - \theta)/\sigma_{\hat{\theta}}$ has an approximate standard normal distribution. Then
a large-sample approximate CI for $\theta$ is given by
$$\hat{\theta} - z_{\alpha/2}\,\sigma_{\hat{\theta}} \le \theta \le \hat{\theta} + z_{\alpha/2}\,\sigma_{\hat{\theta}}. \tag{2.27}$$
Finally, note that Equation (2.27) can be used even when $\sigma_{\hat{\theta}}$ is a function of other
unknown parameters (or of $\theta$). Essentially, all one does is to use the sample data to compute
estimates of the unknown parameters and substitute those estimates into the expression for
$\sigma_{\hat{\theta}}$.
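To make the substitution step in Equation (2.27) concrete, here is a hedged sketch for one familiar case that is not taken from the text: a binomial proportion $p$, where $\hat{\theta} = \hat{p} = x/n$ and $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$ depends on the unknown $p$ itself. The function name `large_sample_proportion_ci` and the counts used in the demo are hypothetical.

```python
import math
from statistics import NormalDist


def large_sample_proportion_ci(successes, n, alpha=0.05):
    """Approximate 100(1 - alpha)% CI for a proportion via Equation (2.27).

    Hypothetical helper: the unknown p inside the standard deviation
    sqrt(p(1 - p)/n) is replaced by its estimate p_hat.
    """
    p_hat = successes / n                          # estimator theta-hat
    se_hat = math.sqrt(p_hat * (1 - p_hat) / n)    # estimated sigma of theta-hat
    z = NormalDist().inv_cdf(1 - alpha / 2)        # upper alpha/2 point of N(0, 1)
    return p_hat - z * se_hat, p_hat + z * se_hat


if __name__ == "__main__":
    # Made-up counts for illustration: 40 successes in 100 trials.
    lower, upper = large_sample_proportion_ci(40, 100)
    print(f"approximate 95% CI for p: ({lower:.4f}, {upper:.4f})")
```

The same pattern applies to any estimator satisfying conditions (1) to (3): compute the estimate, plug estimates into the formula for its standard deviation, and apply (2.27).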
Confidence interval on the mean of a normal distribution, variance unknown
When constructing confidence intervals on the mean $\mu$ of a normal population with known
$\sigma^2$, we can use the procedure in the previous subsection. This CI is also approximately valid
(because of the central limit theorem) regardless of whether or not the underlying population is
normal, so long as $n$ is reasonably large ($n \ge 40$, say). We can even handle the case of unknown
variance in the large-sample situation. However, when the sample is small and $\sigma^2$ is
unknown, we must make an assumption about the form of the underlying distribution to obtain
a valid CI procedure. A reasonable assumption in many cases is that the underlying distribution is
normal.
Many populations encountered in practice are well approximated by the normal distribution,
so this assumption will lead to confidence interval procedures of wide applicability. In fact,
moderate departure from normality will have little effect on validity. When the assumption is
unreasonable, an alternative is to use nonparametric statistical procedures that are valid for any
underlying distribution.