
Hypothesis testing


where $\Phi(z)$ is the cumulative distribution function for the standard Gaussian. For a 10%
significance level we have $\alpha = 0.1$ and we find $\bar{x}_{\rm crit} = 0.128$. Thus the rejection region on $\bar{x}$
is $\bar{x} > 0.128$. From the sample, we deduce that $\bar{x} = 1.11$, and so we can clearly reject the null
hypothesis $H_0 : \mu = 0$ at the 10% significance level. It can, in fact, be rejected at a much higher
significance level.


                     Student’s t-test


Student’s t-test is a special test applied to a sample $x_1, x_2, \ldots, x_N$ drawn independently from a
Gaussian distribution for which both the mean $\mu$ and variance $\sigma^2$ are unknown, and for which one
wishes to distinguish between the hypotheses

$$H_0 : \mu = \mu_0,\ 0 < \sigma^2 < \infty, \quad\text{and}\quad H_1 : \mu \neq \mu_0,\ 0 < \sigma^2 < \infty,$$

where $\mu_0$ is a given number. Here, the parameter space $A$ is the half-plane $-\infty < \mu < \infty$,
$0 < \sigma^2 < \infty$, whereas the subspace $S$ characterized by the null hypothesis $H_0$ is the line $\mu = \mu_0$,
$0 < \sigma^2 < \infty$. The likelihood function for this situation is given by

$$L(x; \mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{\sum_i (x_i - \mu)^2}{2\sigma^2}\right].$$
On the one hand, the values of $\mu$ and $\sigma^2$ that maximise $L$ in $A$ are $\mu = \bar{x}$ and $\sigma^2 = s^2$, where $\bar{x}$ is
the sample mean and $s^2$ is the sample variance. On the other hand, to maximise $L$ in the subspace
$S$ we set $\mu = \mu_0$, and the only remaining parameter is $\sigma^2$; the value of $\sigma^2$ that maximises $L$ is then
easily found to be

$$\hat{\sigma}^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu_0)^2.$$
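The two maximisations can be checked numerically. The following is a minimal sketch, not part of the original text: the sample `x` and the null value `mu0` are made-up illustrative numbers, the function name `log_likelihood` is hypothetical, and the sample variance is assumed to use the 1/N convention (the one that maximises the likelihood).

```python
import math

def log_likelihood(x, mu, sigma2):
    """Logarithm of the Gaussian likelihood L(x; mu, sigma^2) for an independent sample."""
    n = len(x)
    ss = sum((xi - mu) ** 2 for xi in x)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)

# Hypothetical sample and null value, chosen only for illustration.
x = [1.2, 0.4, 2.1, 1.7, 0.9]
mu0 = 0.5
n = len(x)

xbar = sum(x) / n
s2 = sum((xi - xbar) ** 2 for xi in x) / n          # maximiser over A (1/N convention assumed)
sigma2_hat = sum((xi - mu0) ** 2 for xi in x) / n   # maximiser over S, with mu fixed at mu0

# Moving sigma^2 away from sigma2_hat (mu held at mu0) lowers the log-likelihood.
best = log_likelihood(x, mu0, sigma2_hat)
for factor in (0.8, 0.95, 1.05, 1.25):
    assert log_likelihood(x, mu0, factor * sigma2_hat) < best
```

Working with the logarithm of $L$ avoids underflow for large $N$ and does not change the location of the maximum.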
               To retain, in due course, the standard notation for Student’s t-test, in this section we will denote
               the generalised likelihood ratio by λ (rather than t); it is thus given by
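$$\lambda(x) = \frac{L(x; \mu_0, \hat{\sigma}^2)}{L(x; \bar{x}, s^2)} = \frac{\left[(2\pi/N) \sum_i (x_i - \mu_0)^2\right]^{-N/2} \exp(-N/2)}{\left[(2\pi/N) \sum_i (x_i - \bar{x})^2\right]^{-N/2} \exp(-N/2)} = \left[\frac{\sum_i (x_i - \bar{x})^2}{\sum_i (x_i - \mu_0)^2}\right]^{N/2}. \tag{3.4}$$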
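As a sanity check on (3.4), the ratio of the two maximised likelihoods can be compared with the closed form in terms of sums of squares. This is an illustrative sketch only: the function names and the sample values are hypothetical, and the 1/N convention for $s^2$ is assumed.

```python
import math

def likelihood(x, mu, sigma2):
    """Gaussian likelihood L(x; mu, sigma^2) for an independent sample."""
    n = len(x)
    ss = sum((xi - mu) ** 2 for xi in x)
    return (2 * math.pi * sigma2) ** (-n / 2) * math.exp(-ss / (2 * sigma2))

def glr_from_likelihoods(x, mu0):
    """lambda as the ratio of the likelihood maximised in S to that maximised in A."""
    n = len(x)
    xbar = sum(x) / n
    s2 = sum((xi - xbar) ** 2 for xi in x) / n
    sigma2_hat = sum((xi - mu0) ** 2 for xi in x) / n
    return likelihood(x, mu0, sigma2_hat) / likelihood(x, xbar, s2)

def glr_from_sums(x, mu0):
    """lambda from the final form of (3.4): a ratio of sums of squares to the power N/2."""
    n = len(x)
    xbar = sum(x) / n
    num = sum((xi - xbar) ** 2 for xi in x)
    den = sum((xi - mu0) ** 2 for xi in x)
    return (num / den) ** (n / 2)

# Hypothetical sample and null value, for illustration only.
x = [1.2, 0.4, 2.1, 1.7, 0.9]
mu0 = 0.5
print(glr_from_likelihoods(x, mu0), glr_from_sums(x, mu0))
```

The two printed values should agree to rounding error, and both lie between 0 and 1 because the maximum over the subspace $S$ cannot exceed the maximum over $A$.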
Normally, our next step would be to find the sampling distribution of $\lambda$ under the assumption that
$H_0$ were true. It is more conventional, however, to work in terms of a related test statistic $t$, which
was first devised by William Gosset, who wrote under the pen name of ‘Student’.
    The sum of squares in the denominator of (3.4) may be put into the form

$$\sum_i (x_i - \mu_0)^2 = N(\bar{x} - \mu_0)^2 + \sum_i (x_i - \bar{x})^2.$$
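The decomposition follows on writing $x_i - \mu_0 = (x_i - \bar{x}) + (\bar{x} - \mu_0)$ and expanding the square; the cross term vanishes because $\sum_i (x_i - \bar{x}) = 0$ by the definition of the sample mean. Spelled out:

$$\sum_i (x_i - \mu_0)^2 = \sum_i (x_i - \bar{x})^2 + 2(\bar{x} - \mu_0)\sum_i (x_i - \bar{x}) + N(\bar{x} - \mu_0)^2 = \sum_i (x_i - \bar{x})^2 + N(\bar{x} - \mu_0)^2.$$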
Thus, on dividing the numerator and denominator in (3.4) by $\sum_i (x_i - \bar{x})^2$ and rearranging, the
generalised likelihood ratio $\lambda$ can be written

$$\lambda = \left(1 + \frac{t^2}{N - 1}\right)^{-N/2},$$
               where we have defined the new variable

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{N - 1}}. \tag{3.5}$$
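The link between $\lambda$ and $t$ rests on $\sum_i (x_i - \bar{x})^2 = N s^2$ (with $s^2$ in the 1/N convention, assumed here), so that $N(\bar{x} - \mu_0)^2 / (N s^2) = t^2/(N - 1)$. The sketch below checks this numerically; the function names and sample values are hypothetical and serve only as an illustration.

```python
import math
import statistics

def t_statistic(x, mu0):
    """t as defined in (3.5), with s the 1/N sample standard deviation."""
    n = len(x)
    xbar = sum(x) / n
    s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / n)
    return (xbar - mu0) / (s / math.sqrt(n - 1))

def glr(x, mu0):
    """Generalised likelihood ratio from the final form of (3.4)."""
    n = len(x)
    xbar = sum(x) / n
    num = sum((xi - xbar) ** 2 for xi in x)
    den = sum((xi - mu0) ** 2 for xi in x)
    return (num / den) ** (n / 2)

# Hypothetical sample and null value, for illustration only.
x = [1.2, 0.4, 2.1, 1.7, 0.9]
mu0 = 0.5
n = len(x)

t = t_statistic(x, mu0)
lam_from_t = (1 + t ** 2 / (n - 1)) ** (-n / 2)

# Equivalent form using the 1/(N-1) standard deviation returned by statistics.stdev.
t_alt = (sum(x) / n - mu0) / (statistics.stdev(x) / math.sqrt(n))

print(t, lam_from_t, glr(x, mu0))
```

Since $\lambda$ decreases monotonically as $|t|$ grows, rejecting $H_0$ for small $\lambda$ is the same as rejecting it for large $|t|$. The alternative form `t_alt` shows that (3.5) coincides with the familiar expression written in terms of the 1/(N-1) standard deviation.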
