
The Neyman-Pearson test


We consider first the choice of rejection region. Even in the general case, in which the test statistic $t$ is a multi-dimensional (vector) quantity, the Neyman-Pearson lemma states that, for a given significance level $\alpha$, the rejection region for $H_0$ giving the highest power for the test is the region of $t$-space for which
\[
\frac{P(t|H_0)}{P(t|H_1)} < c, \tag{3.2}
\]
where $c$ is some constant determined by the required significance level.
In the case where the test statistic $t$ is a simple scalar quantity, the Neyman-Pearson lemma is also useful in deciding which such statistic is the 'best' in the sense of having the maximum power for a given significance level $\alpha$. From (3.2), we can see that the best statistic is given by the likelihood ratio
\[
t(x) = \frac{P(x|H_0)}{P(x|H_1)} \tag{3.3}
\]
and that the corresponding rejection region for $H_0$ is given by $t < t_{\rm crit}$. In fact, it is clear that any statistic $u = f(t)$ will be equally good, provided that $f(t)$ is a monotonically increasing function of $t$. The rejection region is then $u < f(t_{\rm crit})$. Alternatively, one may use any test statistic $v = g(t)$ where $g(t)$ is a monotonically decreasing function of $t$; in this case the rejection region becomes $v > g(t_{\rm crit})$. To construct such statistics, however, one must know $P(x|H_0)$ and $P(x|H_1)$ explicitly, and such cases are rare.
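
The equivalence of these rejection regions can be checked numerically. The following is a minimal sketch in Python; the likelihood-ratio values, the threshold, and the particular choices f(t) = ln t and g(t) = −ln t are hypothetical and serve only as an illustration.

```python
import math

# Hypothetical likelihood-ratio values and threshold (illustration only).
t_values = [0.02, 0.5, 1.0, 3.7, 25.0]
t_crit = 0.8

def f(t):
    """A monotonically increasing function of t."""
    return math.log(t)

def g(t):
    """A monotonically decreasing function of t."""
    return -math.log(t)

# Reject H0 when t < t_crit ...
reject_t = [t < t_crit for t in t_values]
# ... or, equivalently, when u = f(t) < f(t_crit) ...
reject_u = [f(t) < f(t_crit) for t in t_values]
# ... or when v = g(t) > g(t_crit).
reject_v = [g(t) > g(t_crit) for t in t_values]

print(reject_t == reject_u == reject_v)
```

All three lists of decisions coincide, because a monotonic map preserves (or exactly reverses) the ordering of the statistic relative to its critical value.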

Example 3.1. Ten independent sample values $x_i$, $i = 1, 2, \ldots, 10$, are drawn at random from a Gaussian distribution with standard deviation $\sigma = 1$. The mean $\mu$ of the distribution is known to equal either zero or unity. The sample values are as follows:

    2.22  2.56  1.07  0.24  0.18  0.95  0.73  −0.79  2.09  1.81

Test the null hypothesis $H_0 : \mu = 0$ at the 10% significance level.

Solution. The restricted nature of the hypothesis space means that our null and alternative hypotheses are $H_0 : \mu = 0$ and $H_1 : \mu = 1$ respectively. Since $H_0$ and $H_1$ are both simple hypotheses, the best test statistic is given by the likelihood ratio (3.3). Thus, denoting the means by $\mu_0$ and $\mu_1$, we have
\[
t(x) = \frac{\exp\left[-\tfrac{1}{2}\sum_i (x_i - \mu_0)^2\right]}{\exp\left[-\tfrac{1}{2}\sum_i (x_i - \mu_1)^2\right]}
     = \frac{\exp\left[-\tfrac{1}{2}\sum_i (x_i^2 - 2\mu_0 x_i + \mu_0^2)\right]}{\exp\left[-\tfrac{1}{2}\sum_i (x_i^2 - 2\mu_1 x_i + \mu_1^2)\right]}
     = \exp\left[(\mu_0 - \mu_1)\sum_i x_i - \tfrac{1}{2}N(\mu_0^2 - \mu_1^2)\right].
\]
Inserting the values $\mu_0 = 0$ and $\mu_1 = 1$ yields $t = \exp(-N\bar{x} + \tfrac{1}{2}N)$, where $\bar{x}$ is the sample mean. Since $-\ln t$ is a monotonically decreasing function of $t$, however, we may equivalently use as our test statistic
\[
v = -\frac{1}{N}\ln t + \frac{1}{2} = \bar{x},
\]
where we have divided by the sample size $N$ and added $\tfrac{1}{2}$ for convenience. Thus we may take the sample mean as our test statistic. We know that the sampling distribution of the sample mean under our null hypothesis $H_0$ is the Gaussian distribution $N(\mu_0, \sigma^2/N)$, where $\mu_0 = 0$, $\sigma^2 = 1$ and $N = 10$. Thus $\bar{x} \sim N(0, 0.1)$.
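
The reduction of the likelihood ratio to the sample mean can be verified on the data of Example 3.1. The sketch below is an illustrative check in Python rather than part of the original solution; the helper `log_likelihood` and all variable names are assumptions introduced here.

```python
import math

# Sample values from Example 3.1.
sample = [2.22, 2.56, 1.07, 0.24, 0.18, 0.95, 0.73, -0.79, 2.09, 1.81]
mu0, mu1, sigma = 0.0, 1.0, 1.0
n = len(sample)
xbar = sum(sample) / n

def log_likelihood(xs, mu, sigma):
    """Log of the joint Gaussian density of the independent values xs."""
    norm = -len(xs) * math.log(sigma * math.sqrt(2.0 * math.pi))
    return norm - sum((x - mu) ** 2 for x in xs) / (2.0 * sigma ** 2)

# Likelihood ratio t = P(x|H0) / P(x|H1), evaluated directly from the densities.
t_direct = math.exp(log_likelihood(sample, mu0, sigma)
                    - log_likelihood(sample, mu1, sigma))

# Closed form for mu0 = 0, mu1 = 1, sigma = 1: t = exp(-N*xbar + N/2).
t_closed = math.exp(-n * xbar + 0.5 * n)

# Equivalent statistic v = -(1/N) ln t + 1/2, which should equal the sample mean.
v = -math.log(t_direct) / n + 0.5

# Variance of the sample mean under H0, sigma^2 / N.
var_xbar = sigma ** 2 / n

print(f"xbar = {xbar:.3f}, t = {t_direct:.3e}, v = {v:.3f}, var = {var_xbar}")
```

For these data the sample mean is approximately 1.106, and the directly evaluated ratio agrees with the closed form to within rounding error.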
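Since $\bar{x}$ is a monotonically decreasing function of $t$, our best rejection region for a given significance level $\alpha$ is $\bar{x} > \bar{x}_{\rm crit}$, where $\bar{x}_{\rm crit}$ depends on $\alpha$. The standard deviation of $\bar{x}$ under $H_0$ is $\sigma/\sqrt{N}$, so, in our case, $\bar{x}_{\rm crit}$ is given by
\[
\alpha = 1 - \Phi\left(\frac{\bar{x}_{\rm crit} - \mu_0}{\sigma/\sqrt{N}}\right) = 1 - \Phi(\sqrt{10}\,\bar{x}_{\rm crit}),
\]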