
Estimation


where $s_x$ and $s_y$ are the sample standard deviations of the $x_i$ and $y_i$ respectively and $r_{xy}$ is the
sample correlation. In the special case when the parent population $P(x, y)$ is Gaussian, it may be
shown that, if $\rho = \mathrm{Corr}[x, y]$,

$$E[r_{xy}] = \rho - \frac{\rho(1 - \rho^2)}{2N} + O(N^{-2}), \tag{2.12}$$

$$\mathrm{Var}(r_{xy}) = \frac{1}{N}(1 - \rho^2)^2 + O(N^{-2}), \tag{2.13}$$
from which the expectation value and variance of the estimator $\widehat{\mathrm{Corr}}[x, y]$ may be found
immediately.
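As a rough illustration of (2.12) and (2.13), the following Python sketch evaluates the leading-order expectation and variance of $r_{xy}$ with the $O(N^{-2})$ terms dropped. The function names `expected_r` and `var_r` and the values $\rho = 0.5$, $N = 10$ are illustrative assumptions, not part of the text.

```python
def expected_r(rho: float, n: int) -> float:
    """Leading-order E[r_xy] from (2.12); the O(N^-2) term is neglected."""
    return rho - rho * (1.0 - rho**2) / (2.0 * n)


def var_r(rho: float, n: int) -> float:
    """Leading-order Var(r_xy) from (2.13); the O(N^-2) term is neglected."""
    return (1.0 - rho**2) ** 2 / n


# Illustrative (assumed) values for a Gaussian parent population
rho, n = 0.5, 10
bias = expected_r(rho, n) - rho  # leading-order bias of r_xy
variance = var_r(rho, n)
print(f"bias = {bias:.5f}, variance = {variance:.5f}")
```

For these assumed values the leading-order bias is about $-0.019$ and the variance about $0.056$, so at small $N$ the sample correlation tends to slightly underestimate a positive $\rho$.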
                   We conclude our discussion of basic estimators by reconsidering the set of experimental data.



Example 2.4. Ten Ukrainian citizens are selected at random and their heights and
weights are found to be as follows (to the nearest cm or kg respectively):
    Person         A     B     C     D     E     F     G     H     I     J
    Height (cm)   194   168   177   180   171   190   151   169   175   182
    Weight (kg)    75    53    72    80    75    75    57    67    46    68
Estimate the means, $\mu_x$ and $\mu_y$, and standard deviations, $\sigma_x$ and $\sigma_y$, of the
two-dimensional joint population from which the sample was drawn, quoting the standard
error on the estimate in each case. Estimate also the correlation $\mathrm{Corr}[x, y]$ of the
population, and quote the standard error on the estimate under the assumption that
the population is a multivariate Gaussian.


Solution. Above we computed various sample statistics for these data. In particular, we found
that for our sample of size $N = 10$, $\bar{x} = 175.7$, $\bar{y} = 66.8$, $s_x = 11.6$, $s_y = 10.6$, $r_{xy} = 0.54$.
Let us begin by estimating the means $\mu_x$ and $\mu_y$. The sample mean is an unbiased, consistent
estimator of the population mean. Moreover, the standard error on $\bar{x}$ (say) is $\sigma_x/\sqrt{N}$. In
this case, however, we do not know the true value of $\sigma_x$ and we must estimate it using
$\hat{\sigma}_x = \sqrt{N/(N-1)}\, s_x$. Thus, our estimates of $\mu_x$ and $\mu_y$, with associated standard errors, are

$$\hat{\mu}_x = \bar{x} \pm \frac{s_x}{\sqrt{N-1}} = 175.7 \pm 3.9,$$

$$\hat{\mu}_y = \bar{y} \pm \frac{s_y}{\sqrt{N-1}} = 66.8 \pm 3.5.$$
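The sample statistics and the standard errors on the means can be checked numerically. The sketch below uses only the Python standard library and assumes that $s_x$ and $s_y$ are defined with divisor $N$, which is what the relation $\hat{\sigma}_x = \sqrt{N/(N-1)}\, s_x$ implies; the variable names are illustrative.

```python
import math
import statistics

# Data from Example 2.4
heights = [194, 168, 177, 180, 171, 190, 151, 169, 175, 182]  # cm
weights = [75, 53, 72, 80, 75, 75, 57, 67, 46, 68]            # kg
n = len(heights)

# Sample means
x_bar = statistics.fmean(heights)
y_bar = statistics.fmean(weights)

# Sample standard deviations with divisor N (assumed convention for s_x, s_y)
s_x = statistics.pstdev(heights)
s_y = statistics.pstdev(weights)

# Sample correlation: mean product of deviations divided by s_x * s_y
cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights)) / n
r_xy = cov_xy / (s_x * s_y)

# Standard errors on the means: sigma_hat / sqrt(N) = s / sqrt(N - 1)
se_x = s_x / math.sqrt(n - 1)
se_y = s_y / math.sqrt(n - 1)

print(f"mu_x = {x_bar:.1f} +/- {se_x:.1f}")
print(f"mu_y = {y_bar:.1f} +/- {se_y:.1f}")
print(f"s_x = {s_x:.1f}, s_y = {s_y:.1f}, r_xy = {r_xy:.2f}")
```

Rounded to the precision used in the text, this reproduces the quoted values of $\bar{x}$, $\bar{y}$, $s_x$, $s_y$, $r_{xy}$ and the two standard errors.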
We now turn to estimating $\sigma_x$ and $\sigma_y$. As just mentioned, our estimate of $\sigma_x$ (say) is
$\hat{\sigma}_x = \sqrt{N/(N-1)}\, s_x$. Its variance is given approximately by

$$\mathrm{Var}(\hat{\sigma}) \approx \frac{1}{4N\nu_2}\left(\nu_4 - \frac{N-3}{N-1}\,\nu_2^2\right).$$
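A minimal Python sketch of this approximation follows. The function name `var_sigma_hat` and the test values are assumptions made for illustration. As a sanity check, for a Gaussian population $\nu_4 = 3\nu_2^2$, and the expression then reduces algebraically to $\nu_2/[2(N-1)]$.

```python
def var_sigma_hat(nu2: float, nu4: float, n: int) -> float:
    """Approximate Var(sigma_hat) from the central moments nu2, nu4 and sample size n."""
    return (nu4 - (n - 3) / (n - 1) * nu2**2) / (4.0 * n * nu2)


# Gaussian check with assumed illustrative values: nu4 = 3 * nu2^2
nu2, n = 4.0, 10
gaussian_value = var_sigma_hat(nu2, 3.0 * nu2**2, n)
closed_form = nu2 / (2.0 * (n - 1))  # expected reduction for a Gaussian
print(gaussian_value, closed_form)
```

The two printed numbers should agree to rounding error, which gives some confidence that the formula has been transcribed correctly.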
Since we do not know the true values of the population central moments $\nu_2$ and $\nu_4$, we must
use their estimated values in this expression. We may take $\hat{\nu}_2 = \widehat{\sigma_x^2} = (\hat{\sigma}_x)^2$, which we have
already computed. It still remains, however, to estimate $\nu_4$. It is acceptable to take $\hat{\nu}_4 = n_4$.
Thus for the $x_i$ and $y_i$ values, we have
$$(\hat{\nu}_4)_x = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^4 = 53411.6$$
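The fourth-moment estimate for the heights can be verified with a short Python sketch. The helper name `fourth_central_moment` is an assumption for illustration; it simply averages the fourth powers of the deviations from the sample mean, with divisor $N$.

```python
def fourth_central_moment(values: list[float]) -> float:
    """Return (1/N) * sum_i (v_i - v_bar)^4, the fourth central moment of the sample."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 4 for v in values) / n


# Heights from Example 2.4, in cm
heights = [194, 168, 177, 180, 171, 190, 151, 169, 175, 182]
n4_x = fourth_central_moment(heights)
print(round(n4_x, 1))
```

To one decimal place this gives 53411.6, in agreement with the value above. The same helper can be applied to the weights.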