Estimation
where $s_x$ and $s_y$ are the sample standard deviations of the $x_i$ and $y_i$ respectively and $r_{xy}$ is the sample correlation. In the special case when the parent population $P(x, y)$ is Gaussian, it may be shown that, if $\rho = \mathrm{Corr}[x, y]$,
$$E[r_{xy}] = \rho - \frac{\rho(1 - \rho^2)}{2N} + O(N^{-2}), \tag{2.12}$$
$$\mathrm{Var}(r_{xy}) = \frac{1}{N}(1 - \rho^2)^2 + O(N^{-2}), \tag{2.13}$$
from which the expectation value and variance of the estimator $\widehat{\mathrm{Corr}}[x, y]$ may be found immediately.
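As a minimal sketch, the leading-order terms of (2.12) and (2.13) can be evaluated directly. The function name `r_xy_moments` and the trial values of `rho` and `n` are illustrative assumptions, not part of the text, and the $O(N^{-2})$ corrections are simply dropped.

```python
def r_xy_moments(rho, n):
    """Leading-order mean and variance of the sample correlation r_xy.

    Implements eqs (2.12) and (2.13) with the O(N^-2) terms neglected;
    valid only under the assumption of a Gaussian parent population.
    """
    mean = rho - rho * (1.0 - rho**2) / (2.0 * n)
    var = (1.0 - rho**2) ** 2 / n
    return mean, var


# Hypothetical trial values, chosen only to illustrate the call.
mean_r, var_r = r_xy_moments(rho=0.5, n=100)
print(mean_r, var_r)
```

For $\rho \neq 0$ the bias term pulls $E[r_{xy}]$ towards zero, and it vanishes as $N$ grows.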
We conclude our discussion of basic estimators by reconsidering the set of experimental data.
Example 2.4. 10 Ukrainian citizens are selected at random and their heights and
weights are found to be as follows (to the nearest cm or kg respectively):
Person        A    B    C    D    E    F    G    H    I    J
Height (cm)  194  168  177  180  171  190  151  169  175  182
Weight (kg)   75   53   72   80   75   75   57   67   46   68
Estimate the means, $\mu_x$ and $\mu_y$, and standard deviations, $\sigma_x$ and $\sigma_y$, of the two-dimensional joint population from which the sample was drawn, quoting the standard error on the estimate in each case. Estimate also the correlation $\mathrm{Corr}[x, y]$ of the population, and quote the standard error on the estimate under the assumption that the population is a multivariate Gaussian.
Solution. Above we computed various sample statistics for these data. In particular, we found that for our sample of size $N = 10$, $\bar{x} = 175.7$, $\bar{y} = 66.8$, $s_x = 11.6$, $s_y = 10.6$, $r_{xy} = 0.54$.
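The quoted sample statistics can be reproduced from the table with a short Python sketch. It assumes, consistently with the values quoted, that $s_x$ and $s_y$ use the $1/N$ normalisation; the function name `sample_stats` is illustrative.

```python
import math

heights = [194, 168, 177, 180, 171, 190, 151, 169, 175, 182]  # cm
weights = [75, 53, 72, 80, 75, 75, 57, 67, 46, 68]            # kg


def sample_stats(x, y):
    """Return (xbar, ybar, s_x, s_y, r_xy) using 1/N normalisation."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    s_x = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / n)
    s_y = math.sqrt(sum((yi - ybar) ** 2 for yi in y) / n)
    # Sample covariance, also with 1/N normalisation.
    v_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / n
    return xbar, ybar, s_x, s_y, v_xy / (s_x * s_y)


xbar, ybar, s_x, s_y, r_xy = sample_stats(heights, weights)
print(f"{xbar:.1f} {ybar:.1f} {s_x:.1f} {s_y:.1f} {r_xy:.2f}")
# prints: 175.7 66.8 11.6 10.6 0.54
```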
Let us begin by estimating the means $\mu_x$ and $\mu_y$. The sample mean is an unbiased, consistent estimator of the population mean. Moreover, the standard error on $\bar{x}$ (say) is $\sigma_x/\sqrt{N}$. In this case, however, we do not know the true value of $\sigma_x$ and we must estimate it using $\hat{\sigma}_x = \sqrt{N/(N-1)}\, s_x$. Thus, our estimates of $\mu_x$ and $\mu_y$, with associated standard errors, are
$$\hat{\mu}_x = \bar{x} \pm \frac{s_x}{\sqrt{N-1}} = 175.7 \pm 3.9,$$
$$\hat{\mu}_y = \bar{y} \pm \frac{s_y}{\sqrt{N-1}} = 66.8 \pm 3.5.$$
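As a rough numerical check of these two results, the sketch below recomputes each mean and its standard error, $s/\sqrt{N-1}$, from the raw data. The helper name `mean_with_standard_error` is an illustrative assumption.

```python
import math

heights = [194, 168, 177, 180, 171, 190, 151, 169, 175, 182]  # cm
weights = [75, 53, 72, 80, 75, 75, 57, 67, 46, 68]            # kg


def mean_with_standard_error(values):
    """Return the sample mean and its estimated standard error.

    The standard error is s / sqrt(N - 1), where s is the sample standard
    deviation with 1/N normalisation; this equals sigma_hat / sqrt(N).
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return mean, s / math.sqrt(n - 1)


mu_x, err_x = mean_with_standard_error(heights)
mu_y, err_y = mean_with_standard_error(weights)
print(f"mu_x = {mu_x:.1f} +/- {err_x:.1f}")  # mu_x = 175.7 +/- 3.9
print(f"mu_y = {mu_y:.1f} +/- {err_y:.1f}")  # mu_y = 66.8 +/- 3.5
```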
We now turn to estimating $\sigma_x$ and $\sigma_y$. As just mentioned, our estimate of $\sigma_x$ (say) is $\hat{\sigma}_x = \sqrt{N/(N-1)}\, s_x$. Its variance is given approximately by
$$\mathrm{Var}(\hat{\sigma}) \approx \frac{1}{4N\nu_2}\left(\nu_4 - \frac{N-3}{N-1}\,\nu_2^2\right).$$
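The approximate variance above translates into a one-line function. The sketch below is illustrative only: the name `var_sigma_hat` is an assumption, and the sanity check uses the standard Gaussian relation $\nu_4 = 3\nu_2^2$, which is not taken from this section. In that case the expression reduces to $\nu_2/[2(N-1)]$.

```python
def var_sigma_hat(nu2, nu4, n):
    """Approximate variance of sigma_hat from the central moments nu2, nu4."""
    return (nu4 - (n - 3) / (n - 1) * nu2**2) / (4 * n * nu2)


# Sanity check with hypothetical values: a Gaussian population with
# nu2 = 4 has nu4 = 3 * nu2**2 = 48.
nu2, n = 4.0, 10
gaussian_case = var_sigma_hat(nu2, 3 * nu2**2, n)
print(gaussian_case, nu2 / (2 * (n - 1)))  # the two values should agree
```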
Since we do not know the true values of the population central moments $\nu_2$ and $\nu_4$, we must use their estimated values in this expression. We may take $\hat{\nu}_2 = \widehat{\sigma^2} = (\hat{\sigma})^2$, which we have already computed. It still remains, however, to estimate $\nu_4$. It is acceptable to take $\hat{\nu}_4 = n_4$, the fourth central moment of the sample.
Thus for the $x_i$ and $y_i$ values, we have
$$(\hat{\nu}_4)_x = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^4 = 53411.6$$
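The fourth central moment of the height sample can be checked with a short sketch. The helper name `central_moment` is an illustrative assumption; the same call with the weight data gives the corresponding estimate for the $y_i$.

```python
heights = [194, 168, 177, 180, 171, 190, 151, 169, 175, 182]  # cm
weights = [75, 53, 72, 80, 75, 75, 57, 67, 46, 68]            # kg


def central_moment(values, order):
    """Sample central moment of the given order, with 1/N normalisation."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


nu4_x = central_moment(heights, 4)
nu4_y = central_moment(weights, 4)
print(f"{nu4_x:.1f}")  # 53411.6
```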