Moments and central moments
By analogy with our discussion of probability distributions, the sample mean and variance may also be described respectively as the first moment and second central moment of the sample. In general, for a sample $x_i$, $i = 1, 2, \ldots, N$, we define the $r$-th moment $m_r$ and $r$-th central moment $n_r$ as
$$ m_r = \frac{1}{N}\sum_{i=1}^{N} x_i^r, \tag{1.9} $$
$$ n_r = \frac{1}{N}\sum_{i=1}^{N} (x_i - m_1)^r. \tag{1.10} $$
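As a concrete illustration of equations (1.9) and (1.10), here is a minimal Python sketch (assuming NumPy is available; the helper names sample_moment and sample_central_moment are ours, not taken from the text):

```python
import numpy as np

def sample_moment(x, r):
    """r-th sample moment m_r = (1/N) * sum(x_i**r), eq. (1.9)."""
    x = np.asarray(x, dtype=float)
    return np.mean(x**r)

def sample_central_moment(x, r):
    """r-th sample central moment n_r = (1/N) * sum((x_i - m_1)**r), eq. (1.10)."""
    x = np.asarray(x, dtype=float)
    m1 = np.mean(x)                 # the first moment m_1 is the sample mean
    return np.mean((x - m1)**r)

# The second central moment n_2 equals the sample variance s^2:
x = np.array([1.0, 2.0, 2.0, 3.0, 5.0])
print(sample_central_moment(x, 2))  # n_2
print(np.var(x))                    # NumPy's default 1/N variance agrees
```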
Thus the sample mean $\bar{x}$ and variance $s^2$ may also be written as $m_1$ and $n_2$ respectively. As is common practice, we have introduced a notation in which a sample statistic is denoted by the Roman letter corresponding to whichever Greek letter is used to describe the corresponding population statistic. Thus we use $m_r$ and $n_r$ to denote the $r$-th moment and central moment of a sample; the $r$-th moment and central moment of a population were denoted by $\mu_r$ and $\nu_r$ respectively. This notation is particularly useful, since the $r$-th central moment of a sample, $n_r$, may be expressed in terms of the $r$-th and lower-order sample moments $m_r$ in a way exactly analogous to that derived for the corresponding population statistics. For example, as discussed in the previous section, the sample variance is given by $s^2 = \overline{x^2} - \bar{x}^2$, but this may also be written as $n_2 = m_2 - m_1^2$, which is to be compared with the corresponding relation $\nu_2 = \mu_2 - \mu_1^2$ derived for population statistics. This correspondence also holds for higher-order central moments of the sample. For example,
$$
\begin{aligned}
n_3 &= \frac{1}{N}\sum_{i=1}^{N} (x_i - m_1)^3
     = \frac{1}{N}\sum_{i=1}^{N} \left( x_i^3 - 3 m_1 x_i^2 + 3 m_1^2 x_i - m_1^3 \right) \\
    &= m_3 - 3 m_1 m_2 + 3 m_1^2\, m_1 - m_1^3
     = m_3 - 3 m_1 m_2 + 2 m_1^3,
\end{aligned}
\tag{1.11}
$$
which may be compared with equation (5.12) in the previous chapter.
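The identity (1.11) is easy to check numerically; the short sketch below (our own illustration, not from the text) compares the direct computation of $n_3$ with $m_3 - 3 m_1 m_2 + 2 m_1^3$:

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0, 3.0, 5.0])

m1, m2, m3 = np.mean(x), np.mean(x**2), np.mean(x**3)   # sample moments m_1, m_2, m_3

n2_direct = np.mean((x - m1)**2)    # n_2 computed from its definition
n3_direct = np.mean((x - m1)**3)    # n_3 computed from its definition

print(np.isclose(n2_direct, m2 - m1**2))                # n_2 = m_2 - m_1^2
print(np.isclose(n3_direct, m3 - 3*m1*m2 + 2*m1**3))    # eq. (1.11)
```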
We may also describe a sample in terms of the dimensionless quantities
$$ g_k = \frac{n_k}{n_2^{k/2}} = \frac{n_k}{s^k}; $$
$g_3$ and $g_4$ are called the sample skewness and kurtosis. Likewise, it is common to define the excess kurtosis of a sample by $g_4 - 3$.
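A minimal sketch of the sample skewness and kurtosis as defined above (the function name g is our own choice; note that library routines, such as those in scipy.stats, may apply bias corrections or report excess kurtosis by default, so their values can differ from these raw definitions):

```python
import numpy as np

def g(x, k):
    """Dimensionless sample quantity g_k = n_k / n_2**(k/2) = n_k / s**k."""
    x = np.asarray(x, dtype=float)
    n_k = np.mean((x - x.mean())**k)
    n_2 = np.mean((x - x.mean())**2)
    return n_k / n_2**(k / 2)

x = np.array([1.0, 2.0, 2.0, 3.0, 5.0])
print("skewness g_3    :", g(x, 3))
print("kurtosis g_4    :", g(x, 4))
print("excess kurtosis :", g(x, 4) - 3)
```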
Covariance and correlation
So far we have assumed that each data item of the sample consists of a single number. Now let us suppose that each item of data consists of a pair of numbers, so that the sample is given by $(x_i, y_i)$, $i = 1, 2, \ldots, N$.

We may compute the sample means, $\bar{x}$ and $\bar{y}$, and sample variances, $s_x^2$ and $s_y^2$, of the $x_i$ and $y_i$ values individually, but these statistics do not provide any measure of the relationship between the $x_i$ and $y_i$. By analogy, we measure any interdependence between the $x_i$ and $y_i$ in terms of the sample covariance, which is given by
$$ V_{xy} = \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y}) = \overline{(x - \bar{x})(y - \bar{y})} = \overline{xy} - \bar{x}\,\bar{y}. \tag{1.12} $$
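The sample covariance of equation (1.12) can be computed either from the defining sum or from $\overline{xy} - \bar{x}\,\bar{y}$; the sketch below (our own illustration) checks that both forms agree. Note that NumPy's np.cov divides by $N - 1$ by default, so bias=True is needed to match the $1/N$ convention used here.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 6.0])

# Defining sum: V_xy = (1/N) * sum((x_i - xbar) * (y_i - ybar)), eq. (1.12)
V_xy = np.mean((x - x.mean()) * (y - y.mean()))

# Equivalent shortcut: mean of the products minus the product of the means
V_xy_alt = np.mean(x * y) - x.mean() * y.mean()

print(np.isclose(V_xy, V_xy_alt))                        # True
print(np.isclose(V_xy, np.cov(x, y, bias=True)[0, 1]))   # True, with 1/N normalisation
```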