where $\sigma_X$ and $\sigma_Y$ are the standard deviations of $X$ and $Y$ respectively. It can be shown that the
correlation lies between $-1$ and $+1$. If its value is negative, $X$ and $Y$ are said
to be negatively correlated; if it is positive, they are said to be positively correlated; and if it is zero,
they are said to be uncorrelated. We will now justify the use of these terms.
One particularly useful consequence of its definition is that the covariance of two independent
variables, $X$ and $Y$, is zero. It immediately follows from (7.10) that their correlation is also zero,
and this justifies the use of the term 'uncorrelated' for two such variables. To show this extremely
important property we first note that
\[
\operatorname{Cov}[X, Y] = E[(X - \mu_X)(Y - \mu_Y)] = E[XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y]
= E[XY] - \mu_X E[Y] - \mu_Y E[X] + \mu_X \mu_Y = E[XY] - \mu_X \mu_Y. \tag{7.11}
\]
Now, if $X$ and $Y$ are independent then $E[XY] = E[X]E[Y] = \mu_X \mu_Y$ and so $\operatorname{Cov}[X, Y] = 0$. It is
important to note that the converse of this result is not necessarily true; two variables dependent
on each other can still be uncorrelated. In other words, it is possible (and not uncommon) for two
variables $X$ and $Y$ to be described by a joint distribution $f(x, y)$ that cannot be factorized into a
product of the form $g(x)h(y)$, but for which $\operatorname{Corr}[X, Y] = 0$. Indeed, from the definition (7.9), we
see that for any joint distribution $f(x, y)$ that is symmetric in $x$ about $\mu_X$ (or similarly in $y$) we have
$\operatorname{Corr}[X, Y] = 0$.
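As a concrete illustration (a minimal sketch, not from the text), take $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$: the joint distribution is symmetric in $x$ about $\mu_X = 0$, and $Y$ is completely determined by $X$, yet the two are uncorrelated.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1}; Y = X**2 is completely determined by X,
# so X and Y are certainly not independent.
outcomes = [-1, 0, 1]
p = Fraction(1, 3)

E_X  = sum(p * x    for x in outcomes)   # E[X]  = 0
E_Y  = sum(p * x**2 for x in outcomes)   # E[Y]  = E[X^2] = 2/3
E_XY = sum(p * x**3 for x in outcomes)   # E[XY] = E[X^3] = 0

cov = E_XY - E_X * E_Y                   # covariance via (7.11)
print(cov)                               # 0: uncorrelated, yet dependent
```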
We have already asserted that if the correlation of two random variables is positive (negative)
they are said to be positively (negatively) correlated. We have also stated that the correlation lies
between $-1$ and $+1$. The terminology suggests that if the two RVs are identical (i.e. $X = Y$) then
they are completely correlated and their correlation should be $+1$. Likewise, if $X = -Y$ then
they are completely anticorrelated and their correlation should be $-1$. Values of the
correlation between these extremes show the existence of some degree of correlation.
In fact it is not necessary that $X = Y$ for $\operatorname{Corr}[X, Y] = 1$; it is sufficient that $Y$ is a linear function
of $X$, i.e. $Y = aX + b$ (with $a$ positive). If $a$ is negative then $\operatorname{Corr}[X, Y] = -1$. To show this we
first note that $\mu_Y = a\mu_X + b$. Now
\[
Y = aX + b = aX + \mu_Y - a\mu_X \quad\Rightarrow\quad Y - \mu_Y = a(X - \mu_X),
\]
and so using the definition of the covariance (7.9)
\[
\operatorname{Cov}[X, Y] = aE[(X - \mu_X)^2] = a\sigma_X^2.
\]
It follows from the properties of the variance that $\sigma_Y = |a|\sigma_X$ and so, using the definition (7.10) of
the correlation,
\[
\operatorname{Corr}[X, Y] = \frac{a\sigma_X^2}{|a|\sigma_X^2} = \frac{a}{|a|},
\]
which is the stated result.
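This can be checked numerically; the following sketch (assuming NumPy is available; any $X$ with finite, non-zero variance would do) computes the sample correlation of $Y = aX + b$ for a positive and a negative value of $a$:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)           # samples of X

for a, b in [(2.0, 1.0), (-0.5, 3.0)]:
    y = a * x + b                      # Y = aX + b
    r = np.corrcoef(x, y)[0, 1]        # off-diagonal entry of the 2x2 correlation matrix
    print(a, round(r, 6))              # +1.0 when a > 0, -1.0 when a < 0
```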
It should be noted that, even if the possibilities of $X$ and $Y$ being non-zero are mutually
exclusive, $\operatorname{Corr}[X, Y]$ need not have the value $\pm 1$.
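To see this, consider a small made-up example (not from the text): $X$ and $Y$ each take the values 0 or 1, but are never 1 simultaneously.

```python
from fractions import Fraction
import math

# A joint distribution in which X and Y are never non-zero together,
# given as (x, y, probability) triples.
joint = [(1, 0, Fraction(1, 3)),
         (0, 1, Fraction(1, 3)),
         (0, 0, Fraction(1, 3))]

E_X  = sum(p * x     for x, y, p in joint)
E_Y  = sum(p * y     for x, y, p in joint)
E_XY = sum(p * x * y for x, y, p in joint)            # 0 by construction
var_X = sum(p * x * x for x, y, p in joint) - E_X**2  # 2/9
var_Y = sum(p * y * y for x, y, p in joint) - E_Y**2  # 2/9

corr = float(E_XY - E_X * E_Y) / math.sqrt(var_X * var_Y)
print(corr)   # -0.5: negatively correlated, but well away from -1
```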
Example 7.3. A biased die gives probabilities $\tfrac{1}{2}p, p, p, p, p, 2p$ of throwing 1, 2,
3, 4, 5, 6 respectively. If the random variable $X$ is the number shown on the die
and the random variable $Y$ is defined as $X^2$, compute the covariance and correlation
of $X$ and $Y$.

Solution. We have already computed that $p = \tfrac{2}{13}$, $E[X] = \tfrac{53}{13}$, $E[X^2] = \tfrac{253}{13}$, and $\operatorname{Var}[X] = \tfrac{480}{169}$.
Using (7.11),
\[
\operatorname{Cov}[X, Y] = \operatorname{Cov}[X, X^2] = E[X^3] - E[X]E[X^2].
\]
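As a quick exact check of this arithmetic (a sketch, not part of the text's worked solution), the moments of the biased die can be computed directly with Python's fractions module:

```python
from fractions import Fraction
import math

# Biased die of Example 7.3: P(1) = p/2, P(2) = ... = P(5) = p, P(6) = 2p.
p = Fraction(2, 13)
probs = {1: p / 2, 2: p, 3: p, 4: p, 5: p, 6: 2 * p}
assert sum(probs.values()) == 1

def moment(k):
    """E[X^k] for the biased die."""
    return sum(pr * x**k for x, pr in probs.items())

E_X, E_X2, E_X3, E_X4 = (moment(k) for k in (1, 2, 3, 4))

cov   = E_X3 - E_X * E_X2      # Cov[X, X^2] via (7.11)
var_X = E_X2 - E_X**2          # 480/169, as quoted above
var_Y = E_X4 - E_X2**2         # Var[Y] = Var[X^2]
corr  = float(cov) / math.sqrt(var_X * var_Y)

print(cov, round(corr, 4))     # 3660/169 and roughly 0.984
```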