After reading this post, I think you will find it easy to explain this picture from Wikipedia:

From Wikipedia,

In statistics, the Pearson product-moment correlation coefficient (sometimes referred to as the PMCC, and typically denoted by $r$) is a measure of the correlation (linear dependence) between two variables X and Y, giving a value between +1 and −1 inclusive.

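The definition of the Pearson coefficient is

$$\rho_{X,Y} = \frac{\mathrm{cov}(X,Y)}{\sigma_X \sigma_Y}$$

where $\mathrm{cov}(X,Y)$ is the covariance between X and Y, and $\sigma_X$ and $\sigma_Y$ are their standard deviations.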

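For your reference, the standard deviation is the square root of the variance, and the estimate of the variance from n samples, with $\bar{x}$ the sample mean, is:

$$\hat{\sigma}_X^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$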

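The estimate of the covariance between X and Y, with $\bar{x}$ and $\bar{y}$ the sample means, is

$$\widehat{\mathrm{cov}}(X,Y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$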

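The estimate of the Pearson coefficient is

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$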
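The estimates above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used for this post: the function names are made up here, and the $n-1$ divisor is an assumption (whichever divisor is used, it cancels in the ratio).

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance; the n - 1 divisor is an assumption of this sketch."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)

def variance(xs):
    # The variance is the covariance of a variable with itself.
    return covariance(xs, xs)

def pearson(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / math.sqrt(variance(xs) * variance(ys))

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # ys = 2 * xs, an exact linear relation
print(pearson(xs, ys))      # should be 1 up to floating-point rounding
```

Negating `ys` flips the sign of the covariance and therefore of the coefficient, while the denominator stays the same.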

The denominator is simply a non-negative normalization factor, and only the numerator (the covariance) measures the correlation between X and Y. So let us take a close look at the covariance.

Consider a sample of four bivariate points placed symmetrically around their mean. (We consider bivariate samples here only because they can be shown on a 2D screen.)

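It is not hard to verify that the estimate of the covariance from this sample is 0. Note that if a point is in the 1st or 3rd orthant (relative to the sample mean), then its term $(x_i - \bar{x})(y_i - \bar{y})$ is positive; if it is in the 2nd or 4th orthant, the term is negative. So generally, given a set of points distributed symmetrically around their mean, we have covariance $\mathrm{cov}(X,Y) = 0$ and thus $\rho_{X,Y} = 0$.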
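As a quick check with made-up numbers (these four points are chosen here for illustration and are not necessarily the ones in the original figure), take $(1,1)$, $(-1,1)$, $(-1,-1)$ and $(1,-1)$. Their mean is $(0,0)$, so

$$\sum_{i=1}^{4} (x_i - \bar{x})(y_i - \bar{y}) = (1)(1) + (-1)(1) + (-1)(-1) + (1)(-1) = 1 - 1 + 1 - 1 = 0$$

The two points in the 1st and 3rd orthants each contribute $+1$, the two in the 2nd and 4th orthants each contribute $-1$, and the covariance estimate is 0 whichever divisor is used.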

To make $\mathrm{cov}(X,Y)$ or $\rho_{X,Y}$ larger than 0, we need more points in the 1st and 3rd orthants, for example:

Similarly, to make $\mathrm{cov}(X,Y)$ or $\rho_{X,Y}$ less than 0, we need more points in the 2nd and 4th orthants. However large (or small) the value of $\mathrm{cov}(X,Y)$ is, the denominator of $\rho_{X,Y}$ normalizes the result to the range [-1,1].
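The effect of the orthants and of the normalization can be tried out numerically. The sketch below uses point sets invented for this illustration (not the ones plotted in this post) and a hypothetical helper `cov_and_r`; the last set stretches the x-axis by 1000 to show that the covariance grows while the coefficient does not.

```python
import math

def cov_and_r(points):
    """Return (sample covariance, Pearson r) for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    syy = sum((y - y_bar) ** 2 for _, y in points)
    # The n - 1 divisor is an assumption; it cancels in r.
    return sxy / (n - 1), sxy / math.sqrt(sxx * syy)

square = [(1, 1), (-1, 1), (-1, -1), (1, -1)]     # symmetric around (0, 0)
more_13 = square + [(2, 2), (-2, -2)]             # extra points in orthants 1 and 3
more_24 = square + [(-2, 2), (2, -2)]             # extra points in orthants 2 and 4
stretched = [(1000 * x, y) for x, y in more_13]   # same shape, x rescaled

for name, pts in [("square", square), ("more_13", more_13),
                  ("more_24", more_24), ("stretched", stretched)]:
    cov, r = cov_and_r(pts)
    print(f"{name:10s} cov = {cov:10.3f}   r = {r:6.3f}")
```

Rescaling X multiplies the covariance by the same factor as $\sigma_X$, which is why the ratio is unaffected.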

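The covariance matrix of X and Y is

$$\Sigma = \begin{bmatrix} \sigma_X^2 & \mathrm{cov}(X,Y) \\ \mathrm{cov}(X,Y) & \sigma_Y^2 \end{bmatrix}$$

which contains all the factors used to compute $\rho_{X,Y}$.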

An experiment is to construct a covariance matrix with an extremely large $\rho_{X,Y}$ (here 1) for given $\sigma_X$ and $\sigma_Y$:

Then we sample 1000 points from a Gaussian distribution $\mathcal{N}(\mathbf{0}, \Sigma)$ and plot the sampling result:

As you can see, all samples lie on the same line.
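The same experiment can be sketched without plotting. The function below is a stand-in written for this illustration, not the `sample_gaussian` used for the plots: it assumes zero mean and builds each point from two independent standard normal draws, which also works when $\rho_{X,Y}$ is exactly 1 or -1.

```python
import math
import random

def sample_bivariate_gaussian(sigma_x, sigma_y, rho, n, seed=0):
    """Draw n zero-mean points with std devs sigma_x, sigma_y and correlation rho."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = sigma_x * z1
        # Mix z1 and z2 so that corr(x, y) = rho and std(y) = sigma_y.
        y = sigma_y * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        points.append((x, y))
    return points

on_line = sample_bivariate_gaussian(1.0, 1.0, 1.0, 1000)
# Largest vertical distance from the line y = x; zero up to rounding.
print(max(abs(y - x) for x, y in on_line))
```

With `rho` set to 1 the second draw gets weight zero, so Y is a deterministic function of X; any value strictly between -1 and 1 lets `z2` spread the points away from the line.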

If we relax $\rho_{X,Y}$ to 0.9, the result is:

If we use a negative $\rho_{X,Y}$, say -0.9, then we have

and

All of the above plots have a mean slope of either 1 or -1. How can we get other slope values? The answer is to change the bounding box defined by $\sigma_X$ or $\sigma_Y$. For example:
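To make the slope concrete, here is a small worked case (added as an illustration, using the covariance matrix that appears in the MATLAB command at the end of this post). With

$$\Sigma = \begin{bmatrix} 2 & \sqrt{2} \\ \sqrt{2} & 1 \end{bmatrix}$$

we have $\sigma_X = \sqrt{2}$, $\sigma_Y = 1$ and $\mathrm{cov}(X,Y) = \sqrt{2}$, so

$$\rho_{X,Y} = \frac{\sqrt{2}}{\sqrt{2} \cdot 1} = 1$$

and all samples fall on the line through the mean with slope $\sigma_Y / \sigma_X = 1/\sqrt{2} \approx 0.71$.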

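Note that a covariance matrix must be positive semi-definite: for any non-zero vector z, we have $z^\top \Sigma z \ge 0$, where $\Sigma$ is the covariance matrix. In the bivariate case, with z=[a,b], the constraint is equivalent to

$$a^2 \sigma_X^2 + 2ab\,\mathrm{cov}(X,Y) + b^2 \sigma_Y^2 \ge 0$$

If $\mathrm{cov}(X,Y) = \pm\sigma_X \sigma_Y$, the left side of the above inequality can be written as a square, $(a\sigma_X \pm b\sigma_Y)^2$, and is thus $\ge 0$. If we want the inequality to hold for every choice of a and b, we need

$$|\mathrm{cov}(X,Y)| \le \sigma_X \sigma_Y$$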

This constrains the Pearson coefficient to the range [-1,1]. When $\rho_{X,Y}$ is 1 or -1, all sample points of X and Y are on a line, and we say X and Y are linearly dependent.
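One way to see where this bound comes from is to complete the square in the quadratic form. Writing $c = \mathrm{cov}(X,Y)$ and assuming $\sigma_X > 0$ (both are notational choices made for this sketch):

$$a^2 \sigma_X^2 + 2abc + b^2 \sigma_Y^2 = \left(a\sigma_X + \frac{bc}{\sigma_X}\right)^2 + b^2\left(\sigma_Y^2 - \frac{c^2}{\sigma_X^2}\right)$$

The first term is a square and can be made zero by choosing $a = -bc/\sigma_X^2$, so the whole expression is non-negative for all a and b exactly when the second bracket is non-negative:

$$\sigma_Y^2 - \frac{c^2}{\sigma_X^2} \ge 0 \iff c^2 \le \sigma_X^2 \sigma_Y^2 \iff |\rho_{X,Y}| \le 1$$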

Plots in this post were drawn using the following MATLAB command:

M = sample_gaussian([0,0], [2,sqrt(2);sqrt(2),1], 1000); cla; scatter(M(:,1), M(:,2)); axis tight

where the function `sample_gaussian` can be found in my previous post.