The mutual information of two random variables, X and Y, is defined to be

I(X,Y) = Sum(Pr(x,y).log(Pr(x,y)/(Pr(x).Pr(y))))

Where Sum() is the sum over every pair of values (x,y) drawn from the alphabets of X and Y, and any term with Pr(x,y) = 0 is taken to contribute zero. Alternatively, the mutual information can be expressed more intuitively as

I(X,Y) = H(X) - H(X|Y)

Where H(X) is the entropy of X and H(X|Y) is the conditional entropy of X given Y. This can be interpreted as a measure of the reduction in the uncertainty of X when we know the value of Y - in other words, a measure of how much we learn about X by finding out the value of Y. The mutual information is symmetric, so we can equivalently write

I(X,Y) = H(Y) - H(Y|X)

which may be easier to evaluate in some cases.
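
As a rough illustration of the definitions above, the following Python sketch computes the mutual information directly from the sum and again as H(X) - H(X|Y), so the two can be compared. It assumes logarithms to base 2 (so the result is in bits), which the text above does not specify, and the function names and the example joint distribution are made up for the illustration.

```python
import math

def entropy(dist):
    """Entropy in bits of a probability distribution given as a list."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def mutual_information(joint):
    """I(X,Y) in bits, where joint[x][y] = Pr(x,y)."""
    px = [sum(row) for row in joint]          # marginal Pr(x)
    py = [sum(col) for col in zip(*joint)]    # marginal Pr(y)
    total = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:  # zero-probability terms contribute nothing
                total += pxy * math.log2(pxy / (px[i] * py[j]))
    return total

def conditional_entropy(joint):
    """H(X|Y) in bits: the average over y of the entropy of X given Y = y."""
    h = 0.0
    for col in zip(*joint):
        p_y = sum(col)
        if p_y > 0:
            h += p_y * entropy([p / p_y for p in col])
    return h

# Hypothetical example: X and Y agree 80% of the time.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
px = [sum(row) for row in joint]

direct = mutual_information(joint)
via_entropies = entropy(px) - conditional_entropy(joint)
```

Here direct and via_entropies should agree up to floating-point rounding, and a joint distribution in which X and Y are independent should give zero.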

The capacity of a noisy channel is defined to be the maximum value of the mutual information between the channel's input and output variables, where the maximum is taken over all probability distributions on the input.
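
To make the maximisation concrete, here is a small Python sketch for one particular channel. The binary symmetric channel (each input bit is flipped with probability f) is chosen purely as an example and is not discussed above; logarithms are again assumed to be base 2, and the function names are hypothetical. The sketch uses the form H(Y) - H(Y|X) and simply tries a grid of input distributions, which is a brute-force stand-in for a proper optimisation.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a binary variable that is 1 with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_information(q, f):
    """I(X,Y) = H(Y) - H(Y|X) for a binary symmetric channel.

    q is Pr(X = 1) and f is the probability that the channel flips a bit.
    """
    p_y1 = q * (1 - f) + (1 - q) * f   # Pr(Y = 1)
    # Whatever the input is, the output uncertainty given X is that of the flip.
    return binary_entropy(p_y1) - binary_entropy(f)

def bsc_capacity(f, steps=1000):
    """Approximate capacity by trying Pr(X = 1) = 0, 1/steps, ..., 1."""
    return max(bsc_mutual_information(k / steps, f) for k in range(steps + 1))

capacity = bsc_capacity(0.1)
```

For this channel the maximum is reached at the uniform input distribution, so the grid search should match 1 - binary_entropy(f), and a channel with f = 0.5 should have a capacity of zero.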
