Mutual Information

Since #Entropy represents the uncertainty about the channel input before observing the channel output, and #Conditional Entropy represents the uncertainty about the channel input that remains after observing the channel output, the uncertainty resolved by the observation is their difference:

$$ I(A_X;A_Y) = H(A_X) - H(A_X|A_Y) $$

This difference defines the Mutual Information: the uncertainty about the channel input that is resolved by observing the channel output.
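
As a rough numerical sketch of this definition (the joint distribution \(p(x, y)\) below is a made-up example, not taken from the text), using the chain rule \(H(A_X|A_Y) = H(A_X,A_Y) - H(A_Y)\):

```python
import math

# Hypothetical joint distribution p(x, y) over channel input x and output y.
# The numbers are illustrative only.
p_xy = {
    (0, 0): 0.40, (0, 1): 0.10,
    (1, 0): 0.05, (1, 1): 0.45,
}

def entropy(dist):
    """Shannon entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Marginal distributions p(x) and p(y).
p_x, p_y = {}, {}
for (x, y), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

h_x = entropy(p_x)                          # H(A_X): uncertainty before observing the output
h_x_given_y = entropy(p_xy) - entropy(p_y)  # H(A_X|A_Y) via the chain rule H(X,Y) = H(Y) + H(X|Y)
mi = h_x - h_x_given_y                      # I(A_X;A_Y) = H(A_X) - H(A_X|A_Y)

print(f"H(X) = {h_x:.3f} bits, H(X|Y) = {h_x_given_y:.3f} bits, I(X;Y) = {mi:.3f} bits")
```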

It has several properties:

  • the Mutual Information of a channel is symmetric, that is \(I(A_X;A_Y) = I(A_Y;A_X)\)
  • it is always non-negative (\(I(A_X;A_Y) \ge 0\)): observing the channel output can never increase the uncertainty about the input
  • it can be written in terms of the Joint Entropy: \(I(A_X;A_Y) = H(A_X) + H(A_Y) - H(A_X,A_Y)\)

Note: \(I(A_X;A_Y) = 0\) if and only if the input and output symbols of the channel are statistically independent.
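
A minimal sketch that checks these properties numerically, reusing the same made-up joint distribution (none of the numbers come from the text):

```python
import math

def entropy(dist):
    """Shannon entropy (bits) of a pmf given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginals(p_xy):
    """Marginal pmfs p(x) and p(y) of a joint pmf keyed by (x, y)."""
    p_x, p_y = {}, {}
    for (x, y), p in p_xy.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return p_x, p_y

def mutual_information(p_xy):
    """I(X;Y) via the Joint Entropy identity H(X) + H(Y) - H(X,Y)."""
    p_x, p_y = marginals(p_xy)
    return entropy(p_x) + entropy(p_y) - entropy(p_xy)

p_xy = {(0, 0): 0.40, (0, 1): 0.10, (1, 0): 0.05, (1, 1): 0.45}  # illustrative
p_x, p_y = marginals(p_xy)

# Symmetry: swapping the roles of input and output leaves I unchanged.
p_yx = {(y, x): p for (x, y), p in p_xy.items()}
assert abs(mutual_information(p_xy) - mutual_information(p_yx)) < 1e-9

# Non-negativity.
assert mutual_information(p_xy) >= 0

# The Joint Entropy identity agrees with the definition H(X) - H(X|Y).
h_x_given_y = entropy(p_xy) - entropy(p_y)
assert abs(mutual_information(p_xy) - (entropy(p_x) - h_x_given_y)) < 1e-9

# Independence: if p(x,y) = p(x) p(y), the mutual information is zero.
p_indep = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}
assert abs(mutual_information(p_indep)) < 1e-9
```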

The Mutual Information depends not only on the channel itself, but also on the way in which the channel is used, that is, on the input distribution \(p(x_j)\).
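
To see this dependence, here is a sketch that fixes a hypothetical binary symmetric channel \(p(y|x)\) with crossover probability 0.1 (an illustrative choice, not from the text) and evaluates \(I(A_X;A_Y)\) for two different input distributions:

```python
import math

def entropy(dist):
    """Shannon entropy (bits) of a pmf given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def mutual_information(p_x, channel):
    """I(X;Y) for a fixed channel p(y|x) and a chosen input distribution p(x)."""
    p_xy = {(x, y): p_x[x] * channel[x][y] for x in p_x for y in channel[x]}
    p_y = {}
    for (x, y), p in p_xy.items():
        p_y[y] = p_y.get(y, 0.0) + p
    return entropy(p_x) + entropy(p_y) - entropy(p_xy)

# Hypothetical binary symmetric channel: each input bit is flipped with probability 0.1.
bsc = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}

# The same channel used with two different input distributions p(x_j)
# yields different amounts of mutual information.
for p0 in (0.5, 0.9):
    p_x = {0: p0, 1: 1 - p0}
    print(f"p(x=0) = {p0:.1f}  ->  I(X;Y) = {mutual_information(p_x, bsc):.3f} bits")
```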

Links to this page
#math