The average code-word length for a source \(S\) with source symbols \(s_k\) can be calculated using the following formula:
$$ \bar{L} = \sum^{K-1}_{k=0} p_k l_k $$
Where:
- \(\bar{L}\) is the average number of bits per source symbol used in source encoding
- \(K\) is the size of the source alphabet \(S\)
- \(k\) indexes the symbols, \(k \in \{0, 1, \ldots, K - 1\}\)
- \(p_k\) is the probability of symbol \(s_k\)
- \(l_k\) is the length of the binary code word assigned to symbol \(s_k\)
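As a quick numeric sketch of the formula above, the snippet below computes \(\bar{L}\) for a small illustrative alphabet; the probabilities and code-word lengths are assumptions chosen for the example, not values from any particular source.

```python
# Average code-word length: L_bar = sum over k of p_k * l_k
# Illustrative values for a 4-symbol alphabet s_0 .. s_3:
probs = [0.5, 0.25, 0.125, 0.125]   # p_k, must sum to 1
lengths = [1, 2, 3, 3]              # l_k, bits per code word

avg_length = sum(p * l for p, l in zip(probs, lengths))
print(avg_length)  # 1.75 bits per source symbol
```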
From here, we can determine the coding efficiency of the source encoder as follows:
$$ \eta = \frac{L_{min}}{\bar{L}} \le 1 $$
Where:
- \(L_{min}\) is the minimum possible value of \(\bar{L}\).
The source coding is said to be efficient when \(\eta \rightarrow 1\). If no information is lost during source encoding, the output (source code) is said to be lossless.
If the information source is a Discrete Memoryless Source (DMS), then \(\bar{L}\) is bounded below by the source's entropy \(H(\phi)\), as stated by Shannon's first theorem (the source-coding theorem). We can therefore conclude that \(L_{min}\) is \(H(\phi)\):
$$ \begin{align} \bar{L} &\ge H(\phi)\\ L_{min} &= H(\phi)\\ \eta &= \frac{H(\phi)}{\bar{L}} \end{align} $$
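The derivation above can be checked numerically. The sketch below computes \(H(\phi)\), \(\bar{L}\), and \(\eta\) for an assumed DMS with dyadic probabilities (powers of 1/2), for which a prefix code with \(l_k = \log_2(1/p_k)\) achieves \(\eta = 1\); the probabilities and lengths are illustrative, not from the source.

```python
import math

probs = [0.5, 0.25, 0.125, 0.125]   # assumed DMS symbol probabilities
lengths = [1, 2, 3, 3]              # matching prefix-code word lengths

entropy = -sum(p * math.log2(p) for p in probs)          # H(phi)
avg_length = sum(p * l for p, l in zip(probs, lengths))  # L_bar
eta = entropy / avg_length                               # efficiency
print(entropy, avg_length, eta)  # 1.75 1.75 1.0
```

Because each \(l_k\) here equals \(\log_2(1/p_k)\) exactly, \(\bar{L}\) meets the entropy bound and the code is optimal; for non-dyadic probabilities, \(\bar{L}\) would exceed \(H(\phi)\) and \(\eta < 1\).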