# Video lectures on information theory

Information theory concerns communication between transmitters and receivers. Transmitters aim to send information (data) to receivers at the greatest possible rate while keeping the probability of error sufficiently small. Information theory gives expressions for the greatest possible rate C (called the capacity) corresponding to a vanishing probability of error. These expressions are particularly useful for designing and optimizing communication networks.

The first and most famous example of such an expression is the capacity of the additive white Gaussian noise (AWGN) channel, due to Shannon (1948). Since Shannon’s paper, much work has been done to extend his results to a wide variety of channels and situations. In particular, the framework developed by Shannon makes it possible to study the case of finite alphabets, and the extension to general alphabets is usually done by heuristic discretization arguments. More recently, network information theory considers the case where there may be several receivers (broadcast channel) or several transmitters (multiple-access channel).

## I. Coding problem, entropy and mutual information

### Lecture 1: Coding problem (Information-Theory)

**PDF**, **Video** In the present lecture, we shall present the coding problem in the case of discrete alphabets. The general problem is called the source-channel coding problem, but it will be helpful to consider source coding and channel coding separately.

**Outline**:

Coding problem

Source coding

Channel coding

Source-channel coding

**Contents**: Source coding, channel coding, coding rates, Shannon’s theory of communication (1948).

### Lecture 2: Entropy

**PDF** We introduce Shannon’s entropy of a probability distribution on a discrete space (finite or countable) and study its basic properties. This notion is essential to prove Shannon’s coding theorems. In particular, Shannon’s source coding theorem allows one to interpret the entropy as the amount of information « carried » by random variables.

**Outline**:

Entropy

Conditional entropy

Fano’s inequality

Entropy rate

**Contents**: Shannon’s entropy, Gibbs inequality, Entropy of random vectors.
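As a quick numerical illustration (a minimal sketch, not part of the lecture materials), Shannon’s entropy H(X) = −Σ p(x) log₂ p(x) of a discrete distribution can be computed in a few lines:

```python
from math import log2

def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution p."""
    # Terms with probability 0 contribute 0 (by convention 0 log 0 = 0).
    return -sum(q * log2(q) for q in p if q > 0)

print(entropy([0.5, 0.5]))   # 1.0: a fair coin carries 1 bit per symbol
print(entropy([0.9, 0.1]))   # ~0.469: a biased coin is more predictable
print(entropy([0.25] * 4))   # 2.0: uniform over 4 symbols
```

Note that the entropy is maximized by the uniform distribution (Gibbs’ inequality, listed in the contents above).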

### Lecture 3: Mutual information

**PDF** We shall introduce mutual information between random variables on a discrete space (finite or countable) and study its basic properties. This notion is essential to prove the source and channel coding theorems.

**Outline**:

Kullback-Leibler divergence

Mutual information

Conditional mutual information

**Contents**: Mutual information versus Kullback-Leibler divergence, Mutual information versus entropy, Data processing inequality, Kolmogorov’s formula, Chain rule for mutual information, Csiszar sum identity.
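The identity I(X;Y) = H(X) + H(Y) − H(X,Y), one of the relations between mutual information and entropy covered above, can be checked numerically on a toy joint distribution (an illustrative sketch; the joint matrix below is a made-up example):

```python
from math import log2

def H(p):
    """Shannon entropy (bits) of a discrete distribution."""
    return -sum(x * log2(x) for x in p if x > 0)

# Hypothetical joint distribution of (X, Y), as a matrix joint[x][y].
joint = [[0.4, 0.1],
         [0.1, 0.4]]

px = [sum(row) for row in joint]          # marginal of X
py = [sum(col) for col in zip(*joint)]    # marginal of Y
pxy = [v for row in joint for v in row]   # flattened joint distribution

# I(X;Y) = H(X) + H(Y) - H(X,Y); it also equals D(p_XY || p_X p_Y).
I = H(px) + H(py) - H(pxy)
print(I)   # positive, since X and Y are dependent here
```

The same quantity computed as a Kullback-Leibler divergence between the joint and the product of the marginals would give an identical result.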

## II. Source coding

### Lecture 4: Source coding theorem

**PDF** Roughly speaking, the source coding theorem states that the entropy of a source equals the number of bits needed to encode a source symbol. In other words, we can code the source at a rate close to the entropy of a symbol with vanishing error probability, and no better coding rate is possible. We shall make this statement precise.

**Outline**:

Source coding capacity

Source coding theorem

Compression point of view

Typical sequences

Source coding theorem proof

Proof of achievability

Proof of converse
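The typical-sequence argument behind the achievability proof can be seen empirically (a small simulation sketch, not from the lecture): for an i.i.d. source, −(1/n) log₂ P(Xⁿ) concentrates around the entropy H, so about nH bits suffice to index the typical sequences.

```python
import random
from math import log2

random.seed(0)
p, n = 0.2, 10_000
# Entropy of a Bernoulli(p) source, ~0.722 bits/symbol for p = 0.2.
H = -(p * log2(p) + (1 - p) * log2(1 - p))

# Draw one long i.i.d. sequence and compute its normalized log-probability.
x = [1 if random.random() < p else 0 for _ in range(n)]
log_prob = sum(log2(p) if xi else log2(1 - p) for xi in x)

print(H, -log_prob / n)   # the two numbers should be close (AEP)
```

This is the asymptotic equipartition property: typical sequences all have probability close to 2^(−nH), and there are about 2^(nH) of them.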

### Lecture 5: Error-free source coding

**PDF** We prove in the present lecture a theorem which strengthens the achievability part of Shannon’s source coding theorem stated in the previous lecture. More specifically, we shall introduce some error-free source coding schemes and prove that in this class the entropy still indicates the infimum of achievable coding rates. The price to pay for error-free coding is variable codeword length. The main tool is Kraft’s code.

**Outline**:

Error-free source coding

Uniquely decipherable codes

Codes on trees and Kraft’s inequality

Error-free source coding theorem

**Contents**: Uniquely decipherable codes, Prefix-free codes, Codes on trees, Kraft’s inequality, Kraft’s code.
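Kraft’s inequality states that a prefix-free binary code with codeword lengths l₁, …, l_m exists if and only if Σ 2^(−lᵢ) ≤ 1. A quick check (an illustrative sketch, not course material):

```python
def kraft_sum(lengths, D=2):
    """Kraft sum of a set of codeword lengths over a D-ary code alphabet."""
    return sum(D ** -l for l in lengths)

# Sum <= 1: a prefix-free code exists, e.g. 0, 10, 110, 111.
print(kraft_sum([1, 2, 3, 3]))   # 1.0

# Sum > 1: no uniquely decipherable code with these lengths exists.
print(kraft_sum([1, 1, 2]))      # 1.25
```

The same inequality holds for all uniquely decipherable codes (McMillan), which is why prefix-free codes suffice in the error-free source coding theorem.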

### Lecture 6: Optimal source codes

**PDF** We shall construct codes based on the probability distribution of the source symbol:

- Huffman’s code which is optimal in the sense of minimizing the mean codeword length for a given source distribution.
- Shannon-Fano-Elias code, whose asymptotic average length (when coding source words whose length goes to infinity) is optimal.

**Outline**:

Huffman’s optimal code

Optimal code

Huffman’s code

Optimality of Huffman’s code

Shannon-Fano-Elias code

**Contents**: Huffman’s code, Shannon–Fano–Elias code.
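Huffman’s construction repeatedly merges the two least probable entries; a compact sketch using a heap (an illustrative implementation, not the course’s own code; the symbol distribution below is hypothetical):

```python
import heapq

def huffman(freqs):
    """Build a binary Huffman code: dict symbol -> codeword."""
    # Heap entries: (weight, tiebreak counter, partial code dict).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees, prefixing 0 and 1.
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"a": 0.45, "b": 0.25, "c": 0.15, "d": 0.15}
code = huffman(probs)
mean_len = sum(p * len(code[s]) for s, p in probs.items())
print(code)
print(mean_len)   # 1.85 bits/symbol, close to the entropy (~1.84 bits)
```

The mean codeword length always lies between H and H + 1, and Huffman’s code minimizes it among all prefix-free codes for the given distribution.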

### Lecture 7: Parsing-translation and Tunstall’s codes

**PDF**

In the previous lesson we showed that coding rates close to the entropy of the source can be achieved by grouping source symbols into fixed-length blocks (words), then coding these source words using codes (e.g. Kraft’s, Huffman’s) optimized for the block distribution.

In this lesson, we shall see that equally small coding rates can be achieved by an optimal (for the given source distribution) grouping of the source symbols into variable-length blocks, called parsing, followed by encoding these blocks with simple, non-optimized codes.

This is the idea of Tunstall’s code proposed in 1967.

**Outline**:

Parsing-translation code

Valid parsing book

Mean formulae for a valid parsing book

Tunstall’s code
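Tunstall’s construction grows the parsing book by repeatedly splitting the most probable word into its one-symbol extensions, until the target dictionary size is reached; the resulting variable-length source words are then encoded with fixed-length codewords. A minimal sketch (illustrative only; the two-symbol source below is a made-up example):

```python
import heapq

def tunstall(probs, num_words):
    """Tunstall parsing book: probs maps symbol -> probability,
    num_words is the target dictionary size (a power of the code
    alphabet size in practice)."""
    # Max-heap of leaves via negated probabilities: (-prob, word).
    heap = [(-p, s) for s, p in probs.items()]
    heapq.heapify(heap)
    # Each split removes one leaf and adds len(probs) children.
    while len(heap) + len(probs) - 1 <= num_words:
        neg_p, word = heapq.heappop(heap)   # most probable leaf...
        for s, p in probs.items():          # ...is split into its children
            heapq.heappush(heap, (neg_p * p, word + s))
    return sorted(word for _, word in heap)

book = tunstall({"a": 0.7, "b": 0.3}, num_words=4)
print(book)   # ['aaa', 'aab', 'ab', 'b'] -> 4 words, 2-bit codewords each
```

The leaves of the resulting tree form a valid parsing book: every source sequence can be parsed into dictionary words.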

### Lecture 8: Universal source coding

**PDF**

Universal source coding consists in coding without prior knowledge of the source distribution. We shall show that for any coding rate R there exist universal codes (not depending on the source distribution) allowing asymptotically error-free coding of any i.i.d. source with entropy smaller than R. This may be seen as an extension of Shannon’s source coding theorem.

**Outline**:

Universal source coding

Empirical distributions

Proof of universal source coding
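A key counting fact behind the proof (the method of types): the number of empirical distributions of length-n sequences over a k-letter alphabet grows only polynomially in n, while the number of sequences grows exponentially. A quick check (illustrative sketch):

```python
from math import comb

def num_types(n, k):
    """Number of empirical distributions (types) of length-n sequences
    over a k-letter alphabet: compositions of n into k parts."""
    return comb(n + k - 1, k - 1)

# Polynomial in n, versus k**n sequences:
for n in (10, 100, 1000):
    print(n, num_types(n, 2), 2 ** n)
```

A universal code can therefore describe the type of a sequence with negligibly many bits, then index the sequence within its type class.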

### Lecture 9: Lempel-Ziv source code

**PDF**

We shall present the Lempel-Ziv algorithm, introduced in 1977/78, which allows one to encode without errors an arbitrary source sequence (an error-free universal source code).

**Outline**:

Lempel-Ziv algorithm

Coding automaton

Parsing-entropy

Optimality of Lempel-Ziv coding

Coding rates of IL finite automata

Optimality theorems of Lempel-Ziv
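The incremental (LZ78-style) parsing is short to sketch: each phrase is the longest previously seen phrase plus one new symbol, and the output is a sequence of (dictionary index, symbol) pairs. An illustrative sketch, not the course’s own implementation:

```python
def lz78_parse(s):
    """LZ78 incremental parsing of a string into (prefix index, new symbol)
    pairs; the empty phrase has index 0."""
    dictionary = {"": 0}
    phrases, current = [], ""
    for ch in s:
        if current + ch in dictionary:
            current += ch              # extend the current phrase
        else:
            phrases.append((dictionary[current], ch))
            dictionary[current + ch] = len(dictionary)
            current = ""
    if current:
        # Sequence ended mid-phrase: emit the incomplete final phrase.
        phrases.append((dictionary[current], ""))
    return phrases

print(lz78_parse("aababbbab"))
# [(0, 'a'), (1, 'b'), (2, 'b'), (0, 'b'), (2, '')]
```

The number of distinct phrases governs the coding rate, which is what the optimality theorems above compare against the coding rates of information-lossless finite automata.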

## III. Channel coding

### Lecture 10: Channel information capacity

**PDF**

We shall introduce a basic model of noisy communication channel and define its information capacity as the maximum of the mutual information — a measure of dependence derived from the notion of entropy — between the input and the output of the channel. Two illustrative examples are studied: binary symmetric channel and binary erasure channel.

**Outline**:

Channel basic properties

Information capacity

Examples of channel information capacity
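The two examples studied in the lecture have closed-form capacities: C = 1 − h₂(p) for the binary symmetric channel with crossover probability p, and C = 1 − e for the binary erasure channel with erasure probability e. A quick numerical sketch (illustrative only):

```python
from math import log2

def h2(p):
    """Binary entropy function (bits)."""
    return 0.0 if p in (0, 1) else -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """Binary symmetric channel, crossover probability p: C = 1 - h2(p)."""
    return 1 - h2(p)

def bec_capacity(e):
    """Binary erasure channel, erasure probability e: C = 1 - e."""
    return 1 - e

print(bsc_capacity(0.11))   # ~0.5: about half a bit per channel use
print(bec_capacity(0.5))    # 0.5: half the symbols get through
print(bsc_capacity(0.5))    # 0.0: output independent of input
```

In both cases the maximizing input distribution is uniform, which is how the maximum of the mutual information is attained.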

### Lecture 11: Channel coding theorem

**PDF**

We introduce a general channel coding scheme, with its transmission rate and error probability. We prove a fundamental result that all transmission rates smaller than the channel information capacity (as introduced in the previous lecture) can be achieved by some coding schemes offering arbitrarily small error probabilities. This is the famous noisy-channel coding theorem formulated by Shannon in 1948. The proof makes use of a random coding argument and a decoding based on jointly typical sequences.

**Outline**:

Channel coding theorem statement

Typical sequences for two random variables

Inequalities for mutual information

Proof of achievability

Random coders

Decoding

Channel outputs

Analysis of the error probability

Proof of converse

**Contents**: Shannon’s channel coding theorem, jointly typical sequences, Fano’s inequality, channel capacity, achievable coding rates, supremum of achievable rates, proofs of achievability and converse.

## Exercises

**Exercises 1** for Lecture 2 « Entropy »

**Exercises 2** for Lecture 5 « Error-free source coding » (Uniquely decipherable codes)

**Exercises 3** for Lecture 6 « Optimal source codes »

### Solutions

## Teaching

Teaching at the Computer Science department of École Normale Supérieure (Ulm, Paris) with Bartłomiej Błaszczyszyn:

**Course title**: « Information, coding theories: Source, channel coding, Shannon theorems » (2016-2024).

**Course location**: ENS, 45, rue d’Ulm, 75230 Paris Cedex 05. E. Noether room in the 2nd basement of the “AILE RATAUD” building; cf. ENS map.

### Schedule (2024)

- 2024/02/06:
- 2024/02/27:
- Lecture 5 « Uniquely decipherable codes ».
- Exercises 1, Solutions 1.

- 2024/03/05:
- Lecture 6 « Optimal codes ».
- Exercises 2.

## Book

B. Błaszczyszyn, M.K. Karray (2024). Information theory. Book in preparation.