# A First Course to Network Coding/Printable version

A First Course to Network Coding

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/A_First_Course_to_Network_Coding

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 License.

# How to Use this Book

As readers may already noticed, there are one or two sentences under each chapter's title. It briefly summarize the content of said chapter, so readers have a general idea of what they are getting into when click in.

The background section, as the name indicates, is a supplementary section for look-ups. There's no need to finish the section before entering the main body of this textbook. Rather, when you get stuck on a new term or find a concept blurry in memory, then open this chapter and look up for a definition, example or explanation. An alternative way is to seek explanation from Wikipedia, the free encyclopedia. The articles will be longer, sure, but more elaborate and rigorous due to mass review and edition.

# Probability: Concepts and Useful Equations

## Random Variable

In probability and statistics, a random variable, aleatory variable or stochastic variable is a variable whose value is subject to variations due to chance (i.e. randomness, in a mathematical sense).

## Mutual Independence

In probability theory, to say that two events are independent means that the occurrence of one does not affect the probability of the other. Similarly, two random variables are independent if the realization of one does not affect the probability distribution of the other.

## I.I.D

In w:probability theory and w:statistics, a w:sequence or other collection of random variables is independent and identically distributed (i.i.d.) if each random variable has the same w:probability distribution as the others and all are mutually independent.

# Communication Model and Network Concepts

## Communication Model and Network

### Abstract

In this chapter, we lay the basics in communication model and network, so that readers have a big picture in mind when we go on to study the components of communication. First we introduce the Shannon-Weaver model of communication. Secondly, we introduce a network, including relevant concepts and definitions, performance measure, etc.

### Abbreviations and Acronyms

 SW model Shannon-Weaver model of communication Codec Coding and decoding

Communication is a very broad subject. It involves the information to be shared, the means to do so (such as talking, composing a melody, texting), the mechanism of error correction, and so on so forth. In this book, we refer to communication in an engineering sense: communication is the process of transmitting information. To study communication, it is essential to simplify the process of communication to an easily understandable model, and quantify the components. In , Shannon and Weaver developed a very well-known model of communication that has since taken after their names. This model contains six major components:

1. Information source, which produce a message;
2. Transmitter, which converts the message into forms that are suitable for transmission: an e-mail, for instance, translate a message into a digital stream of ‘1’s and ‘0’s that can be sent via internet;
3. Channel, through which the message is conveyed. It could be a physical connection, like a landline, or a logical one, like one you establish when you call someone and they pick up;
4. Receiver, which reconstruct the original message from the transmitted data;
5. Destination, where the message arrives;
6. Noise source, a dysfunctional, yet important element of communication: it interferes with the data during transmission and makes it difficult to reconstruct the original message.

# Information Measure and Handy Inequations

## Abstract

This chapter aims to introduce the concepts and some basic properties of Shannon’s information measure - entropy, mutual information, and relative entropy. Inequalities involving these concepts will be introduced, too, to help reinforce readers’ understanding and enlarge their toolbox for future learning.

## Introduction

To understand network coding, one need background knowledge in information theory. Among the essential concepts of information theory, a fundamental one is Shannon’s information measure - entropy, mutual information and relative entropy. Instead of considering semantic aspects of information, the entropy measures the uncertainty of the outcome of a discrete random variable. Mutual information, as its name suggests, is how much information one d.r.v. contains about the other, or how much uncertainty one d.r.v. 'kills' for the other. Then we come to relative entropy, which can be seen as a generalization of mutual information. It measures the 'distance' between two d.r.v.s' probability distribution. After the three basic concepts are delivered, we introduce the inequalities that further characterize the properties of the information measures.

To characterize uncertainty, consider a coin flip. If the coin is not fair, say, the head has a 0.8 chance to turn up, while the tail only has a 0.2 chance. Then your uncertainty is low because you are quite certain that the head will turn up. However, if it is a fair coin, you will be more uncertain because the head and tail are equally likely to turn up. So if you were to use a mathematical expression to characterize the uncertainty of an outcome, this expression should be of the form of expectation, and reach its maximum when all choices are equally likely. Moreover, if we use dice instead of coin, or roulette instead of dice, then the uncertainty should grow as the number of choices grow. That is, the expression characterizing it should be increasing with the number of choices, i.e. the size of the alphabet. Lastly, if you roll two dices at once, the uncertainty of '3' and '4' turn up should be the same as that of rolling one dice then the other. So if sequential choices yield the same outcome as a single one, then the expressions of these choices should be able to combine and the combined result should be equal to that of the single choice. The three intuitive properties described above serves as criteria for choosing such expression.

## Definitions

Discrete Random Variable Let $X$ be a discrete random variable with alphabet $??$ and PMF $P_{X}(x)=Pr(X=x),x\in X.$ Hereon, $P_{X}(x)$ is often written as $P(x)$ for convenience, and discrete random variable as d.r.v.

Entropy The entropy of a d.r.v. is defined as:

$H(X)=-\sum P(x)\log _{b}P(x)$ , in other words, the expectation of $log_{b}{\frac {1}{P(x)}}$ .

Letter 'b' in the base specifies the unit of entropy. If b = 2, the unit is bit (binary unit, note the difference between this and the other 'bit' -- binary digit); b = e, it's nat (natural unit). Unless specified otherwise, we use bit. This form of expectation of negative logarithm is tailor-made by Shannon for the three criteria we described in the last section.

Joint Entropy We defined entropy of a single d.r.v., the definition of two d.r.v.s is immediate:

$H(X)=-\sum _{x,y\in S(X,Y)}P(x,y)\log P(x,y)$ .

Conditional Entropy The entropy of a d.r.v. conditioning on another d.r.v. is defined as:

$H(X|Y=y)=-\sum _{x\in X}P(x|y)\log P(x|y)$ .

Now we can express the third criterion with conditional entropy:

$H(XY)=H(X)+H(Y|X)=H(Y)+H(X|Y)$ . Rolling two dice at once or rolling one after the other should have the same uncertainty. And this can be extended to more than two d.r.v.s:

$H(X_{1},X_{2},...,X_{N})=\sum _{n=1}^{N}H(X_{N}|X_{1},X_{2},...,X_{N-1})$ . This is called the chain rule of entropy.

Convex Function

Non-negativity

## Inequalities

Jensen's inequality