# Probability/Random Variables

## Definition

**Definition.**
(Random variable)
Formally, a *random variable* on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is a measurable real function $X$
defined on $\Omega$ (the set of possible outcomes).

**Remark.**

- The property of measurability means that for each real $x$, the set $\{X \le x\} = \{\omega \in \Omega : X(\omega) \le x\} \in \mathcal{F}$, i.e. it is an event in the probability space.
- Measurability will not be emphasized in this book.
- In some definitions, the codomain of the random variable is defined as $[-\infty, \infty]$, namely the extended real number line.
- Usually, a capital letter is used to represent a random variable, and the corresponding small letter is used to represent a value taken by the random variable, e.g. $x$ is a value taken by the random variable $X$.

Since a random variable maps each outcome in $\Omega$ to a number, it can *quantify* the outcomes in $\Omega$, which can be useful.
Another function that is closely related to random variables is the *indicator function*. It is useful in many situations.

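**Definition.**
(Indicator function) For each statement (which is usually an event) $A$,
the indicator function is

$$\mathbf{1}\{A\} = \begin{cases} 1, & \text{if } A \text{ is true}; \\ 0, & \text{otherwise}. \end{cases}$$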

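**Example.**
Let $X$ be the number of heads facing up from tossing an unfair coin one time.
Then, $X$ is a *random variable*, since it maps each outcome in $\Omega = \{H, T\}$ to a number: $X(H) = 1$ and $X(T) = 0$.

If the coin is so unfair that heads always comes up, $X$ is *still* a *random variable*, since $\Omega$ is $\{H\}$, and $X(H) = 1$, even if $\Omega$ only contains one element.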

**Remark.**
Actually, $X = \mathbf{1}\{\text{head comes up}\}$.

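**Proposition.**
(Properties of indicator function)
For each event $E$,

- $\mathbf{1}\{E^c\} = 1 - \mathbf{1}\{E\}$ (Complementary event)

For each event $E_1, \dotsc, E_n$,

- $\mathbf{1}\{E_1 \cap \dotsb \cap E_n\} = \mathbf{1}\{E_1\} \dotsm \mathbf{1}\{E_n\}$ (Intersection of events)
- $\mathbf{1}\{E_1 \cup \dotsb \cup E_n\} = \mathbf{1}\{E_1\} + \dotsb + \mathbf{1}\{E_n\}$ (Union of events) for mutually exclusive events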

**Proof.**
Outline:

Complementary event: $E^c$ is true $\Leftrightarrow$ $E$ is false, and $E$ is true $\Leftrightarrow$ $E^c$ is false.

Intersection of events: when one of the events is false, $E_1 \cap \dotsb \cap E_n$ is false, and the product on the right-hand side becomes zero as well. When all of the events are true, the intersection is true and the product equals one.

Union of events: since the events are mutually exclusive, at most one of the events is true, so the sum on the right-hand side cannot be larger than 1. If one of the events is true, then the union of events is also true, and the sum on the right-hand side becomes one as well. If none of the events is true, both sides are zero.
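The three properties can also be checked by brute force. The following is a minimal sketch, assuming events are represented simply by their truth values; the helper names `indicator` and `check_indicator_properties` are hypothetical and chosen only for this illustration.

```python
from itertools import product


def indicator(statement: bool) -> int:
    """Return 1 if the statement is true and 0 otherwise."""
    return 1 if statement else 0


def check_indicator_properties() -> bool:
    """Check the three properties over every truth assignment of three events."""
    for a, b, c in product([False, True], repeat=3):
        # Complementary event: 1{not A} = 1 - 1{A}
        if indicator(not a) != 1 - indicator(a):
            return False
        # Intersection of events: 1{A and B and C} = 1{A} * 1{B} * 1{C}
        if indicator(a and b and c) != indicator(a) * indicator(b) * indicator(c):
            return False
        # Union of events: only claimed when at most one event is true
        # (this is what "mutually exclusive" means for a single outcome).
        if a + b + c <= 1:
            if indicator(a or b or c) != indicator(a) + indicator(b) + indicator(c):
                return False
    return True


print(check_indicator_properties())
```

The union identity is tested only on assignments with at most one true event, which mirrors the mutual exclusivity assumption in the proposition.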

## Cumulative distribution function

**Definition.**

(Cumulative distribution function)
The *cumulative distribution function* (cdf) of a random variable $X$ is

$$F_X(x) = \mathbb{P}(X \le x) \quad \text{for each real number } x.$$

**Remark.**

- The cdf *completely* determines the random behaviour of a random variable.

**Example.**
Suppose we toss a coin two times, then the sample space is $\Omega = \{HH, HT, TH, TT\}$, in which $HT$ means head and tail come up in the first and second toss respectively; the other notations are defined similarly.
If we define $X$ to be the number of heads, and

**Proof.**
For cdf of ,
first, , for each and .

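If $x < 0$, $F_X(x) = 0$, since $\{X \le x\} = \varnothing$.

If $0 \le x < 1$, $F_X(x) = \mathbb{P}(\{TT\})$, since $\{X \le x\} = \{TT\}$.

If $1 \le x < 2$, $F_X(x) = \mathbb{P}(\{HT, TH, TT\})$, since $\{X \le x\} = \{HT, TH, TT\}$.

If $x \ge 2$, $F_X(x) = 1$, since $\{X \le x\} = \Omega$.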

Similarly, we can get the desired cdf of , by considering for in different ranges.

**Remark.**
Graphically, the cdfs of both random variables are *step functions*.
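To see the step shape concretely, the cdf of the number of heads in two tosses can be computed by summing outcome probabilities. This is only a sketch: it assumes the coin is fair (each outcome has probability 1/4), which the example does not state, and the names `OMEGA`, `PROB`, `num_heads` and `cdf` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space of two tosses: HH, HT, TH, TT.
OMEGA = ["".join(t) for t in product("HT", repeat=2)]
# Assumption: a fair coin, so every outcome is equally likely.
PROB = {omega: Fraction(1, 4) for omega in OMEGA}


def num_heads(omega: str) -> int:
    """The random variable: number of heads in the outcome."""
    return omega.count("H")


def cdf(rv, x) -> Fraction:
    """F(x) = P(rv <= x), summing over outcomes in the event {rv <= x}."""
    return sum((PROB[w] for w in OMEGA if rv(w) <= x), Fraction(0))


for x in (-1, 0, 0.5, 1, 1.5, 2, 3):
    print(x, cdf(num_heads, x))
```

The value stays constant between consecutive integers and jumps at 0, 1 and 2, which is the step-function behaviour described in the remark.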

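In the following, we will discuss three *defining* properties of a cdf.

**Theorem.**
(Defining properties of cdf)
A function $F$ is the cdf of a random variable *if and only if*

(i) $0 \le F(x) \le 1$ for each real number $x$.

(ii) $F$ is nondecreasing.

(iii) $F$ is right-continuous.

(For the "if" direction, $F$ must also satisfy $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$; otherwise a constant function such as $F(x) = 1/2$ would satisfy (i) to (iii) without being a cdf.)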

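**Proof.**
Only if part ($F$ is a cdf $\Rightarrow$ these three properties):

(i) It follows from the axioms of probability, since $F(x) = \mathbb{P}(X \le x)$ is defined to be a probability.

(ii) If $x \le y$, then $\{X \le x\} \subseteq \{X \le y\}$, so $F(x) = \mathbb{P}(X \le x) \le \mathbb{P}(X \le y) = F(y)$ by monotonicity of probability.

(iii) Fix an arbitrary positive sequence $(\epsilon_n)$ decreasing to $0$. Define $E_n = \{X \le x + \epsilon_n\}$ for each positive integer $n$. It follows that $E_1 \supseteq E_2 \supseteq \dotsb$ and $\bigcap_{n=1}^{\infty} E_n = \{X \le x\}$. Then, by continuity of probability,

$$\lim_{n \to \infty} F(x + \epsilon_n) = \lim_{n \to \infty} \mathbb{P}(E_n) = \mathbb{P}\left(\bigcap_{n=1}^{\infty} E_n\right) = \mathbb{P}(X \le x) = F(x).$$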

The if part is more complicated. The following is optional. Outline:

- Draw an arbitrary curve satisfying the three properties.
- Throw a fair coin infinitely many times.
- Encode each result into a binary number, e.g. $HTHT\ldots \mapsto 0.1010\ldots$ (writing 1 for head and 0 for tail).
- Transform each binary number to a decimal number, e.g. $0.1010\ldots_2 = 2/3$. Then, the decimal number is a random variable $U$ taking values in $[0, 1]$.
- Use this decimal number as the input of the inverse function of the arbitrarily drawn curve, and we get a value, which is also a random variable, say $X$.
- Then, the drawn curve is the cdf of the random variable $X$ obtained by throwing a fair coin infinitely many times.
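The outline above can be imitated numerically. The sketch below is not the formal construction: it truncates the infinite sequence to 32 flips, uses a hypothetical step cdf with jumps at 0, 1 and 2 as the "drawn curve", and replaces the inverse function by the generalized inverse (the smallest point whose cdf value is at least the input), since a step cdf has no ordinary inverse. All names are assumptions made for this illustration.

```python
import bisect
import random
from collections import Counter

# Hypothetical target: a step cdf with jumps at 0, 1, 2 reaching 0.25, 0.75, 1.0.
POINTS = [0, 1, 2]
CDF_VALUES = [0.25, 0.75, 1.0]


def uniform_from_coin_flips(rng: random.Random, n_flips: int = 32) -> float:
    """Read n_flips fair coin flips as the binary digits 0.b1b2b3... of a number in [0, 1)."""
    u = 0.0
    for k in range(1, n_flips + 1):
        bit = rng.getrandbits(1)  # 1 stands for head, 0 for tail
        u += bit * 2.0 ** (-k)
    return u


def generalized_inverse(u: float) -> int:
    """Smallest point x in POINTS with F(x) >= u."""
    return POINTS[bisect.bisect_left(CDF_VALUES, u)]


def empirical_frequencies(n_samples: int, seed: int = 0) -> dict:
    """Relative frequency of each point among n_samples simulated values."""
    rng = random.Random(seed)
    counts = Counter(
        generalized_inverse(uniform_from_coin_flips(rng)) for _ in range(n_samples)
    )
    return {x: counts[x] / n_samples for x in POINTS}


freqs = empirical_frequencies(20000)
print(freqs)
```

If the construction works as described, the relative frequencies should be close to the jump sizes 0.25, 0.5 and 0.25 of the chosen cdf.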

Sometimes, we are only interested in the values $x$ at which the probability mass or density of $X$ is positive, which are more 'important'.
Roughly speaking, these values are the elements of the *support* of $X$, which is defined in the following.

**Definition.**
(Support of random variable)
The *support* of a random variable $X$, $\operatorname{supp}(X)$, is the smallest closed set $S$ such that $\mathbb{P}(X \in S) = 1$.

**Remark.**

- E.g. a closed interval $[a, b]$ is a closed set.
- Closedness will not be emphasized in this book.
- Practically, $\operatorname{supp}(X) = \{x \in \mathbb{R} : f_X(x) > 0\}$ (which is the smallest closed set), where
  - $f_X$ is the *probability mass function* for discrete random variables;
  - $f_X$ is the *probability density function* for continuous random variables.
- The terms mentioned above will be defined later.

**Example.**
If

**Remark.**
etc. also satisfy the requirement, but they are not the smallest set.

## Discrete random variables

**Definition.**
(Discrete random variables)
If $\operatorname{supp}(X)$ is *countable* (i.e. 'enumerable' or 'listable'), then the random variable $X$ is a *discrete*
random variable.

**Example.**
Let $X$ be the number of successes among $n$ Bernoulli trials.
Then, $X$ is a discrete random variable, since $\operatorname{supp}(X) = \{0, 1, \dotsc, n\}$, which is countable.

On the other hand, if we let $Y$ be the temperature on the Celsius scale, $Y$ is not discrete, since $\operatorname{supp}(Y) = [-273.15, \infty)$, which is not countable.

Often, for a discrete random variable, we are interested in the probability that the random variable takes a specific value. So, we use a function that gives the corresponding probability for each specific value taken, namely the *probability mass function*.

**Definition.**

(Probability mass function) Let $X$ be a discrete random variable. The probability mass function (pmf) of $X$ is

$$f_X(x) = \mathbb{P}(X = x) \quad \text{for each real number } x.$$

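**Remark.**

- Alternative names include mass function and probability function.
- If the random variable $X$ is discrete, then $\operatorname{supp}(X) = \{x \in \mathbb{R} : f_X(x) > 0\}$ (it is closed).
- The cdf of the random variable $X$ is $F_X(x) = \sum_{y \in \operatorname{supp}(X),\, y \le x} f_X(y)$. It follows that the sum of the values of the pmf over all points $x$ inside the support equals one.
- The cdf of a discrete random variable $X$ is a step function with jumps at the points in $\operatorname{supp}(X)$, and the size of each jump defines the pmf of $X$ at the corresponding point in $\operatorname{supp}(X)$.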

**Example.**
Suppose we throw a fair six-faced die one time. Let $X$ be the number facing up.
Then, the pmf of $X$ is

$$f_X(x) = \frac{1}{6}\,\mathbf{1}\{x \in \{1, 2, 3, 4, 5, 6\}\}.$$
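The relation between pmf and cdf for this die can be checked with exact fractions. This is a small sketch; the function names `pmf` and `cdf` and the constant `SUPPORT` are hypothetical.

```python
from fractions import Fraction

SUPPORT = range(1, 7)  # the faces 1, ..., 6


def pmf(x) -> Fraction:
    """Value 1/6 on the support {1, ..., 6} and 0 elsewhere."""
    return Fraction(1, 6) if x in SUPPORT else Fraction(0)


def cdf(x) -> Fraction:
    """Sum of the pmf over the support points not exceeding x."""
    return sum((pmf(k) for k in SUPPORT if k <= x), Fraction(0))


print(sum(pmf(k) for k in SUPPORT))  # total probability over the support
print(cdf(3), cdf(3) - cdf(2.5))     # cdf at 3 and the jump size at 3
```

The jump of the cdf at a support point, here `cdf(3) - cdf(2.5)`, equals the pmf at that point, in line with the remark on step functions.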

## Continuous random variables

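Suppose $X$ is a discrete random variable. Partitioning $\mathbb{R}$ into small disjoint intervals $[x_i, x_i + \Delta x_i)$ gives, for a set $A$ that is a union of such intervals,

$$\mathbb{P}(X \in A) = \sum_{i : [x_i, x_i + \Delta x_i) \subseteq A} \mathbb{P}(x_i \le X < x_i + \Delta x_i) = \sum_{i : [x_i, x_i + \Delta x_i) \subseteq A} \frac{\mathbb{P}(x_i \le X < x_i + \Delta x_i)}{\Delta x_i}\,\Delta x_i.$$

Taking limit, the sum of (probability per unit length) $\times$ (length) becomes an integral of a density $f$:

$$\mathbb{P}(X \in A) = \int_A f(x)\,dx.$$

These motivate us to have the following definition.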

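**Definition.**
(Continuous random variable)
A random variable $X$ is *continuous* if

$$\mathbb{P}(X \in A) = \int_A f_X(x)\,dx$$

for each (measurable) set $A \subseteq \mathbb{R}$ and for some *nonnegative* function $f_X$.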

**Remark.**

- The function $f_X$ is called the *probability density function* (pdf), density function, or probability function (rarely).
- If $X$ is continuous, then the probability of each *single value* is zero, i.e. $\mathbb{P}(X = x) = 0$ for each real number $x$.
  - This can be seen by setting $A = \{x\}$, then $\mathbb{P}(X = x) = \int_x^x f_X(u)\,du = 0$ (the dummy variable is changed).
- By setting $A = (-\infty, x]$, the cdf is $F_X(x) = \mathbb{P}(X \le x) = \int_{-\infty}^{x} f_X(u)\,du$.
- Measurability will not be emphasized. The sets encountered in this book are all measurable.
- $\int_A f_X(x)\,dx$ is the area under the pdf over $A$, which represents the probability $\mathbb{P}(X \in A)$ (obtained by integrating the density function over the set $A$).

The name *continuous* r.v. comes from the result that the cdf of this kind of r.v. is continuous.

**Proposition.**
(Continuity of cdf of continuous random variable)
If a random variable $X$ is continuous, its cdf $F_X$ is also continuous (not just right-continuous).

**Proof.**
Since $F_X(x) = \int_{-\infty}^{x} f_X(u)\,du$
(an integral is continuous as a function of its upper limit),
the cdf is continuous.

**Example.**
(Exponential distribution)
The function $F(x) = (1 - e^{-x})\,\mathbf{1}\{x \ge 0\}$ is a cdf of a continuous random variable since

- It is nonnegative.
- $0 < e^{-x} \le 1$ when $x \ge 0$. So, $0 \le F(x) \le 1$.
- It is nondecreasing.
- It is right-continuous (and also continuous).

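**Proposition.**
(Finding pdf using cdf)
If the cdf $F_X$ of a continuous random variable $X$ is differentiable,
then the pdf is $f_X(x) = F_X'(x)$.

**Proof.**
This follows from the fundamental theorem of calculus:

$$F_X'(x) = \frac{d}{dx} \int_{-\infty}^{x} f_X(u)\,du = f_X(x).$$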
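As a numerical illustration of this proposition, one can differentiate a cdf by finite differences and compare with the pdf. The sketch below assumes the exponential cdf with rate 1, $F(x) = 1 - e^{-x}$ for $x \ge 0$, whose pdf is $e^{-x}$ for $x \ge 0$; the function names are hypothetical.

```python
import math


def exp_cdf(x: float) -> float:
    """Cdf of the exponential distribution with rate 1 (an assumed example)."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0


def exp_pdf(x: float) -> float:
    """Pdf of the exponential distribution with rate 1."""
    return math.exp(-x) if x >= 0 else 0.0


def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Approximate f'(x) by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


for x in (0.5, 1.0, 2.0):
    print(x, central_difference(exp_cdf, x), exp_pdf(x))
```

The two printed columns should agree to many decimal places at points where the cdf is differentiable (the point $x = 0$ is excluded because this cdf has a kink there).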

**Remark.**
Since $F_X$ is nondecreasing, $F_X'(x) \ge 0$.
This shows that $f_X$ is always nonnegative if $F_X$ is differentiable. It is a motivation for us to define the pdf to be nonnegative.

Without further assumption, the pdf is *not* unique, i.e. a random variable may have multiple pdfs, since, e.g., we may set the value of the pdf to be an arbitrary nonnegative real number
at a single point outside its support (without affecting the probabilities, since the integral of the pdf over a single point is zero regardless of the value), and this makes another valid pdf for the random variable.
To tackle this, we conventionally set $f_X(x) = 0$ for each $x \notin \operatorname{supp}(X)$ to make the pdf unique and to make calculations more convenient.

**Example.**
(Uniform distribution)
Given that

## Mixed random variables

After reading the previous two sections, you may think that a random variable must be either discrete or continuous.
Actually, this is wrong. A random variable can be neither discrete nor continuous.
An example of such a random variable is a *mixed* random variable, which is discussed in this section.

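**Theorem.**
(cdf decomposition)
The cdf $F_X$ of each random variable $X$ can be decomposed as a sum of three components:

$$F_X(x) = \alpha F_d(x) + \beta F_c(x) + \gamma F_s(x)$$

for some nonnegative constants $\alpha, \beta, \gamma$ such that $\alpha + \beta + \gamma = 1$, in which $x$ is a real number and $F_d$, $F_c$, $F_s$ are cdfs of a discrete, a continuous, and a singular random variable respectively.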

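**Remark.**

- If $\alpha > 0$ and $\beta > 0$, then $X$ is a mixed random variable.
- We will not discuss singular random variables in this book, since they are quite advanced.
- One interpretation of this formula is: $X$ is discrete with probability $\alpha$, continuous with probability $\beta$, and singular with probability $\gamma$.
- If $X$ is a discrete (continuous) random variable, then $\alpha = 1$ ($\beta = 1$).
- We may also decompose the pdf similarly, but we have different ways to find the pdf of a discrete and of a continuous random variable from the corresponding cdf.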
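The "probability to be discrete / continuous" interpretation can be illustrated by simulation. The sketch below uses entirely hypothetical choices: a discrete part that is a point mass at 0 with weight 0.3, a continuous part that is exponential with rate 1 with weight 0.7, and no singular part.

```python
import math
import random

# Hypothetical weights: ALPHA for the discrete part, BETA for the continuous part.
ALPHA, BETA = 0.3, 0.7


def mixed_cdf(x: float) -> float:
    """ALPHA * (cdf of a point mass at 0) + BETA * (cdf of Exp(1))."""
    if x < 0:
        return 0.0
    return ALPHA * 1.0 + BETA * (1.0 - math.exp(-x))


def sample_mixed(rng: random.Random) -> float:
    """Draw from the discrete part with probability ALPHA, else from the continuous part."""
    if rng.random() < ALPHA:
        return 0.0
    return rng.expovariate(1.0)


rng = random.Random(0)
draws = [sample_mixed(rng) for _ in range(20000)]
frac_at_zero = sum(d == 0.0 for d in draws) / len(draws)
frac_le_one = sum(d <= 1.0 for d in draws) / len(draws)
print(frac_at_zero, frac_le_one, mixed_cdf(1.0))
```

The fraction of draws exactly equal to 0 should be close to the discrete weight, and the fraction of draws not exceeding 1 should be close to `mixed_cdf(1.0)`. The cdf jumps at 0 (so it is not continuous) and increases smoothly afterwards (so it is not a step function).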

An example of a singular random variable is one whose cdf is the Cantor distribution function (sometimes known as the Devil's Staircase). Its graph is self-similar: the pattern keeps repeating when you enlarge the graph.

**Example.**
Let .
Let .
Then, is a cdf of a mixed random variable , with probability to be discrete and probability to be continuous, since
it is nonnegative, nondecreasing, right-continuous and .

**Exercise.**
Consider the function . It is given that is a cdf of a random variable .

(a) Show that .

(b) Show that the pdf of is

(c) Show that the probability for to be continuous is .

(d) Show that is .

(e) Show that the events and are independent if .

**Proof.**

(a) Since is a cdf, and when ,

(b) Since is a mixed random variable, for the discrete random variable part, the pdf is

(c) We can see that can be decomposed as follows:

(d)

(e) If , . Thus,