# Probability/Introduction

## Overview

Probability theory provides a mathematical model for the study of randomness and uncertainty. Many important decisions, whether in business, government, science, recreation or even one's personal life, must be made with incomplete information or some degree of uncertainty. Hence, a formalized study of uncertain or random outcomes occupies an important role in modern society. In situations where any one of a number of possible outcomes may occur, the mathematical model of probability theory offers methods for quantifying the likelihoods associated with those outcomes. Probability also provides tools that allow us to move beyond simply describing the information contained within a set of data (descriptive statistics) to inferring further information from that data (inferential statistics). Many of the early attempts to model likelihood arose from games of chance. For a brief history of probability, see the Wikipedia article on the history of probability.

Although probability theory is now a very formal branch of mathematics, the language of probability is often used informally in everyday speech. We express our beliefs about likelihoods of outcomes in situations involving uncertainty using intuition guided by our experiences and in some cases statistics. Consider the following examples:

• Bill says "Don't buy the avocados here; about half the time, they're rotten". Bill is expressing his belief about the probability of an event — that an avocado will be rotten — based on his personal experience.
• Lisa says "I am 95% certain the capital of Spain is Barcelona". Here, the belief Lisa is expressing is a probability only from her point of view, because she does not know that the capital of Spain is Madrid (from our point of view, the probability that the capital is Madrid is 100%). However, we can still view this as a subjective probability because it expresses a measure of uncertainty. It is as though Lisa is saying "in 95% of cases where I feel as sure as I do about this, I turn out to be right".
• Susan says "There is a lower chance of being shot in Omaha than in Detroit". Susan is expressing a belief based (presumably) on statistics.
• Dr. Smith says to Christina, "There is a 75% chance that you will live." Dr. Smith is basing this on his research.
• Nicolas says "It will probably rain tomorrow." In this case the likelihood that it will rain is expressed in vague terms and is subjective, but implies that the speaker believes it is greater than ${\displaystyle {\frac {1}{2}}}$ (or 50%).

Subjective probabilities have been extensively studied, especially with regard to gambling and securities markets. While this type of probability is important, it is not the subject of this book. A good reference is "Degrees of Belief" by Steven Vick (2002).

Notice that in the previous examples the likelihood of any particular outcome is expressed as a percentage (between 0% and 100%), as is common in everyday language. However, probabilities in formal probability theory are always expressed as real numbers in the interval ${\displaystyle [0,1]}$ (e.g. a probability of 0.25 may be expressed as 25%, or a probability of ${\displaystyle {\frac {1}{\pi }}}$ may be expressed as approximately 31.83%). Other differences exist between common expressions of probabilities and formal probability theory. For example, in everyday speech a probability of 0% is typically taken to mean that the event to which that probability is assigned is impossible. However, in probability theory (usually in cases where there are infinitely many possible outcomes) an event ascribed a probability of zero may actually occur. In some situations, it is certain that such an event will occur (e.g. in selecting a real number between 0 and 1, the probability of selecting any given number is zero, but it is certain that one such number will be selected).

Another way to express the probability of an outcome is by its odds: the ratio of the probability of "success" (the event occurs) to the probability of "failure" (the event does not occur). In gambling, odds are expressed as the ratio of the stakes risked by each participant in a wager. For instance, a bookmaker offering odds of 3 to 1 "against" a horse will pay a punter three times their stake if the horse wins. In effect, the bookmaker (ignoring factors such as his potential need to "lay off" bets that expose him to the possibility of an unacceptable overall loss) is announcing that he thinks the horse has a ${\displaystyle {\frac {1}{4}}}$ chance of winning. If we express odds as "chance of winning" : "chance of not winning", then 3 to 1 against would be represented as ${\displaystyle 1:3={\frac {1}{4}}:{\frac {3}{4}}}$ or ${\displaystyle {\frac {1}{3}}}$ . So an event with a probability of ${\displaystyle {\frac {1}{4}}}$ (25%) has odds of ${\displaystyle {\frac {1}{3}}}$ (about 33%). The difference between odds and probability is even clearer for an event with a probability of 50%: the odds of a coin showing heads are 50%:50% = 1:1, or ${\displaystyle 1}$.
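The conversion between a probability and its odds can be sketched in a few lines of Python. This is only an illustration: the function names `probability_to_odds`, `odds_to_probability` and `against_to_probability` are invented here, and exact fractions are used so that no rounding obscures the arithmetic.

```python
from fractions import Fraction


def probability_to_odds(p):
    """Return the odds in favor of an event, p / (1 - p), as an exact fraction."""
    p = Fraction(p)
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


def odds_to_probability(odds):
    """Return the probability corresponding to odds in favor: odds / (1 + odds)."""
    odds = Fraction(odds)
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)


def against_to_probability(a, b):
    """Convert bookmaker odds of 'a to b against' into a probability b / (a + b)."""
    return Fraction(b, a + b)


# The horse quoted at 3 to 1 against has an implied probability of 1/4.
horse = against_to_probability(3, 1)
print(horse, probability_to_odds(horse))

# A fair coin: probability 1/2 corresponds to even odds.
print(probability_to_odds(Fraction(1, 2)))
```

Note that odds in favor are unbounded above (they grow without limit as the probability approaches 1), whereas the probability itself never exceeds 1.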

## Types of probability

As mentioned earlier, probability can be expressed informally in a variety of different ways, but even formal definitions and approaches vary. The most general and rigorous approach is known as axiomatic probability theory, which will be the focus of later chapters. Here we briefly discuss a few other approaches, their uses and limitations. All of these approaches rely in one way or another on the concept of an experiment. Recall that probability provides means to study randomness and uncertainty.

An experiment is any action or process whose outcome is subject to uncertainty or randomness.

Here the term experiment is used in a wider sense than its usual connotation with controlled laboratory situations. Further clarification on experiments will be given later, but for now the following examples of experiments will suffice:

• observing whether or not a commercial product is defective.
• tossing a coin one or more times or selecting a card from a card deck.
• conducting a survey.
• measuring the wind speed or rainfall in a particular area.

If an experiment can be repeated under identical conditions, each repetition of the experiment is called a trial.

### Basic Concepts

There are two standard approaches to conceptually interpreting probabilities: the relative frequency approach and the subjective belief (or confidence) approach. In the Frequency Theory of Probability, probability is the limit of the relative frequency with which certain outcomes occur in repeated trials (note that the outcome of any single trial cannot depend on the outcome of other trials). The relative frequency approach requires that experiments be random and that all possible outcomes be known before the experiment is carried out. The probability of any set of outcomes is expressed as the relative frequency with which those outcomes will occur among many repeated trials.

Physical probabilities fall within the category of objective or frequency probabilities, and are associated with random physical systems such as roulette wheels, rolling dice and radioactive atoms. In such systems, a given outcome (such as a die yielding a six) tends to occur at a persistent rate, or 'relative frequency', in a long run of trials. Physical probabilities either explain, or are invoked to explain, these stable frequencies.

Relative frequency probabilities are always expressed as a figure between 0% (the outcome essentially never happens) and 100% (the outcome essentially always happens), or equivalently as a figure between 0 and 1. According to the Frequency Theory of Probability, saying that "the probability that A occurs is p%" means that if you repeat the experiment many times under essentially identical conditions, the percentage of trials in which A occurs will converge to p. For example, a 50% chance that a coin lands "heads up" means that if you toss the coin over and over again, then the ratio of the number of times the coin lands heads to the total number of tosses approaches a limiting value of 50% as the number of tosses grows. Notice that the outcome of one toss never depends on another toss, and that the ratio of heads to total number of tosses is always between 0% and 100%.
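The coin-tossing description above can be explored with a small simulation. The following is a minimal sketch, assuming Python's pseudo-random generator is an acceptable stand-in for a fair coin; the function name `relative_frequency_of_heads` and the fixed seed are choices made only for this illustration.

```python
import random


def relative_frequency_of_heads(num_tosses, p_heads=0.5, seed=0):
    """Simulate independent coin tosses and return (number of heads) / (number of tosses)."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    heads = sum(1 for _ in range(num_tosses) if rng.random() < p_heads)
    return heads / num_tosses


# Print the relative frequency for increasingly long runs of tosses.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

For short runs the printed ratio can be far from 0.5, while longer runs tend to land closer to it. A simulation like this only illustrates the frequency interpretation; it does not prove that the ratio converges.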

In the Subjective Theory of Probability, probability measures the speaker's "degree of belief" that a set of outcomes will result, on a scale of 0% (complete disbelief that the event will happen) to 100% (certainty that the event will happen). According to the Subjective Theory, saying that "the probability that A occurs is ${\displaystyle {\frac {2}{3}}}$ " means that I believe that A will happen twice as strongly as I believe that A will not happen. The Subjective Theory is particularly useful in assigning meaning to the probability of outcomes that in principle can occur only once. For example, how might one assign meaning to the following statement: "there is a 25% chance of an earthquake on the San Andreas fault with magnitude 8 or larger before 2050"? It would be very hard to quantify this measure in terms of relative frequency.

One way to represent an individual's degree of belief in a statement, given available evidence, is with the Bayesian approach. Evidential probability, also called Bayesian probability, can be assigned to any statement whatsoever, even when no random process is involved. On most accounts evidential probabilities are considered degrees of belief, defined in terms of dispositions to gamble at certain odds. The primary evidential interpretations include the classical interpretation, the subjective interpretation, the epistemic or inductive interpretation, and the logical interpretation.

The next several sections discuss the principal theories within the relative frequency approach to probability.

### Classical theory of probability

The classical approach to probability expresses probability as the ratio of the number of favorable outcomes to the total number of possible outcomes. Note the immediate implication that the total number of possible outcomes must be known. Furthermore, all possible outcomes are assumed to be equally probable, and no two possible outcomes can both result from the same trial. Here, the term "favorable" is not subjective, but rather indicates that an outcome belongs to a group of outcomes of interest. This group of outcomes is called an event, which will be formalized with the introduction of axiomatic probability theory.

Classical definition of probability: If the number of outcomes belonging to an event ${\displaystyle E}$ is ${\displaystyle N_{E}}$ , and the total number of outcomes is ${\displaystyle N}$ , then the probability of event ${\displaystyle E}$ is defined as ${\displaystyle p_{E}={\frac {N_{E}}{N}}}$ .

For example, a standard deck of cards (without jokers) has 52 cards. If we randomly draw a card from the deck, we can think of each card as a possible outcome. Therefore, there are 52 total outcomes. We can now look at various events and calculate their probabilities:

• Out of the 52 cards, there are 13 clubs. Therefore, if the event of interest is drawing a club, there are 13 favorable outcomes, and the probability of this event is ${\displaystyle {\frac {13}{52}}={\frac {1}{4}}}$ .
• There are 4 kings (one of each suit). The probability of drawing a king is ${\displaystyle {\frac {4}{52}}={\frac {1}{13}}}$ .
• What is the probability of drawing a king OR a club? This example is slightly more complicated. We cannot simply add together the number of outcomes for each event separately (${\displaystyle 4+13=17}$) as this inadvertently counts one of the outcomes twice (the king of clubs). The correct answer is ${\displaystyle {\frac {16}{52}}={\frac {4}{13}}}$ , obtained from ${\displaystyle {\frac {13}{52}}+{\frac {4}{52}}-{\frac {1}{52}}}$ , which is ${\displaystyle p({\text{club}})+p({\text{king}})-p({\text{king of clubs}})}$ .
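The three card calculations above can be checked by enumerating the deck and counting favorable outcomes directly. This is a sketch under the classical assumption that every card is equally likely; the names `DECK`, `classical_probability`, `is_club` and `is_king` are introduced only for this example.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
DECK = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs, no jokers


def classical_probability(event, outcomes=DECK):
    """Return N_E / N, where N_E counts the outcomes for which event(outcome) is true."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


def is_club(card):
    return card[1] == "clubs"


def is_king(card):
    return card[0] == "K"


p_club = classical_probability(is_club)
p_king = classical_probability(is_king)
p_king_and_club = classical_probability(lambda c: is_king(c) and is_club(c))
p_king_or_club = classical_probability(lambda c: is_king(c) or is_club(c))

print(p_club, p_king, p_king_or_club)

# Counting the union directly agrees with adding and removing the double count.
assert p_king_or_club == p_club + p_king - p_king_and_club
```

Counting the union directly avoids the double-counting problem altogether, because the king of clubs satisfies the combined condition only once.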

Classical probability suffers from a serious limitation. The definition of probability implicitly defines all outcomes to be equiprobable. While this might be useful for drawing cards, rolling dice, or pulling balls from urns, it offers no method for dealing with outcomes with unequal probabilities.

This limitation can even lead to mistaken statements about probabilities. An often-cited example goes like this:

I could be hit by a meteor tomorrow. There are two possible outcomes: I will be hit, or I will not be hit. Therefore, the probability I will be hit by a meteor tomorrow is ${\displaystyle {\frac {1}{2}}=50\%}$ .

Of course, the problem here is not with the classical theory, merely the attempted application of the theory to a situation to which it is not well adapted.

This limitation does not, however, mean that the classical theory of probability is useless. At many points in the development of the axiomatic approach to probability, classical theory is an important guiding factor.

### Empirical or Statistical Probability or Frequency of occurrence

This approach to probability is well-suited to a wide range of scientific disciplines. It is based on the idea that the underlying probability of an event can be measured by repeated trials.

Empirical or Statistical Probability as a measure of frequency: Let ${\displaystyle n_{A}}$ be the number of times event ${\displaystyle A}$ occurs after ${\displaystyle n}$ trials. We define the probability of event ${\displaystyle A}$ as ${\displaystyle p_{A}=\lim _{n\to \infty }{\frac {n_{A}}{n}}}$ .

It is of course impossible to conduct an infinite number of trials. However, it usually suffices to conduct a large number of trials, where the standard of large depends on the probability being measured and how accurate a measurement we need.

A note on this definition of probability: How do we know the sequence ${\displaystyle {\frac {n_{A}}{n}}}$ in the limit will converge to the same result every time, or that it will converge at all? The unfortunate answer is that we don't. To see this, consider an experiment consisting of flipping a coin an infinite number of times. We are interested in the probability of heads coming up. Imagine the result is the following sequence:

HTHHTTHHHHTTTTHHHHHHHHTTTTTTTTHHHHHHHHHHHHHHHHTTTTTTTTTTTTTTTT...

with each run of ${\displaystyle k}$ heads and ${\displaystyle k}$ tails being followed by another run twice as long. For this example, the sequence ${\displaystyle {\frac {n_{A}}{n}}}$ oscillates between ${\displaystyle {\frac {1}{2}}}$ (at the end of each run of tails, when heads and tails are equal in number) and roughly ${\displaystyle {\frac {2}{3}}}$ (at the end of each run of heads), and doesn't converge.
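The behaviour of this sequence can be inspected numerically. The sketch below generates the doubling runs and records the running relative frequency of heads; the names `doubling_runs` and `running_frequencies`, and the cut-off of the first 1000 tosses, are arbitrary choices made for illustration.

```python
from itertools import islice


def doubling_runs():
    """Yield tosses one at a time in the pattern H, T, HH, TT, HHHH, TTTT, ..."""
    run = 1
    while True:
        for _ in range(run):
            yield "H"
        for _ in range(run):
            yield "T"
        run *= 2


def running_frequencies(num_tosses):
    """Return the list of n_A / n for n = 1, ..., num_tosses, where A is 'heads'."""
    heads = 0
    freqs = []
    for n, toss in enumerate(islice(doubling_runs(), num_tosses), start=1):
        if toss == "H":
            heads += 1
        freqs.append(heads / n)
    return freqs


freqs = running_frequencies(2**16)
tail = freqs[1000:]  # ignore the early tosses, where the ratio swings more widely
print(min(tail), max(tail))
```

Because the smallest and largest values in the tail stay well apart no matter how many tosses are generated, the running frequency keeps swinging instead of settling on a single limit.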

We might expect such sequences to be unlikely, and we would be right. It will be shown later that the probability of such a sequence is 0, as is the probability of a sequence whose relative frequency converges to anything other than the underlying probability of the event. However, such examples make it clear that the limit in the definition above does not express convergence in the more familiar sense, but rather some kind of convergence in probability. The problem of formulating exactly what this means belongs to axiomatic probability theory.

### Axiomatic probability theory

Although axiomatic probability theory is often frightening to beginners, it is the most general approach to probability and has been employed in tackling some of the more difficult problems in probability. It begins with a set of axioms which, although not immediately intuitive, are guided by the more familiar classical probability theory. These axioms are discussed in the (as yet unwritten) following chapter.