This Quantum World/Appendix/Probability
Probability is a numerical measure of likelihood. If an event has a probability equal to 1 (or 100%), then it is certain to occur. If it has a probability equal to 0, then it will definitely not occur. And if it has a probability equal to 1/2 (or 50%), then it is as likely as not to occur.
You will know that tossing a fair coin has probability 1/2 to yield heads, and that casting a fair die has probability 1/6 to yield a 1. How do we know this?
There is a principle known as the principle of indifference, which states: if there are n mutually exclusive and jointly exhaustive possibilities, and if, as far as we know, there are no differences between the n possibilities apart from their names (such as "heads" or "tails"), then each possibility should be assigned a probability equal to 1/n. (Mutually exclusive: only one possibility can be realized in a single trial. Jointly exhaustive: at least one possibility is realized in a single trial. Mutually exclusive and jointly exhaustive: exactly one possibility is realized in a single trial.)
Since this principle appeals to what we know, it concerns epistemic probabilities (a.k.a. subjective probabilities) or degrees of belief. If you are certain of the truth of a proposition, then you assign to it a probability equal to 1. If you are certain that a proposition is false, then you assign to it a probability equal to 0. And if you have no information that makes you believe that the truth of a proposition is more likely (or less likely) than its falsity, then you assign to it probability 1/2. Subjective probabilities are therefore also known as ignorance probabilities: if you are ignorant of any differences between the possibilities, you assign to them equal probabilities.
If we assign probability 1 to a proposition because we believe that it is true, we assign a subjective probability, and if we assign probability 1 to an event because it is certain that it will occur, we assign an objective probability. Until the advent of quantum mechanics, the only objective probabilities known were relative frequencies.
The advantage of the frequentist definition of probability is that it allows us to measure probabilities, at least approximately. The trouble with it is that it refers to ensembles. You can't measure the probability of heads by tossing a single coin. You get better and better approximations to the probability of heads by tossing a larger and larger number $N$ of coins and dividing the number $N_H$ of heads by $N$. The exact probability of heads is the limit
$$p(H) = \lim_{N \to \infty} \frac{N_H}{N}.$$
The meaning of this formula is that for any positive number $\epsilon$, however small, you can find a (sufficiently large but finite) number $N$ such that
$$\left| p(H) - \frac{N_H}{N} \right| < \epsilon.$$
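The frequentist recipe is easy to try out numerically. The following is a minimal sketch, assuming that Python's pseudo-random generator is an acceptable stand-in for a fair coin; the function name `relative_frequency_of_heads` and the fixed seed are illustrative choices, not part of the text above.

```python
import random

def relative_frequency_of_heads(num_tosses, seed=0):
    """Toss a simulated fair coin num_tosses times and return N_H / N."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are reproducible
    # Each toss counts as heads when the uniform draw falls below 1/2.
    num_heads = sum(rng.random() < 0.5 for _ in range(num_tosses))
    return num_heads / num_tosses

# Larger ensembles tend to give values closer to 1/2.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

Any finite run yields only an approximation; the printed values typically settle near 1/2 as the number of tosses grows, but no single run reaches the limit itself.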
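The probability that any one of $n$ events from a mutually exclusive and jointly exhaustive set of possible events happens is the sum of the probabilities of the $n$ events. Suppose, for example, you win if you cast either a 1 or a 6. The probability of winning is
$$p(1 \vee 6) = p(1) + p(6) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}.$$
In frequentist terms, this is virtually self-evident. If $N_1$ and $N_6$ are the numbers of casts that yield a 1 and a 6, respectively, out of $N$ casts, then $N_1/N$ approximates $p(1)$, $N_6/N$ approximates $p(6)$, and $(N_1 + N_6)/N$ approximates $p(1 \vee 6)$.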
The probability that two independent events happen is the product of the probabilities of the individual events. Suppose, for example, you cast two dice and you win if the total is 12. Then
$$p(6 \wedge 6) = p(6) \times p(6) = \frac{1}{6} \times \frac{1}{6} = \frac{1}{36}.$$
By the principle of indifference, there are now $6 \times 6 = 36$ equiprobable possibilities, and casting a total of 12 with two dice is one of them.
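Both rules can be checked by brute-force counting. The sketch below assumes fair dice and applies the principle of indifference directly: a probability is the number of favourable outcomes divided by the number of equiprobable outcomes. The helper `probability` is a hypothetical name introduced for illustration.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)  # the six faces of one fair die

def probability(outcomes, event):
    """Principle of indifference: favourable outcomes / all outcomes."""
    outcomes = list(outcomes)
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))

# Sum rule: casting a 1 and casting a 6 are mutually exclusive.
p_one = probability(FACES, lambda face: face == 1)
p_six = probability(FACES, lambda face: face == 6)
p_win = probability(FACES, lambda face: face in (1, 6))

# Product rule: the two dice are independent.
two_dice = list(product(FACES, repeat=2))  # 36 equiprobable pairs
p_total_12 = probability(two_dice, lambda roll: sum(roll) == 12)

print(p_win, p_one + p_six)       # direct count vs. sum of probabilities
print(p_total_12, p_six * p_six)  # direct count vs. product of probabilities
```

Exact fractions are used instead of floating-point numbers so that the two sides of each rule can be compared for strict equality.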
It is important to remember that the joint probability of two events equals the product of the individual probabilities if and only if the two events are independent, meaning that the probability of one does not depend on whether or not the other happens. In terms of propositions: the probability that the conjunction $a \wedge b$ is true is the probability that $a$ is true times the probability that $b$ is true only if the probability that either proposition is true does not depend on whether the other is true or false. Ignoring this can have the most tragic consequences.
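The general rule for the joint probability of two events is
$$p(a \wedge b) = p(b|a)\, p(a) = p(a|b)\, p(b).$$
$p(b|a)$ is a conditional probability: the probability of $b$ given $a$.
To see this, let $N_{ab}$ be the number of trials in which both $a$ and $b$ happen or are true, and let $N_a$ be the number of trials in which $a$ happens or is true, out of $N$ trials in all. $N_{ab}/N$ approximates $p(a \wedge b)$, $N_{ab}/N_a$ approximates $p(b|a)$, and $N_a/N$ approximates $p(a)$. But
$$\frac{N_{ab}}{N} = \frac{N_{ab}}{N_a} \, \frac{N_a}{N}.$$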
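The counting argument can be made concrete with two events that are not independent. In the sketch below the events are illustrative assumptions: a is "the first of two fair dice shows 6" and b is "the total is 12". The counts `N_a`, `N_b` and `N_ab` are taken over all 36 equiprobable outcomes rather than over actual trials.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # 36 equiprobable outcomes
N = len(rolls)

def a(roll):  # assumed event a: the first die shows 6
    return roll[0] == 6

def b(roll):  # assumed event b: the total is 12
    return sum(roll) == 12

N_a = sum(1 for r in rolls if a(r))
N_b = sum(1 for r in rolls if b(r))
N_ab = sum(1 for r in rolls if a(r) and b(r))

p_a = Fraction(N_a, N)
p_b = Fraction(N_b, N)
p_ab = Fraction(N_ab, N)            # joint probability of a and b
p_b_given_a = Fraction(N_ab, N_a)   # conditional probability of b given a

print(p_ab, p_b_given_a * p_a)  # the general rule holds
print(p_ab, p_a * p_b)          # the naive product does not
```

Here the general rule reproduces the joint probability, while multiplying the two unconditional probabilities gives a smaller number, because a total of 12 is impossible unless the first die shows 6.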
An immediate consequence of this is Bayes' theorem:
$$p(a|b) = \frac{p(b|a)\, p(a)}{p(b)}.$$
The following is just as readily established:
$$p(b) = p(b|a)\, p(a) + p(b|\bar{a})\, p(\bar{a}),$$
where $\bar{a}$ happens or is true whenever $a$ does not happen or is false. The generalization to $n$ mutually exclusive and jointly exhaustive possibilities should be obvious.
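Bayes' theorem and the decomposition of the probability of b over a and its complement can both be verified by counting. The following sketch assumes two fair dice and two illustrative events, a ("the first die shows 6") and b ("the total is at least 10"); the helper `p` and its `given` argument are hypothetical names, not notation from the text.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # 36 equiprobable outcomes

def p(event, given=lambda roll: True):
    """p(event | given), obtained by counting equiprobable outcomes."""
    pool = [r for r in rolls if given(r)]
    return Fraction(sum(1 for r in pool if event(r)), len(pool))

def a(roll):      # assumed event a: the first die shows 6
    return roll[0] == 6

def not_a(roll):  # the complement of a
    return not a(roll)

def b(roll):      # assumed event b: the total is at least 10
    return sum(roll) >= 10

# Bayes' theorem: p(a|b) = p(b|a) p(a) / p(b)
bayes = p(b, given=a) * p(a) / p(b)

# Decomposition: p(b) = p(b|a) p(a) + p(b|not a) p(not a)
total = p(b, given=a) * p(a) + p(b, given=not_a) * p(not_a)

print(p(a, given=b), bayes)  # direct count vs. Bayes' theorem
print(p(b), total)           # direct count vs. decomposition
```

In each printed pair the first value comes from direct counting and the second from the formula, so agreement within a pair is the check.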
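Given a random variable, which is a set $\{x_1, \dots, x_N\}$ of random numbers, we may want to know the arithmetic mean
$$\langle x \rangle = \frac{1}{N} \sum_{k=1}^{N} x_k = \frac{x_1 + \cdots + x_N}{N}$$
as well as the standard deviation, which is the root-mean-square deviation from the arithmetic mean,
$$\sigma = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left( x_k - \langle x \rangle \right)^2}.$$
The standard deviation is an important measure of statistical dispersion.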
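As a small worked example, the arithmetic mean and the root-mean-square deviation can be computed in a few lines. The sample values below are made up for illustration, and the function names are assumptions; `statistics.pstdev` from the Python standard library divides by the number of values in the same way and is used only as a cross-check.

```python
import math
import statistics

def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

def std_dev(xs):
    """Root-mean-square deviation from the arithmetic mean."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

# Made-up sample, chosen so that the arithmetic is easy to follow by hand.
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(mean(samples))               # 5.0
print(std_dev(samples))            # 2.0
print(statistics.pstdev(samples))  # library cross-check
```

The squared deviations from the mean 5 add up to 32; dividing by the 8 values gives 4, whose square root is 2.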
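Given $n$ possible measurement outcomes $v_k$ with probabilities $p_k$, we have a probability distribution $\{p_1, \dots, p_n\}$, and we may want to know the expected value of $v$, defined by
$$\langle v \rangle = \sum_{k=1}^{n} p_k v_k,$$
as well as the corresponding standard deviation
$$\Delta v = \sqrt{\sum_{k=1}^{n} p_k \left( v_k - \langle v \rangle \right)^2},$$
which is a handy measure of the fuzziness of $v$.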
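For a concrete distribution, take the cast of a fair die as the "measurement": six outcomes, each with probability 1/6. This is only a sketch under that assumption; the function names `expected_value` and `spread` are illustrative.

```python
import math

def expected_value(values, probs):
    """Probability-weighted sum of the possible outcomes."""
    return sum(p * v for v, p in zip(values, probs))

def spread(values, probs):
    """Standard deviation of the outcomes about the expected value."""
    mu = expected_value(values, probs)
    return math.sqrt(sum(p * (v - mu) ** 2 for v, p in zip(values, probs)))

# Assumed example: a fair die, six outcomes with probability 1/6 each.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

print(expected_value(values, probs))  # close to 3.5
print(spread(values, probs))          # close to sqrt(35/12), about 1.71
```

Note that the expected value 3.5 is not itself a possible outcome; it is the average toward which the arithmetic mean of many casts tends.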
We have defined probability as a numerical measure of likelihood. So what is likelihood? What is probability apart from being a numerical measure? The frequentist definition covers some cases, the epistemic definition covers others, but which definition would cover all cases? It seems that probability is one of those concepts that are intuitively meaningful to us, but — just like time or the experience of purple — cannot be explained in terms of other concepts.