Statistics/Distributions/Poisson

Poisson Distribution

Poisson
Probability mass function
[Figure: plot of the Poisson PMF. The horizontal axis is the index k, the number of occurrences. The function is only defined at integer values of k; the connecting lines are only guides for the eye.]
Cumulative distribution function
[Figure: plot of the Poisson CDF. The horizontal axis is the index k, the number of occurrences. The CDF jumps at integer values of k and is flat everywhere else, because a Poisson-distributed variable only takes on integer values.]
Notation \mathrm{Pois}(\lambda)\,
Parameters λ > 0 (real)
Support k ∈ { 0, 1, 2, 3, ... }
PMF \frac{\lambda^k}{k!}\cdot e^{-\lambda}
CDF \frac{\Gamma(\lfloor k+1\rfloor, \lambda)}{\lfloor k\rfloor !} or, equivalently, e^{-\lambda} \sum_{i=0}^{\lfloor k\rfloor} \frac{\lambda^i}{i!}
(for k\ge 0, where \Gamma(x, y) is the upper incomplete gamma function and \lfloor k\rfloor is the floor function)

Mean \lambda\,\!
Median \approx\lfloor\lambda+1/3-0.02/\lambda\rfloor
Mode \lfloor\lambda\rfloor,\,\lceil\lambda\rceil - 1
Variance \lambda\,\!
Skewness \lambda^{-1/2}\,
Ex. kurtosis \lambda^{-1}\,
Entropy \lambda[1-\log(\lambda)]+e^{-\lambda}\sum_{k=0}^\infty \frac{\lambda^k\log(k!)}{k!}
Entropy (for large \lambda) \frac{1}{2}\log(2 \pi e \lambda) - \frac{1}{12 \lambda} - \frac{1}{24 \lambda^2} - \frac{19}{360 \lambda^3} + O\left(\frac{1}{\lambda^4}\right)

MGF \exp(\lambda (e^{t}-1))\,
CF \exp(\lambda (e^{it}-1))\,
PGF  \exp(\lambda(z - 1))\,

Any French speaker will notice that "Poisson" means "fish", but really there's nothing fishy about this distribution. It's actually pretty straightforward. The name comes from the mathematician Siméon-Denis Poisson (1781-1840).

Like the Binomial Distribution, the Poisson Distribution counts the number of times an event happens. The difference is subtle: whereas the Binomial Distribution counts how many successes we register over a fixed total number of trials, the Poisson Distribution counts how many times a discrete event occurs over a continuous interval of space or time. There isn't a "total" value n. As with the previous sections, let's examine a couple of experiments or questions that might have an underlying Poisson nature.

  • How many pennies will I encounter on my walk home?
  • How many children will be delivered at the hospital today?
  • How many mosquito bites did you get today after having sprayed with insecticide?
  • How many angry phone calls did I get after airing a particularly distasteful political ad?
  • How many products will I sell after airing a new television commercial?
  • How many people, per hour, will cross a picket line into my store?
  • How many alien abduction reports will be filed this year?
  • How many defects will there be per 100 metres of rope sold?

What's a little different about this distribution is that the random variable X, which counts the number of events, can take on any non-negative integer value, with no upper limit. In other words, I could walk home and find no pennies on the street. I could also find one penny. It's also possible (although unlikely, short of an armored car exploding nearby) that I would find 10 or 100 or 10,000 pennies.

Instead of having a parameter p that represents a component probability, as in the Bernoulli and Binomial distributions, this time we have the parameter "lambda" or λ, which represents the average (expected) number of events that happen within our experiment. The probability mass function of the Poisson is given by

P(X=k)=\frac{e^{-\lambda}\lambda^k}{k!}, \quad k = 0, 1, 2, \ldots
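The formula translates almost directly into code. The following is a minimal sketch in Python using only the standard library; the function name poisson_pmf and the test value lam = 2.5 are our own choices for illustration, not part of any particular library.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for a Poisson random variable with mean lam."""
    if k < 0:
        return 0.0  # the support starts at 0
    return math.exp(-lam) * lam**k / math.factorial(k)

# Sanity check: the probabilities over the support should sum to (almost) 1.
# The sum is truncated at k = 49, which is far into the tail for lam = 2.5.
total = sum(poisson_pmf(k, 2.5) for k in range(50))
print(total)
```

Because the factorial in the denominator grows much faster than the power in the numerator, the terms shrink very quickly, so truncating the infinite sum loses almost nothing.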

An example

We run a restaurant and our signature dish (which is very expensive) gets ordered on average 4 times per day. What is the probability of having this dish ordered exactly 3 times tomorrow? If we only have the ingredients to prepare 3 of these dishes, what is the probability that more than 3 are ordered and we'll have to turn some orders away?

The probability of having the dish ordered exactly 3 times is obtained by setting k=3 in the probability mass function. Remember that we've already determined that we sell on average 4 dishes per day, so λ=4.

P(X=3)=\frac{e^{-\lambda}\lambda^k}{k!} = \frac{e^{-4} 4^3}{3!} \approx 0.195

Here's a table of the probabilities for all values from k=0 to k=6 (with λ=4, rounded to four decimal places):

k    P(X=k)
0    0.0183
1    0.0733
2    0.1465
3    0.1954
4    0.1954
5    0.1563
6    0.1042
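The table can be reproduced with a few lines of Python. This is only a sketch: the helper poisson_pmf and the variable names are ours, and λ=4 is taken from the restaurant example.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 4.0  # average number of orders per day, from the example

# Build (k, probability) rows for k = 0..6, rounded to four decimals.
table = [(k, round(poisson_pmf(k, lam), 4)) for k in range(7)]

for k, p in table:
    print(f"{k}    {p:.4f}")
```

Note that k=3 and k=4 get the same probability. This happens whenever λ is a whole number: the values λ-1 and λ are then equally likely, which is why the mode is listed as two values.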

Now for the big question: will we run out of food before the orders stop coming in tomorrow? In other words, what is the probability that the random variable X is greater than 3? To compute this directly we would have to add the probabilities that X=4, X=5, X=6,... all the way to infinity! But wait, there's a better way!

The probability that we run out of food P(X>3) is the same as 1 minus the probability that we don't run out of food, or 1-P(X≤3). So if we total the probability that we sell zero, one, two and three dishes and subtract that from 1, we'll have our answer. So,

1 - P(X≤3) = 1 - ( P(X=0) + P(X=1) + P(X=2) + P(X=3) ) = 1 - 0.4335 = 0.5665

In other words, we have a 56.65% chance of getting more orders for our wonderful signature dish than we can fill. I guess crossing our fingers is in order!
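The complement trick is easy to check numerically. The sketch below assumes the same setup as the example (λ=4, ingredients for 3 dishes); the function and variable names are our own.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k): add up the PMF from 0 through k."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

lam = 4.0   # average orders per day
stock = 3   # dishes we can prepare

p_within_stock = poisson_cdf(stock, lam)  # P(X <= 3)
p_turn_away = 1.0 - p_within_stock        # P(X > 3)

print(f"P(X <= 3) = {p_within_stock:.4f}")
print(f"P(X > 3)  = {p_turn_away:.4f}")
```

Only four terms are ever added, no matter how long the tail of the distribution is, which is the whole point of working with the complement.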

Mean

We calculate the mean as follows:

\operatorname{E}[X] = \sum_i f(x_i) \cdot x_i = \sum^{\infty}_{x=0} \frac{e^{-\lambda}\lambda^x}{x!}x
\operatorname{E}[X] = \frac{e^{-\lambda}\lambda^0}{0!}\cdot 0 + \sum^{\infty}_{x=1} \frac{e^{-\lambda}\lambda^x}{x!}x
\operatorname{E}[X] = 0 + e^{-\lambda} \sum^{\infty}_{x=1} \frac{\lambda \lambda^{x-1}}{(x-1)!}
\operatorname{E}[X] = \lambda e^{-\lambda}\sum^{\infty}_{x=1} \frac{\lambda^{x-1}}{(x-1)!}
\operatorname{E}[X] = \lambda e^{-\lambda}\sum^{\infty}_{x=0} \frac{\lambda^x}{x!}

Remember that e^{\lambda} = \sum^{\infty}_{x=0} \frac{\lambda^x}{x!}

\operatorname{E}[X] = \lambda e^{-\lambda}e^{\lambda}=\lambda
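If the algebra feels abstract, the result can be checked numerically by adding up the first terms of the series. This is a rough sketch, not a proof: the value λ=4 and the cutoff of 100 terms are arbitrary choices of ours, and the names are illustrative.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 4.0
n_terms = 100  # truncation point; the remaining tail is negligible for lam = 4

# E[X] = sum over x of x * P(X = x), truncated after n_terms terms.
mean = sum(x * poisson_pmf(x, lam) for x in range(n_terms))

print(mean)  # should be very close to lam
```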

Variance

We derive the variance using the following formula:

\operatorname{Var}[X] = \operatorname{E}[X^2] - (\operatorname{E}[X])^2

We have already calculated E[X] above, so now we will calculate E[X^2] and then return to this variance formula:

\operatorname{E}[X^2] = \sum_i f(x_i) \cdot x_i^2
\operatorname{E}[X^2] = \sum^{\infty}_{x=0} \frac{e^{-\lambda}\lambda^x}{x!}x^2
\operatorname{E}[X^2] = 0+\sum^{\infty}_{x=1} \frac{e^{-\lambda}\lambda \lambda^{x-1}}{(x-1)!}x
\operatorname{E}[X^2] = \lambda\sum^{\infty}_{x=0} \frac{e^{-\lambda}\lambda^x}{x!}(x+1)
\operatorname{E}[X^2] = \lambda\left[\sum^{\infty}_{x=0} \frac{e^{-\lambda}\lambda^x}{x!}x+\sum^{\infty}_{x=0} \frac{e^{-\lambda}\lambda^x}{x!}\right]

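The first sum is E[X]=λ. The second is the sum of the probability mass function over all possible values, which equals 1; this also follows from the series for e^{\lambda} used in the mean calculation, since e^{-\lambda}e^{\lambda}=1.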

\operatorname{E}[X^2] = \lambda\left[\lambda+1\right]=\lambda^2+\lambda

Returning to the variance formula we find that

\operatorname{Var}[X] = (\lambda^2+\lambda) - (\lambda)^2=\lambda
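As with the mean, a quick numerical check can make the result more believable. The sketch below computes E[X], E[X^2] and the variance from truncated sums; λ=4 and the 100-term cutoff are assumptions made for illustration, and the names are ours.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 4.0
n_terms = 100  # truncation point; the remaining tail is negligible for lam = 4

mean = sum(x * poisson_pmf(x, lam) for x in range(n_terms))               # E[X]
second_moment = sum(x**2 * poisson_pmf(x, lam) for x in range(n_terms))   # E[X^2]
variance = second_moment - mean**2                                        # Var[X]

print(second_moment)  # should be close to lam**2 + lam
print(variance)       # should be close to lam
```

The mean and the variance coming out equal is a hallmark of the Poisson distribution, and it is often used as a first check on whether count data might be Poisson distributed.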

External links