R Programming/Probability Functions/Binomial
The Binomial Distribution
- The sum of N independent Bernoulli trials, all with a common success probability P.
- The number of heads in N tosses of a possibly unfair coin.
- Of N oocysts truly present in a sample of water, the number actually counted, given that each has the same recovery probability.
- This distribution has 2 parameters (N and P). We usually know the number of trials (N), so only one parameter (P) is unknown.
Probability Mass Function
- dbinom(K,N,P), where K is the number of successes, N is the number of trials, and P is the probability of success on each trial.
- dbinom(5,10,0.5) = 0.2460938
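As a quick check on what dbinom computes, the sketch below compares it with the textbook formula choose(N,K) * P^K * (1-P)^(N-K). The variable names (pmf_manual, pmf_builtin, pmf_all) are chosen here for illustration only.

```r
# Binomial PMF written out by hand and compared with dbinom.
N <- 10    # number of trials
P <- 0.5   # success probability on each trial
K <- 5     # number of successes

pmf_manual  <- choose(N, K) * P^K * (1 - P)^(N - K)
pmf_builtin <- dbinom(K, N, P)
print(c(manual = pmf_manual, builtin = pmf_builtin))

# dbinom is vectorised over K, so one call returns the whole PMF;
# the probabilities over K = 0..N should sum to 1.
pmf_all <- dbinom(0:N, N, P)
print(sum(pmf_all))
```

Both values should agree with the 0.2460938 shown above, up to printing precision.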
Distribution Function 
- pbinom(K,N,P) gives the cumulative probability of K or fewer successes in N trials.
- pbinom(5,10,0.5) = 0.6230469
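The relationship between the two functions can be sketched as follows: pbinom at K should equal the sum of dbinom over 0 through K. The variable names are illustrative; lower.tail is a standard argument of pbinom.

```r
# Cumulative probability P(X <= K) computed two ways.
N <- 10
P <- 0.5
K <- 5

cdf_builtin <- pbinom(K, N, P)
cdf_manual  <- sum(dbinom(0:K, N, P))   # add up the PMF from 0 to K
print(c(manual = cdf_manual, builtin = cdf_builtin))

# Upper tail P(X > K), requested directly with lower.tail = FALSE.
upper_tail <- pbinom(K, N, P, lower.tail = FALSE)
print(upper_tail)
```

The lower and upper tails together should add up to 1.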
Generating Random Variables 
- rbinom(12,10,0.5) -> 5 5 7 5 5 6 7 6 6 6 4 7
- hist(rbinom(1000,10,0.5)) --> histogram
- hist(rbinom(1000,10,0.5), breaks = seq(from=-0.5, to=10.5)) will put integer values at bar centers (rather than at the right edge of each bar).
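A minimal sketch of how the simulated draws compare with the theoretical probabilities. The seed value is arbitrary and only makes the run reproducible; the names draws, emp, theo and h are illustrative. The histogram is computed with plot = FALSE so that its bin midpoints can be inspected without opening a graphics device.

```r
set.seed(42)   # arbitrary seed, used only so the draws can be reproduced
N <- 10
P <- 0.5
draws <- rbinom(1000, N, P)

# Empirical relative frequency of each value 0..N next to the exact PMF.
emp  <- tabulate(draws + 1, nbins = N + 1) / length(draws)
theo <- dbinom(0:N, N, P)
print(round(rbind(empirical = emp, theoretical = theo), 3))

# Half-integer breaks make each bin one unit wide and centered on an integer.
h <- hist(draws, breaks = seq(from = -0.5, to = N + 0.5, by = 1), plot = FALSE)
print(h$mids)   # bin midpoints
```

Dropping plot = FALSE draws the histogram instead of only returning its counts.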
Parameter Estimation 
Most of the time, we get to count the number of trials, so that parameter (N) is known. We observe the number of positives (K) and use this information to estimate the unobserved "success" probability (P).
- The sum of M independent binomial(N,P) variables is the same as the sum of M*N Bernoulli trials, so it is distributed as binomial(M*N,P).
- Maximum Likelihood
- P_hat = sum(successes)/sum(trials) = sum(K)/sum(N)
- Normal Approximation
- Exact Confidence Interval
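The three items above can be sketched in a few lines. The data below are made up for illustration (M = 5 samples of N = 20 trials each), and the 95% level is an assumption. The normal approximation shown is the simple Wald interval; the exact interval is taken from binom.test, which R documents as using the Clopper-Pearson procedure.

```r
# Hypothetical data: successes observed in M = 5 samples of 20 trials each.
K <- c(7, 9, 6, 11, 8)      # made-up success counts
N <- rep(20, length(K))     # known number of trials per sample

# Pool the samples: the sum of the K values is binomial(sum(N), P).
x <- sum(K)
n <- sum(N)

# Maximum likelihood estimate of P.
p_hat <- x / n

# Normal approximation (Wald) 95% confidence interval.
z  <- qnorm(0.975)
se <- sqrt(p_hat * (1 - p_hat) / n)
ci_normal <- p_hat + c(-1, 1) * z * se

# Exact (Clopper-Pearson) 95% confidence interval from binom.test.
ci_exact <- binom.test(x, n, conf.level = 0.95)$conf.int

print(p_hat)
print(ci_normal)
print(ci_exact)
```

The Wald interval can behave poorly when n is small or the estimate is close to 0 or 1, which is one reason to compare it with the exact interval.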
But what if the number of trials is not known for the M binomial samples? Can we use the data K[1] through K[M] to estimate both N and P?