4. Probability Distributions: Practical 4
Binomial random variables
The binomial distribution is the distribution that describes the number of successful outcomes from a sequence of independent experiments with two possible outcomes (like flipping a coin, which would give either heads or tails). This distribution has two parameters: 1) the number of experiments that are conducted and 2) the probability to get a successful outcome for a single experiment.
Let's consider two concrete examples.
If you flip a coin times, then the number of times you get 'heads' has a binomial distribution. Coins will have a probability of to give heads. So most times you will just get times 'heads', and occasionally the number of heads will be very low (like or time) or high ( or times).
The exams for some courses are in the form of 'multiple choice' questions. Let's consider the case that you would have to take an exam with MC questions which each has options. In an extreme case where you wouldn't have learned anything and just filled-in your answers randomly. What would then be the chance to pass the exam with correct answers? In this example, the number of correct answers has a binomial distribution and through the binomial distribution, you can calculate the probability of passing the exam by pure guesswork.
A binomial random variable is discrete since it can only take on positive integer values (, , , etc.). Note that this is different from a normally distributed random variable which is continuous and can take on any real value (, , etc.). If a probability distribution is discrete, then the probability density function is called a probability mass function, because it does give actual probabilities to get a specific outcome for the binomial random variable.
The commands in R for the binomial distribution for a random variable are:
dbinom()
- to calculate the probability that you get exactly successes (in experiments, where the theoretical probability of success is )pbinom()
- to calculate the probability that the number of successes is within a given intervalqbinom()
- to calculate threshold values for the number of successes that will be exceeded with a given probability
The common way to specify that a random variable has a binomial distribution, based on experiments, with a theoretical probability of success of is as follows.
The following example illustrates how the distribution (the probability mass function) of a random variable can be plotted.
Create a plot of the probability mass distribution of a binomial distribution .
x <- 0:10Now you can use the
dbinom()
function to calculate the probability mass at each of these values of () for the specified binomial distribution. These will be your y-values.y <- dbinom(x, size=10, prob=0.5)
The last step is the actual plotting. Here, we used the parameter option
type='h'
to specify that vertical bars should be plotted at each of the values for x (like a barplot, but then only with bars at the allowable x-values).plot(x=x, y=y, type='h')

Now, create a plot of the cumulative distribution function for the binomial distributions .
x <- 1:10Now you use the
pbinom()
function to calculate the cumulative probability at each of these values of () for the specified binomial distributions. These will be your y-values.cp <- pbinom(x, size=10, prob=0.5)Once, you have calculated the x- and y-values, you are ready for plotting!
plot(x=x, y=cp, type='s')

In this graph, the parameter option
type='s'
states that the data should be plotted as a 'staircase plot'. The probabilities add-up in a stepwise fashion. In general, for discrete probability distributions, you would need to specify the plotting options type='h'
and type='s'
for the probability mass and cumulative probability plots respectively.