This tutorial explains how to work with the geometric distribution in R using the following functions:
- dgeom: returns the value of the geometric probability mass function.
- pgeom: returns the value of the geometric cumulative distribution function.
- qgeom: returns the value of the inverse geometric cumulative distribution function (the quantile function).
- rgeom: generates a vector of geometrically distributed random values.
Here are some examples of cases where you might use each of these functions.
dgeom
The dgeom function finds the probability of experiencing exactly a given number of failures before the first success in a series of Bernoulli trials, using the following syntax:
dgeom(x, prob)
where:
- x: number of failures before first success
- prob: probability of success on a given trial
Here’s an example of when you might use this function in practice:
A researcher is waiting outside of a library to ask people if they support a certain law. The probability that a given person supports the law is p = 0.2. What is the probability that the fourth person the researcher talks to is the first person to support the law?
dgeom(x=3, prob=.2) #0.1024
The probability that the researcher experiences 3 “failures” before the first success is 0.1024.
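As a quick check, the value above can be reproduced from the geometric probability mass function, P(X = x) = (1 - p)^x * p. The following is a minimal sketch; the variable names and the helper dgeom_trials are hypothetical and are not part of R.

```r
p <- 0.2  # probability of success on a given trial
x <- 3    # number of failures before the first success

# Closed-form PMF: x failures followed by one success
manual  <- (1 - p)^x * p
builtin <- dgeom(x, prob = p)

print(manual)                      # 0.1024
print(all.equal(manual, builtin))  # TRUE

# Hypothetical helper: probability that the first success occurs on trial k.
# dgeom counts failures, so trial k corresponds to k - 1 failures.
dgeom_trials <- function(k, prob) dgeom(k - 1, prob = prob)
print(dgeom_trials(4, prob = p))   # 0.1024
```

The helper makes the shift explicit: "the fourth person is the first supporter" means 3 failures, so x = 3 rather than 4.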
pgeom
The pgeom function finds the probability of experiencing a given number of failures or fewer before the first success in a series of Bernoulli trials, using the following syntax:
pgeom(q, prob)
where:
- q: number of failures before first success
- prob: probability of success on a given trial
Here are a couple of examples of when you might use this function in practice:
A researcher is waiting outside of a library to ask people if they support a certain law. The probability that a given person supports the law is p = 0.2. What is the probability that the researcher experiences 3 or fewer failures before finding someone who supports the law, that is, finds a supporter among the first 4 people she talks to?
pgeom(q=3, prob=.2) #0.5904
The probability that the researcher experiences 3 or fewer failures before finding someone who supports the law is 0.5904.
A researcher is waiting outside of a library to ask people if they support a certain law. The probability that a given person supports the law is p = 0.2. What is the probability that the researcher experiences more than 5 failures before finding someone who supports the law, that is, none of the first 6 people she talks to supports the law?
1 - pgeom(q=5, prob=.2) #0.262144
The probability that the researcher experiences more than 5 failures before finding someone who supports the law is 0.262144.
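Both results above follow from the closed-form cumulative distribution function, P(X <= q) = 1 - (1 - p)^(q + 1), where X counts failures before the first success. The sketch below checks this with illustrative variable names of my own, and also uses pgeom's lower.tail argument as an alternative to subtracting from 1.

```r
p <- 0.2  # probability of success on a given trial

# Lower tail: P(X <= 3), three or fewer failures
cdf_manual  <- 1 - (1 - p)^(3 + 1)
cdf_builtin <- pgeom(3, prob = p)
cdf_summed  <- sum(dgeom(0:3, prob = p))  # CDF as a sum of PMF values

print(cdf_manual)   # 0.5904
print(all.equal(cdf_manual, cdf_builtin))  # TRUE
print(all.equal(cdf_summed, cdf_builtin))  # TRUE

# Upper tail: P(X > 5), more than five failures
upper_manual  <- (1 - p)^(5 + 1)
upper_builtin <- pgeom(5, prob = p, lower.tail = FALSE)

print(upper_manual)  # 0.262144
print(all.equal(upper_manual, upper_builtin))  # TRUE
```

Because q counts failures rather than people, a question phrased as "within n people" corresponds to q = n - 1.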
qgeom
The qgeom function finds the smallest number of failures whose cumulative probability is at least a given percentile, using the following syntax:
qgeom(p, prob)
where:
- p: percentile, expressed as a probability between 0 and 1
- prob: probability of success on a given trial
Here’s an example of when you might use this function in practice:
A researcher is waiting outside of a library to ask people if they support a certain law. The probability that a given person supports the law is p = 0.2. We will consider a “failure” to mean that a person does not support the law. How many “failures” would the researcher need to experience to be at the 90th percentile for number of failures before the first success?
qgeom(p=.90, prob=0.2)
#10
The researcher would need to experience 10 “failures” to be at the 90th percentile for number of failures before the first success.
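One way to see why the answer is 10 is to check the cumulative probabilities on either side of it: qgeom returns the smallest number of failures whose cumulative probability reaches the requested percentile. The following sketch uses variable names chosen for illustration.

```r
p <- 0.2  # probability of success on a given trial

k <- qgeom(0.90, prob = p)
print(k)  # 10

# Cumulative probability just below and at the returned value
below <- pgeom(k - 1, prob = p)  # about 0.8926, still under 0.90
at    <- pgeom(k, prob = p)      # about 0.9141, first value to reach 0.90

print(below < 0.90)  # TRUE
print(at >= 0.90)    # TRUE
```

Because the distribution is discrete, the cumulative probability at the returned value is generally above the requested percentile, not exactly equal to it.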
rgeom
The rgeom function generates a vector of random values that represent the number of failures before the first success, using the following syntax:
rgeom(n, prob)
where:
- n: number of values to generate
- prob: probability of success on a given trial
Here’s an example of when you might use this function in practice:
A researcher is waiting outside of a library to ask people if they support a certain law. The probability that a given person supports the law is p = 0.2. We will consider a “failure” to mean that a person does not support the law. Simulate 10 scenarios for how many “failures” the researcher will experience until she finds someone who supports the law.
set.seed(0) #make this example reproducible
rgeom(n=10, prob=.2)
# 1 2 1 10 7 4 1 7 4 1
The way to interpret this is as follows:
- During the first simulation, the researcher experienced 1 failure before finding someone who supported the law.
- During the second simulation, the researcher experienced 2 failures before finding someone who supported the law.
- During the third simulation, the researcher experienced 1 failure before finding someone who supported the law.
- During the fourth simulation, the researcher experienced 10 failures before finding someone who supported the law.
And so on.
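With a larger number of simulated values, the results can be compared against theory. The sketch below assumes 10,000 draws (an arbitrary choice) and compares the sample mean with the theoretical mean number of failures, (1 - p) / p, and an empirical frequency with the exact value from dgeom. Exact simulated values depend on the seed and R version, so no simulated output is shown.

```r
set.seed(0)  # make this example reproducible
p <- 0.2     # probability of success on a given trial

draws <- rgeom(n = 10000, prob = p)

# Sample mean versus theoretical mean number of failures
sample_mean <- mean(draws)
theory_mean <- (1 - p) / p  # 4
print(c(sample = sample_mean, theory = theory_mean))

# Empirical frequency of exactly 3 failures versus the exact probability
empirical_p3 <- mean(draws == 3)
exact_p3     <- dgeom(3, prob = p)  # 0.1024
print(c(empirical = empirical_p3, exact = exact_p3))
```

The simulated figures should land close to the theoretical ones, and the gap typically shrinks as n grows.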