A categorical distribution is a discrete probability distribution that describes the probability that a random variable will take on a value that belongs to one of K categories, where each category has a probability associated with it.
For a distribution to be classified as a categorical distribution, it must meet the following criteria:
- The categories are discrete.
- There are two or more potential categories.
- The probability that the random variable takes on a value in each category must be between 0 and 1.
- The probabilities for all categories must sum to 1.
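The criteria above can be checked mechanically. The following is a minimal sketch, not a standard library routine: the function name `is_categorical` and the tolerance `tol` are assumptions chosen for illustration, and the "discrete categories" criterion is taken as given because the probabilities are passed as a finite list.

```python
import math

def is_categorical(probs, tol=1e-9):
    """Return True if probs could define a categorical distribution.

    Hypothetical helper for illustration; one probability per category.
    """
    if len(probs) < 2:                        # two or more categories
        return False
    if any(p < 0 or p > 1 for p in probs):    # each probability in [0, 1]
        return False
    # probabilities must sum to 1 (within floating-point tolerance)
    return math.isclose(sum(probs), 1.0, abs_tol=tol)

print(is_categorical([1/6] * 6))   # True
print(is_categorical([0.5, 0.6]))  # False: sums to 1.1
print(is_categorical([1.0]))       # False: only one category
```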
The most obvious example of a categorical distribution is the distribution of outcomes from rolling a fair six-sided die. There are K = 6 potential outcomes, and the probability of each outcome is 1/6.
This distribution satisfies all of the criteria to be classified as a categorical distribution:
- The categories are discrete (the random variable can only take on the values 1, 2, 3, 4, 5, or 6).
- There are two or more potential categories.
- The probability of each category is between 0 and 1.
- The probabilities add up to 1: 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 1.
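To see this distribution in action, the die can be simulated. This is a rough sketch using only Python's standard library; the seed and the number of rolls (60,000) are arbitrary assumptions made so the run is repeatable. With that many rolls, each observed frequency should land close to 1/6 ≈ 0.167.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed only so the sketch is repeatable

faces = [1, 2, 3, 4, 5, 6]
face_probs = [1/6] * 6  # fair die: each face equally likely

# Each roll is a single draw from the categorical distribution
rolls = random.choices(faces, weights=face_probs, k=60_000)
counts = Counter(rolls)

# Print the observed relative frequency of each face
for face in faces:
    print(face, round(counts[face] / len(rolls), 3))
```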
Rule of Thumb:
If you can count the number of outcomes, then you are working with a discrete random variable – e.g. counting the number of times a coin lands on heads.
But if you can measure the outcome, you are working with a continuous random variable – e.g. measuring height, weight, time, etc.
Other Examples of Categorical Distributions
There are plenty of categorical distributions in the real world, including:
Example 1: Flipping a Coin.
When we flip a coin, there are 2 potential discrete outcomes (heads and tails), the probability of each outcome is between 0 and 1, and the probabilities sum to 1. For a fair coin, each outcome has probability 1/2, and 1/2 + 1/2 = 1.
Example 2: Selecting Marbles from an Urn.
Suppose an urn contains 5 red marbles, 3 green marbles, and 2 purple marbles. If we randomly select one marble from the urn, there are 3 potential discrete outcomes (red, green, or purple) with probabilities 5/10, 3/10, and 2/10. Each probability is between 0 and 1, and the probabilities sum to 1: 5/10 + 3/10 + 2/10 = 1.
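The urn probabilities follow directly from the marble counts. The sketch below works them out with exact fractions; the variable names (`urn`, `urn_probs`) are illustrative assumptions.

```python
from fractions import Fraction

# Marble counts from the urn example
urn = {"red": 5, "green": 3, "purple": 2}
total = sum(urn.values())  # 10 marbles in all

# Probability of each color = count / total, kept as an exact fraction
urn_probs = {color: Fraction(count, total) for color, count in urn.items()}

for color, p in urn_probs.items():
    print(color, p)  # red 1/2, green 3/10, purple 1/5

# Exact arithmetic confirms the probabilities sum to 1
print(sum(urn_probs.values()) == 1)  # True
```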
Example 3: Selecting a Card from a Deck.
If we randomly select a card from a standard 52-card deck and record its rank (Ace, 2, 3, ..., King), there are 13 potential discrete outcomes. Each rank appears on 4 of the 52 cards, so each outcome has probability 4/52 = 1/13, which is between 0 and 1, and the 13 probabilities sum to 1.
Relation to Other Distributions
For a distribution to be classified as a categorical distribution, it must have K ≥ 2 potential outcomes and n = 1 trial.
Using this terminology, a categorical distribution is similar to the following distributions:
Bernoulli distribution: K = 2 outcomes, n = 1 trial
Binomial distribution: K = 2 outcomes, n ≥ 1 trials
Multinomial distribution: K ≥ 2 outcomes, n ≥ 1 trials
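The relationships above can be sketched in code by treating a categorical draw as the building block. This is a simplified illustration rather than a reference implementation: the helper names `categorical` and `multinomial`, the seed, and the example probabilities are all assumptions. A Bernoulli draw is a categorical draw with K = 2, and multinomial counts come from repeating the categorical draw n times (the count of one outcome when K = 2 is a binomial draw).

```python
import random
from collections import Counter

def categorical(outcomes, probs):
    """One trial (n = 1) over K outcomes. Hypothetical helper."""
    return random.choices(outcomes, weights=probs, k=1)[0]

def multinomial(outcomes, probs, n):
    """Counts per outcome over n independent categorical trials."""
    draws = Counter(categorical(outcomes, probs) for _ in range(n))
    return {o: draws[o] for o in outcomes}

random.seed(1)  # fixed seed only so the sketch is repeatable
colors = ["red", "green", "purple"]
color_probs = [0.5, 0.3, 0.2]

print(categorical(colors, color_probs))      # categorical: K = 3, n = 1
print(multinomial(colors, color_probs, 10))  # multinomial: K = 3, n = 10
print(categorical([0, 1], [0.5, 0.5]))       # Bernoulli: K = 2, n = 1
# K = 2, n = 10: the count of 1s is a binomial draw
print(multinomial([0, 1], [0.5, 0.5], 10))
```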