
A **frequency table** is a table that displays how often each category or value occurs. This type of table is particularly useful for understanding the distribution of values in a dataset.

This tutorial explains how to create frequency tables in Python.

**One-Way Frequency Table for a Series**

To find the frequencies of individual values in a pandas Series, you can use the **value_counts()** function:

    import pandas as pd

    #define Series
    data = pd.Series([1, 1, 1, 2, 3, 3, 3, 3, 4, 4, 5])

    #find frequencies of each value
    data.value_counts()

    3    4
    1    3
    4    2
    5    1
    2    1

You can add the argument **sort=False** if you don’t want the data values sorted by frequency:

    data.value_counts(sort=False)

    1    3
    2    1
    3    4
    4    2
    5    1

The way to interpret the output is as follows:

- The value “1” occurs **3** times in the Series.
- The value “2” occurs **1** time in the Series.
- The value “3” occurs **4** times in the Series.

And so on.
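If proportions are more useful than raw counts, the same Series can be summarized with the `normalize` argument of `value_counts()`. The following is a minimal sketch that reuses the Series from above; the variable name `proportions` is chosen only for illustration, and `sort_index()` is added so the result is ordered by value rather than by frequency.

```python
import pandas as pd

# same Series as in the example above
data = pd.Series([1, 1, 1, 2, 3, 3, 3, 3, 4, 4, 5])

# relative frequency of each value, ordered by the value itself
proportions = data.value_counts(normalize=True).sort_index()
print(proportions)
```

Each entry is the count of that value divided by the total number of observations (11 here), so the entries sum to 1.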

**One-Way Frequency Table for a DataFrame**

To find the frequencies of the values in a column of a pandas DataFrame, you can use the **crosstab()** function, which uses the following syntax:

**crosstab(index, columns)**

where:

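- **index:** the values to group by in the rows (for example, a DataFrame column)
- **columns:** the values to group by in the columns; passing a single string such as 'count' produces one frequency column with that name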

For example, suppose we have a DataFrame with information about the letter grade, age, and gender of 10 different students in a class. Here’s how to find the frequency for each letter grade:

    #create data
    df = pd.DataFrame({'Grade': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'D', 'D'],
                       'Age': [18, 18, 18, 19, 19, 20, 18, 18, 19, 19],
                       'Gender': ['M', 'M', 'F', 'F', 'F', 'M', 'M', 'F', 'M', 'F']})

    #view data
    df

      Grade  Age Gender
    0     A   18      M
    1     A   18      M
    2     A   18      F
    3     B   19      F
    4     B   19      F
    5     B   20      M
    6     B   18      M
    7     C   18      F
    8     D   19      M
    9     D   19      F

    #find frequency of each letter grade
    pd.crosstab(index=df['Grade'], columns='count')

    col_0  count
    Grade
    A          3
    B          4
    C          1
    D          2

The way to interpret this is as follows:

- **3** students received an ‘A’ in the class.
- **4** students received a ‘B’ in the class.
- **1** student received a ‘C’ in the class.
- **2** students received a ‘D’ in the class.
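As a cross-check, the same counts can be obtained by calling `value_counts()` directly on the DataFrame column. This sketch rebuilds only the Grade column from the example data; the name `grade_counts` is illustrative.

```python
import pandas as pd

# Grade column from the example DataFrame
df = pd.DataFrame({'Grade': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'D', 'D']})

# count each grade, then order the result alphabetically by grade
grade_counts = df['Grade'].value_counts().sort_index()
print(grade_counts)
```

The result is a Series rather than a DataFrame, which can be convenient when only one variable is of interest.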

We can use a similar syntax to find the frequency counts for other columns. For example, here’s how to find frequency by age:

    pd.crosstab(index=df['Age'], columns='count')

    col_0  count
    Age
    18         5
    19         4
    20         1

The way to interpret this is as follows:

- **5** students are 18 years old.
- **4** students are 19 years old.
- **1** student is 20 years old.

You can also easily display the frequencies as proportions of the entire dataset by dividing by the sum:

    #define crosstab
    tab = pd.crosstab(index=df['Age'], columns='count')

    #find proportions
    tab/tab.sum()

    col_0  count
    Age
    18       0.5
    19       0.4
    20       0.1

The way to interpret this is as follows:

- **50%** of students are 18 years old.
- **40%** of students are 19 years old.
- **10%** of students are 20 years old.
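The `crosstab()` function also accepts a `normalize` argument, which should give the same proportions without the manual division. The sketch below rebuilds only the Age column from the example data; the name `prop` is illustrative.

```python
import pandas as pd

# Age column from the example DataFrame
df = pd.DataFrame({'Age': [18, 18, 18, 19, 19, 20, 18, 18, 19, 19]})

# normalize=True divides every cell by the grand total
prop = pd.crosstab(index=df['Age'], columns='count', normalize=True)
print(prop)
```

With a single frequency column, dividing by the grand total is equivalent to dividing by the column sum, so this matches the `tab/tab.sum()` approach.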

**Two-Way Frequency Tables for a DataFrame**

You can also create a two-way frequency table to display the frequencies for two different variables in the dataset. For example, here’s how to create a two-way frequency table for the variables Age and Grade:

    pd.crosstab(index=df['Age'], columns=df['Grade'])

    Grade  A  B  C  D
    Age
    18     3  1  1  0
    19     0  2  0  2
    20     0  1  0  0

The way to interpret this is as follows:

- There are **3** students who are 18 years old and received an ‘A’ in the class.
- There is **1** student who is 18 years old and received a ‘B’ in the class.
- There is **1** student who is 18 years old and received a ‘C’ in the class.
- There are **0** students who are 18 years old and received a ‘D’ in the class.

And so on.
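Two optional arguments of `crosstab()` are often useful with two-way tables: `margins=True` appends row and column totals (labeled "All" by default), and `normalize='index'` converts each row to proportions. The following is a minimal sketch that rebuilds the Age and Grade columns from the example data; the names `with_totals` and `row_props` are illustrative.

```python
import pandas as pd

# Age and Grade columns from the example DataFrame
df = pd.DataFrame({'Grade': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'D', 'D'],
                   'Age': [18, 18, 18, 19, 19, 20, 18, 18, 19, 19]})

# two-way table with an extra "All" row and column holding the totals
with_totals = pd.crosstab(index=df['Age'], columns=df['Grade'], margins=True)
print(with_totals)

# share of each grade within each age group (every row sums to 1)
row_props = pd.crosstab(index=df['Age'], columns=df['Grade'], normalize='index')
print(row_props)
```

In `row_props`, for example, the entry for age 18 and grade ‘A’ is 3 out of the 5 students aged 18.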

*You can find the complete documentation for the crosstab() function in the official pandas documentation.*