There are two common ways to plot the values from two columns in a pandas DataFrame:
Method 1: Plot Two Columns as Points on Scatter Plot
import matplotlib.pyplot as plt
plt.scatter(df['column1'], df['column2'])
Method 2: Plot Two Columns as Lines on Line Chart
df.plot(x='column1', y=['column2', 'column3'])
The following examples show how to use each method in practice.
Example 1: Plot Two Columns on Scatter Plot
Suppose we have the following pandas DataFrame that contains information about various basketball players:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'points': [18, 22, 19, 14, 14, 11, 20, 28],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4]})
#view DataFrame
print(df)
  team  points  assists
0    A      18        5
1    B      22        7
2    C      19        7
3    D      14        9
4    E      14       12
5    F      11        9
6    G      20        9
7    H      28        4
We can use the following code to create a scatter plot that displays the points column on the x-axis and the assists column on the y-axis:
import matplotlib.pyplot as plt
#create scatter plot
plt.scatter(df['points'], df['assists'])
#add axis labels
plt.xlabel('Points')
plt.ylabel('Assists')
In the resulting plot, each point represents one row of the DataFrame: its position along the x-axis is the value from the points column and its position along the y-axis is the value from the assists column.
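The same scatter plot can also be drawn through pandas' own plotting interface instead of calling matplotlib directly. The following is a minimal sketch, assuming the DataFrame from this example; the variable name ax is only illustrative, and in a script you would still call matplotlib.pyplot.show() to display the figure:

```python
import pandas as pd

# recreate the DataFrame from this example
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'points': [18, 22, 19, 14, 14, 11, 20, 28],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4]})

# DataFrame.plot.scatter draws the plot and returns the matplotlib Axes
ax = df.plot.scatter(x='points', y='assists')

# override the default axis labels (pandas uses the column names)
ax.set_xlabel('Points')
ax.set_ylabel('Assists')
```

Working with the returned Axes object keeps all later adjustments (labels, limits, titles) tied to this specific plot rather than to whichever figure happens to be active.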
Example 2: Plot Two Columns on Line Chart
Suppose we have the following pandas DataFrame that contains information about points scored and points allowed by a basketball team in six different games:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'game': [1, 2, 3, 4, 5, 6],
                   'points_for': [99, 94, 92, 90, 87, 85],
                   'points_against': [89, 76, 78, 78, 85, 87]})
#view DataFrame
print(df)
   game  points_for  points_against
0     1          99              89
1     2          94              76
2     3          92              78
3     4          90              78
4     5          87              85
5     6          85              87
We can use the following code to create a line chart that displays the values for points_for on one line and the values for points_against on another line, with the values for game on the x-axis:
#plot points_for and points_against columns on same y-axis
df.plot(x='game', y=['points_for', 'points_against'])
With matplotlib's default colors, the blue line represents the values for the points_for column in each game and the orange line represents the values for the points_against column in each game.
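Because df.plot() returns a matplotlib Axes object, the chart can be labeled and the line colors set explicitly rather than relying on the defaults. The following is a minimal sketch, assuming the DataFrame from this example; the color names, axis labels, and title are illustrative choices only:

```python
import pandas as pd

# recreate the DataFrame from this example
df = pd.DataFrame({'game': [1, 2, 3, 4, 5, 6],
                   'points_for': [99, 94, 92, 90, 87, 85],
                   'points_against': [89, 76, 78, 78, 85, 87]})

# pass one color per column listed in y; df.plot returns the matplotlib Axes
ax = df.plot(x='game', y=['points_for', 'points_against'],
             color=['tab:blue', 'tab:orange'])

# add axis labels and a title
ax.set_xlabel('Game')
ax.set_ylabel('Points')
ax.set_title('Points For vs. Points Against')
```

Setting the colors explicitly means the description of which line is which stays accurate even if the default color cycle has been changed elsewhere.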