
How to Combine Two Columns in Pandas (With Examples)

by Tutor Aspire

You can use the following syntax to combine two text columns into one in a pandas DataFrame:

df['new_column'] = df['column1'] + df['column2']

If one of the columns doesn’t already contain strings, convert it first with the astype(str) method:

df['new_column'] = df['column1'].astype(str) + df['column2']

To combine more than two text columns into one, you can use the following syntax:

df['new_column'] = df[['col1', 'col2', 'col3', ...]].agg(' '.join, axis=1) 

The following examples show how to combine text columns in practice.

Example 1: Combine Two Columns

The following code shows how to combine two text columns into one in a pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['Mavs', 'Lakers', 'Spurs', 'Cavs'],
                   'first': ['Dirk', 'Kobe', 'Tim', 'Lebron'],
                   'last': ['Nowitzki', 'Bryant', 'Duncan', 'James'],
                   'points': [26, 31, 22, 29]})

#combine first and last name columns into a new column, separated by a space
df['full_name'] = df['first'] + ' ' + df['last']

#view resulting DataFrame
df

     team   first      last  points      full_name
0    Mavs    Dirk  Nowitzki      26  Dirk Nowitzki
1  Lakers    Kobe    Bryant      31    Kobe Bryant
2   Spurs     Tim    Duncan      22     Tim Duncan
3    Cavs  Lebron     James      29   Lebron James

We joined the first and last name columns with a space in between, but we could also use a different separator, such as a dash:

#combine first and last name columns into a new column, separated by a dash
df['full_name'] = df['first'] + '-' + df['last']

#view resulting DataFrame
df

     team   first      last  points      full_name
0    Mavs    Dirk  Nowitzki      26  Dirk-Nowitzki
1  Lakers    Kobe    Bryant      31    Kobe-Bryant
2   Spurs     Tim    Duncan      22     Tim-Duncan
3    Cavs  Lebron     James      29   Lebron-James
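As an alternative to the + operator, pandas also provides the Series.str.cat method, which takes the separator as a sep argument. The following is a minimal sketch that reuses the first and last name columns from the example above; the other columns are left out only to keep it short:

```python
import pandas as pd

#create DataFrame with the same name columns as above
df = pd.DataFrame({'first': ['Dirk', 'Kobe', 'Tim', 'Lebron'],
                   'last': ['Nowitzki', 'Bryant', 'Duncan', 'James']})

#join first and last name element-wise, with a dash in between
df['full_name'] = df['first'].str.cat(df['last'], sep='-')

#view resulting DataFrame
print(df)
```

Both approaches should give the same full_name column here; str.cat mainly keeps the separator in one place if you change it later.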

Example 2: Convert to Text & Combine Two Columns

The following code shows how to convert one column to text, then join it to another column:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['Mavs', 'Lakers', 'Spurs', 'Cavs'],
                   'first': ['Dirk', 'Kobe', 'Tim', 'Lebron'],
                   'last': ['Nowitzki', 'Bryant', 'Duncan', 'James'],
                   'points': [26, 31, 22, 29]})

#convert points to text, then join to last name column
df['name_points'] = df['last'] + df['points'].astype(str)

#view resulting DataFrame
df

     team   first      last  points  name_points
0    Mavs    Dirk  Nowitzki      26   Nowitzki26
1  Lakers    Kobe    Bryant      31     Bryant31
2   Spurs     Tim    Duncan      22     Duncan22
3    Cavs  Lebron     James      29      James29
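If you aren't sure whether a column needs converting, one option is to check its dtype before joining. The sketch below assumes a trimmed version of the DataFrame above; the names needs_conversion and points_text are illustrative, and the ': ' separator is added only to make the result easier to read:

```python
import pandas as pd

#create DataFrame with only the two columns being joined
df = pd.DataFrame({'last': ['Nowitzki', 'Bryant', 'Duncan', 'James'],
                   'points': [26, 31, 22, 29]})

#check whether points is numeric (it is an integer column here)
needs_conversion = pd.api.types.is_numeric_dtype(df['points'])

#convert to text only if needed, then join to last name column
points_text = df['points'].astype(str) if needs_conversion else df['points']
df['name_points'] = df['last'] + ': ' + points_text

#view resulting DataFrame
print(df)
```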

Example 3: Combine More Than Two Columns

The following code shows how to join multiple columns into one column:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['Mavs', 'Lakers', 'Spurs', 'Cavs'],
                   'first': ['Dirk', 'Kobe', 'Tim', 'Lebron'],
                   'last': ['Nowitzki', 'Bryant', 'Duncan', 'James'],
                   'points': [26, 31, 22, 29]})

#join team, first name, and last name into one column
df['team_and_name'] = df[['team', 'first', 'last']].agg(' '.join, axis=1)

#view resulting DataFrame
df

     team   first      last  points       team_and_name
0    Mavs    Dirk  Nowitzki      26  Mavs Dirk Nowitzki
1  Lakers    Kobe    Bryant      31  Lakers Kobe Bryant
2   Spurs     Tim    Duncan      22    Spurs Tim Duncan
3    Cavs  Lebron     James      29   Cavs Lebron James
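Note that ' '.join only works on strings, so this approach assumes every selected column holds text. If one of the columns is numeric, such as points, a possible workaround is to apply astype(str) to the selected columns before joining. The sketch below uses a hypothetical column name, summary, for the result:

```python
import pandas as pd

#create DataFrame with two text columns and one numeric column
df = pd.DataFrame({'team': ['Mavs', 'Lakers', 'Spurs', 'Cavs'],
                   'last': ['Nowitzki', 'Bryant', 'Duncan', 'James'],
                   'points': [26, 31, 22, 29]})

#convert the selected columns to text, then join each row with a space
df['summary'] = df[['team', 'last', 'points']].astype(str).agg(' '.join, axis=1)

#view resulting DataFrame
print(df)
```

The original points column keeps its numeric type, since astype(str) is applied only to the temporary selection.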

Additional Resources

Pandas: How to Find the Difference Between Two Columns
Pandas: How to Find the Difference Between Two Rows
Pandas: How to Sort Columns by Name
