*90*

You can use the following methods to sort the rows of a pandas DataFrame based on the values in a particular string column:

**Method 1: Sort by String Column (when column only contains characters)**

df = df.sort_values('my_string_column')

**Method 2: Sort by String Column (when column contains characters and digits)**

#create 'sort' column that contains digits from 'my_string_column' df['sort'] = df['my_string_column'].str.extract('(d+)', expand=False).astype(int) #sort rows based on digits in 'sort' column df = df.sort_values('sort')

The following examples show how to use each method in practice.

**Example 1: Sort by String Column (when column only contains characters)**

Suppose we have the following pandas DataFrame that contains information about the sales of various products at some grocery store:

import pandas as pd #create DataFrame df = pd.DataFrame({'product': ['Apples', 'Oranges', 'Bananas', 'Lettuce', 'Beans'], 'sales': [18, 22, 19, 14, 29]}) #view DataFrame print(df) product sales 0 Apples 18 1 Oranges 22 2 Bananas 19 3 Lettuce 14 4 Beans 29

We can use the following syntax to sort the rows of the DataFrame based on the strings in the **product** column:

#sort rows from A to Z based on string in 'product' column df = df.sort_values('product') #view updated DataFrame print(df) product sales 0 Apples 18 2 Bananas 19 4 Beans 29 3 Lettuce 14 1 Oranges 22

Notice that the rows are now sorted from A to Z based on the strings in the **product** column.

If youâ€™d like to instead sort from Z to A, simply add the argument **ascending=False**:

#sort rows from Z to A based on string in 'product' column df = df.sort_values('product', ascending=False) #view updated DataFrame print(df) product sales 1 Oranges 22 3 Lettuce 14 4 Beans 29 2 Bananas 19 0 Apples 18

Notice that the rows are now sorted from Z to A based on the strings in the **product** column.

**Example 2: Sort by String Column (when column contains characters ***and* digits)

*and*digits)

Suppose we have the following pandas DataFrame that contains information about the sales of various products at some grocery store:

import pandas as pd #create DataFrame df = pd.DataFrame({'product': ['A3', 'A5', 'A22', 'A50', 'A2', 'A7', 'A9', 'A13'], 'sales': [18, 22, 19, 14, 14, 11, 20, 28]}) #view DataFrame print(df) product sales 0 A3 18 1 A5 22 2 A22 19 3 A50 14 4 A2 14 5 A7 11 6 A9 20 7 A13 28

Notice that the strings in the **product** column contain both characters and digits.

If we attempt to sort the rows of the DataFrame using the values in the **product** column, the strings will not be sorted in the correct order based on the digits:

import pandas as pd #sort rows based on strings in 'product' column df = df.sort_values('product') #view updated DataFrame print(df) product sales 7 A13 28 4 A2 14 2 A22 19 0 A3 18 1 A5 22 3 A50 14 5 A7 11 6 A9 20

Instead, we must create a new temporary column called **sort** that contains only the digits from the product column, then sort by the values in the **sort** column, then drop the column entirely:

import pandas as pd #create new 'sort' column that contains digits from 'product' column df['sort'] = df['product'].str.extract('(d+)', expand=False).astype(int) #sort rows based on digits in 'sort' column df = df.sort_values('sort') #drop 'sort' column df = df.drop('sort', axis=1) #view updated DataFrame print(df) product sales 4 A2 14 0 A3 18 1 A5 22 5 A7 11 6 A9 20 7 A13 28 2 A22 19 3 A50 14

Notice that the rows are now sorted by the strings in the **product** column and the digits are sorted in the correct order.

**Note**: You can find the complete documentation for the pandas **sort_values()** function here.

**Additional Resources**

The following tutorials explain how to perform other common tasks in pandas:

Pandas: How to Sort by Date

Pandas: How to Sort Columns by Name

Pandas: How to Sort by Both Index and Column