How to Fix: No module named ‘sklearn.cross_validation’

by Tutor Aspire January 17, 2023

One error you may encounter when using Python is:

ModuleNotFoundError: No module named 'sklearn.cross_validation'

This error usually occurs when you attempt to import the train_test_split function from sklearn using the following line:

from sklearn.cross_validation import train_test_split

However, the cross_validation sub-module was deprecated in scikit-learn 0.18 and removed in version 0.20. Its functions now live in the model_selection sub-module, so you need to use the following line instead:

from sklearn.model_selection import train_test_split
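If the same script also has to run in an older environment that still uses a scikit-learn release from before 0.18, one option is to try the new location first and fall back to the old one. The sketch below is only an illustration: import_first_available is a hypothetical helper name and is not part of scikit-learn.

```python
import importlib


def import_first_available(*module_names):
    """Return the first module in module_names that imports successfully.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError, so a missing
            # sub-module lands here and the next candidate is tried
            continue
    raise ImportError(f"none of these modules could be imported: {module_names}")


# usage (assumes scikit-learn is installed):
# splitter = import_first_available("sklearn.model_selection",
#                                   "sklearn.cross_validation")
# train_test_split = splitter.train_test_split
```

For most projects the plain model_selection import is enough. A fallback like this is only worth the extra code when you cannot control which scikit-learn version is installed.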

The following example shows how to resolve this error in practice.

How to Reproduce the Error

Suppose we would like to use the train_test_split function from sklearn to split a pandas DataFrame into training and testing sets.

We attempt to import the function with the following line:

from sklearn.cross_validation import train_test_split

ModuleNotFoundError: No module named 'sklearn.cross_validation'

We receive an error because the cross_validation sub-module no longer exists in current versions of scikit-learn, so Python cannot find it when we try to import the train_test_split function.
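To confirm that the installed version is the cause, it can help to check which scikit-learn release is present. The cross_validation sub-module was removed in version 0.20, so any release at or above that will raise this error. The following is a minimal sketch using only the standard library (Python 3.8 or later); installed_version is a hypothetical helper name chosen for this example.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "scikit-learn" is the distribution name on PyPI; "sklearn" is the import name
version = installed_version("scikit-learn")
if version is None:
    print("scikit-learn is not installed in this environment")
else:
    print(f"scikit-learn {version} is installed")
```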

How to Fix the Error

To fix this error, we simply need to use the model_selection sub-module instead:

from sklearn.model_selection import train_test_split

This time we don’t receive any error.

We can then use the train_test_split function to split a pandas DataFrame into training and testing sets:

from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np

#make this example reproducible
np.random.seed(1)

#create DataFrame with 1000 rows and 3 columns
df = pd.DataFrame({'x1': np.random.randint(30, size=1000),
                   'x2': np.random.randint(12, size=1000),
                   'y': np.random.randint(2, size=1000)})

#split original DataFrame into training and testing sets
train, test = train_test_split(df, test_size=0.2, random_state=0)

#view first few rows of each set
print(train.head())

     x1  x2  y
687  16   2  0
500  18   2  1
332   4  10  1
979   2   8  1
817  11   1  0

print(test.head())

     x1  x2  y
993  22   1  1
859  27   6  0
298  27   8  1
553  20   6  0
672   9   2  1

The train_test_split function now runs without any error. Because we used test_size=0.2 on 1,000 rows, the training set contains 800 rows and the testing set contains 200.
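In practice, train_test_split is often called with the predictors and the response passed separately. The sketch below assumes, purely for illustration, that x1 and x2 are the predictors and y is the response in the DataFrame from the example above.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# rebuild the same example DataFrame used above
np.random.seed(1)
df = pd.DataFrame({'x1': np.random.randint(30, size=1000),
                   'x2': np.random.randint(12, size=1000),
                   'y': np.random.randint(2, size=1000)})

# treat x1 and x2 as predictors and y as the response (an assumption for illustration)
X = df[['x1', 'x2']]
y = df['y']

# passing two inputs returns four outputs, in this order
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# show the number of rows and columns in each piece
print(X_train.shape, X_test.shape)
print(y_train.shape, y_test.shape)
```

Both inputs are shuffled with the same permutation, so the row indices of X_train and y_train stay aligned.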

Additional Resources

The following tutorials explain how to fix other common errors in Python:

How to Fix: columns overlap but no suffix specified
How to Fix: ‘numpy.ndarray’ object has no attribute ‘append’
How to Fix: if using all scalar values, you must pass an index
How to Fix: ValueError: cannot convert float NaN to integer
