How to Calculate Levenshtein Distance in Python

by Tutor Aspire

The Levenshtein distance between two strings is the minimum number of single-character edits required to turn one string into the other.

The word “edits” includes substitutions, insertions, and deletions.

For example, suppose we have the following two words:

  • PARTY
  • PARK

The Levenshtein distance between the two words (i.e. the number of edits we have to make to turn one word into the other) would be 2:

  • Substitute T with K: PARTY → PARKY
  • Delete Y: PARKY → PARK
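To make the definition concrete, here is a minimal pure-Python sketch of the standard dynamic-programming approach (often called the Wagner-Fischer algorithm). The function name levenshtein and its parameter names are chosen for illustration only; this is not how the python-Levenshtein module is implemented internally.

```python
def levenshtein(source, target):
    """Return the Levenshtein distance between two strings."""
    # previous[j] = distance between the characters of source processed
    # so far (initially none) and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # Turning the first i characters of source into "" takes i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


print(levenshtein("PARTY", "PARK"))  # 2
```

The sketch keeps only two rows of the distance table at a time, so memory use grows with the length of the second string rather than with the product of both lengths.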

In practice, the Levenshtein distance is used in many different applications including approximate string matching, spell-checking, and natural language processing.

This tutorial explains how to calculate the Levenshtein distance between strings in Python by using the python-Levenshtein module.

You can use the following command to install this module:

pip install python-Levenshtein

You can then load the function to calculate the Levenshtein distance:

from Levenshtein import distance as lev

The following examples show how to use this function in practice.

Example 1: Levenshtein Distance Between Two Strings

The following code shows how to calculate the Levenshtein distance between the two strings “party” and “park”:

#calculate Levenshtein distance
lev('party', 'park')

2

The Levenshtein distance turns out to be 2.

Example 2: Levenshtein Distance Between Two Arrays

The following code shows how to calculate the Levenshtein distance between each pair of corresponding strings (first with first, second with second, and so on) in two different lists:

#define arrays
a = ['Mavs', 'Spurs', 'Lakers', 'Cavs']
b = ['Rockets', 'Pacers', 'Warriors', 'Celtics']

#calculate Levenshtein distance between two arrays
for i,k in zip(a, b):
  print(lev(i, k))

6
4
5
5

The way to interpret the output is as follows:

  • The Levenshtein distance between ‘Mavs’ and ‘Rockets’ is 6.
  • The Levenshtein distance between ‘Spurs’ and ‘Pacers’ is 4.
  • The Levenshtein distance between ‘Lakers’ and ‘Warriors’ is 5.
  • The Levenshtein distance between ‘Cavs’ and ‘Celtics’ is 5.
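Note that zip only pairs elements that share the same position. If you need the distance for every combination of one string from each list, one possible approach is itertools.product, sketched below. The second list is taken from the team names in the interpretation above, the name pairwise is hypothetical, and the fallback lev function is an illustrative stand-in that is used only when the python-Levenshtein module is not installed.

```python
from itertools import product

try:
    from Levenshtein import distance as lev
except ImportError:
    # Illustrative fallback (assumption: same result as the library
    # function for plain strings), using two-row dynamic programming.
    def lev(s, t):
        prev = list(range(len(t) + 1))
        for i, sc in enumerate(s, 1):
            cur = [i]
            for j, tc in enumerate(t, 1):
                cur.append(min(prev[j - 1] + (sc != tc),  # substitution
                               cur[j - 1] + 1,            # insertion
                               prev[j] + 1))              # deletion
            prev = cur
        return prev[-1]

a = ['Mavs', 'Spurs', 'Lakers', 'Cavs']
b = ['Rockets', 'Pacers', 'Warriors', 'Celtics']

# product(a, b) yields all 16 (x, y) combinations; zip(a, b) yields only 4
pairwise = {(x, y): lev(x, y) for x, y in product(a, b)}

for (x, y), d in pairwise.items():
    print(x, y, d)
```

Storing the results in a dictionary keyed by the string pair makes it easy to look up any single combination afterwards, for example pairwise[('Mavs', 'Celtics')].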

Additional Resources

How to Calculate Hamming Distance in Python
How to Calculate Euclidean Distance in Python
How to Calculate Mahalanobis Distance in Python
