One error you may encounter in R is:
Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)
This error occurs when you attempt to perform k-means clustering in R but the data frame you’re using contains one or more missing (NA) values.
This tutorial shares exactly how to fix this error.
How to Reproduce the Error
Suppose we have the following data frame in R with a missing value in the second row:
#create data frame
df <- data.frame(var1=c(2, 4, 4, 6, 7, 8, 8, 9, 9, 12),
                 var2=c(12, 14, 14, 8, 8, 15, 16, 9, 9, 11),
                 var3=c(22, NA, 23, 24, 28, 23, 19, 16, 12, 15))

#assign row names A through J
row.names(df) <- LETTERS[1:10]

#view data frame
df

  var1 var2 var3
A    2   12   22
B    4   14   NA
C    4   14   23
D    6    8   24
E    7    8   28
F    8   15   23
G    8   16   19
H    9    9   16
I    9    9   12
J   12   11   15
If we attempt to use the kmeans() function to perform k-means clustering on this data frame, we’ll receive an error:
#attempt to perform k-means clustering with k = 3 clusters
km <- kmeans(df, centers = 3)

Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)
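Before fixing the error, it can help to confirm where the missing values actually are. The following is a minimal sketch that rebuilds the same data frame and counts the NA values per column; the names na_per_column and rows_with_na are illustrative only and do not come from the original example.

```r
# Recreate the example data frame (same values as above)
df <- data.frame(var1 = c(2, 4, 4, 6, 7, 8, 8, 9, 9, 12),
                 var2 = c(12, 14, 14, 8, 8, 15, 16, 9, 9, 11),
                 var3 = c(22, NA, 23, 24, 28, 23, 19, 16, 12, 15))
row.names(df) <- LETTERS[1:10]

# Count missing values in each column
na_per_column <- colSums(is.na(df))
na_per_column

# Names of the rows that contain at least one missing value
rows_with_na <- row.names(df)[!complete.cases(df)]
rows_with_na
```

For this data frame, the only missing value is in the var3 column of row B.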
How to Fix the Error
The easiest way to fix this error is to simply use the na.omit() function to remove rows with missing values from the data frame:
#remove rows with NA values
df <- na.omit(df)

#perform k-means clustering with k = 3 clusters
km <- kmeans(df, centers = 3)

#view results
km

K-means clustering with 3 clusters of sizes 4, 3, 2

Cluster means:
  var1      var2     var3
1  5.5 14.250000 21.75000
2 10.0  9.666667 14.33333
3  6.5  8.000000 26.00000

Clustering vector:
A C D E F G H I J 
1 1 3 3 1 1 2 2 2 

Within cluster sum of squares by cluster:
[1] 46.50000 17.33333  8.50000
 (between_SS / total_SS =  79.5 %)

Available components:

[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"         "iter"         "ifault"
Notice that the k-means clustering algorithm runs successfully once we remove the rows with missing values from the data frame. Row B, which contained the NA value, no longer appears in the clustering vector.
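The error message also names NaN and Inf. The na.omit() function drops rows with NA or NaN values but leaves Inf in place, so a data frame with infinite values may need a different filter. The following is a rough sketch of one way to handle that, assuming a hypothetical data frame df2 that is not part of the original example; it keeps only rows whose values are all finite and sets a seed because kmeans() chooses its starting centers at random.

```r
# Hypothetical data frame containing both an NA and an Inf value
df2 <- data.frame(x = c(1, 2, NA, 4, 5, 6, 20, 21),
                  y = c(3, Inf, 5, 6, 7, 8, 30, 31))

# Keep only rows in which every value is finite (drops NA, NaN, Inf, -Inf)
all_finite <- apply(df2, 1, function(row) all(is.finite(row)))
df2_clean <- df2[all_finite, ]

# Fix the seed so the random starting centers are reproducible
set.seed(1)
km2 <- kmeans(df2_clean, centers = 2)

# Number of observations assigned to each cluster
km2$size
```

Here the two rows containing NA and Inf are removed, leaving six rows for clustering. Which cluster is labeled 1 or 2 depends on the random start, so the labels can differ between runs unless a seed is set.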
Bonus: A complete step-by-step guide to k-means clustering in R
Additional Resources
How to Fix in R: NAs Introduced by Coercion
How to Fix in R: Subscript out of bounds
How to Fix in R: longer object length is not a multiple of shorter object length