Link Search Menu Expand Document Documentation Menu

kmeans (Deprecated)

The kmeans command is deprecated in favor of the ml command.

The kmeans command applies the k-means algorithm in the ML Commons plugin on the search results returned by a PPL command.

To use the kmeans command, plugins.calcite.enabled must be set to false.

Syntax

The kmeans command has the following syntax:

kmeans <centroids> <iterations> <distance_type>

Parameters

The kmeans command supports the following parameters.

Parameter Required/Optional Description
<centroids> Optional The number of clusters to group data points into. Default is 2.
<iterations> Optional The number of iterations. Default is 10.
<distance_type> Optional The distance type. Valid values are COSINE, L1, and EUCLIDEAN. Default is EUCLIDEAN.

Example: Clustering of the Iris dataset

The following query classifies three Iris species (Iris setosa, Iris virginica, and Iris versicolor) based on the combination of four features measured from each sample (the lengths and widths of sepals and petals):

source=iris_data
| fields sepal_length_in_cm, sepal_width_in_cm, petal_length_in_cm, petal_width_in_cm
| kmeans centroids=3

The query returns the following results:

sepal_length_in_cm sepal_width_in_cm petal_length_in_cm petal_width_in_cm ClusterID
5.1 3.5 1.4 0.2 1
5.6 3.0 4.1 1.3 0
6.7 2.5 5.8 1.8 2
350 characters left

Have a question? .

Want to contribute? or .