Creates a k-means model. Note: A trained k-means model does not have any parameters.

This operation is ported from Spark ML.

For a comprehensive introduction, see the Spark documentation.

For Scaladoc details, see the org.apache.spark.ml.clustering.KMeans documentation.

Since: Seahorse 1.0.0


This operation does not take any input.


Port  Type Qualifier  Description
0     Estimator       An Estimator that can be used in a Fit operation.


Name               Type                  Description
k                  Numeric               The number of clusters to create.
max iterations     Numeric               The maximum number of iterations.
seed               Numeric               The random seed.
tolerance          Numeric               The convergence tolerance for iterative algorithms.
init mode          SingleChoice          The initialization algorithm mode. This can be either "random" to choose random points as initial cluster centers, or "k-means||" to use a parallel variant of k-means++. Possible values: ["random", "k-means||"]
init steps         Numeric               The number of steps for the k-means|| initialization mode. It will be ignored when other initialization modes are chosen.
features column    SingleColumnSelector  The features column for model fitting.
prediction column  String                The prediction column created during model scoring.
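
To make the roles of the parameters above more concrete, the following is a minimal pure-Python sketch of k-means (Lloyd's algorithm). It is not Spark's or Seahorse's implementation. The function names kmeans and predict are hypothetical, only the "random" init mode is shown (so init steps has no counterpart here), and it assumes that tolerance bounds how far any cluster center may move between iterations before the loop stops. The input points play the role of the features column, and the index returned by predict plays the role of the prediction column.

```python
import math
import random


def kmeans(points, k, max_iterations=20, seed=0, tolerance=1e-4):
    """Illustrative k-means with "random" initialization (not Spark's code)."""
    rng = random.Random(seed)  # seed: makes the initial choice reproducible
    # "random" init mode: pick k distinct input points as the initial centers.
    centers = [tuple(p) for p in rng.sample(points, k)]

    for _ in range(max_iterations):  # max iterations: hard upper bound
        # Assignment step: group each point with its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)

        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for center, members in zip(centers, clusters):
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(center)  # empty cluster: keep center as is

        # tolerance (assumed meaning): stop once no center moved farther
        # than this distance.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tolerance:
            break

    return centers


def predict(centers, point):
    """Return the index of the nearest center for one feature vector."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers = sorted(kmeans(points, k=2, seed=42))
print(centers)  # [(0.0, 0.5), (10.0, 10.5)]
```

On this tiny, well-separated data set every random initialization converges to the same two centers, so the seed only affects the order in which they are returned. On real data, different seeds can lead to different clusterings.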