Cluster analysis

Cluster analysis, or clustering, is a way of organizing data by splitting it into groups of similar data points. These groups are called clusters.

There are many algorithms for putting data into clusters. Different algorithms can measure the similarity between data points in different ways.^[1] As a result, different clustering algorithms can produce different clusters from the same data.
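The short Python sketch below shows how the choice of similarity measure can change the clusters. It is only an illustration, not a standard implementation: the five sample points, the helper names (`euclidean`, `cosine_distance`, `single_linkage`), and the choice of a simple single-linkage merging method are all assumptions made for this example.

```python
import math
from itertools import combinations

# Hypothetical sample data: three points on the diagonal, two on the x-axis.
points = [(1, 1), (2, 2), (10, 10), (1, 0), (2, 0)]

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.dist(a, b)

def cosine_distance(a, b):
    """1 minus the cosine of the angle between two points seen as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))

def single_linkage(data, k, distance):
    """Merge the two closest clusters again and again until k clusters remain.

    The gap between two clusters is the smallest distance between
    any point in one and any point in the other (single linkage).
    """
    clusters = [[p] for p in data]  # start with every point alone
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(
                distance(a, b)
                for a in clusters[ij[0]]
                for b in clusters[ij[1]]
            ),
        )
        clusters[i].extend(clusters[j])  # i < j, so deleting j is safe
        del clusters[j]
    return {frozenset(c) for c in clusters}

by_euclidean = single_linkage(points, 2, euclidean)
by_cosine = single_linkage(points, 2, cosine_distance)

print("Euclidean:", by_euclidean)
print("Cosine:   ", by_cosine)
```

With these sample points, Euclidean distance leaves (10, 10) in a cluster of its own because it is far from everything else. Cosine distance puts (10, 10) together with (1, 1) and (2, 2) because all three point in the same direction from the origin.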

References

[1] Estivill-Castro, Vladimir (June 2002). "Why so many clustering algorithms: a position paper". ACM SIGKDD Explorations Newsletter. 4 (1): 65–75. doi:10.1145/568574.568575.

Further reading

Ezugwu, Absalom E.; Ikotun, Abiodun M.; Oyelade, Olaide O.; Abualigah, Laith; Agushaka, Jeffery O.; Eke, Christopher I.; Akinyelu, Andronicus A. (1 April 2022). "A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects". Engineering Applications of Artificial Intelligence. 110: 104743. doi:10.1016/j.engappai.2022.104743.


