### Application of Principal Component Analysis for Variable Reduction in the K-Means Clustering Algorithm

DOI: https://doi.org/10.37905/jjps.v5i1.18733

Copyright (c) 2024 Jambura Journal of Probability and Statistics

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
