Comparison of the KNN, Naive Bayes, and Binomial Logistic Regression Methods in Classifying the Economic Status of Countries

N. K. Kutha Ardana, Ruhiyat Ruhiyat, Nurfatimah Amany, Teofilus Kevin Irawan, Raymond Raymond, Rizalius Karunia, Syifa Fauzia

Abstract


Classifying a country's economic status as developed or developing often involves factors such as life expectancy and its underlying variables. This research compares the performance of three machine learning algorithms, namely KNN (K-Nearest Neighbors), naive Bayes, and binomial logistic regression, on that classification task. The data used in this study is the "Life Expectancy (WHO) Fixed" dataset, obtained from the Kaggle website. Principal Component Analysis (PCA) was first applied to 16 predictor variables and yielded three principal components that together explain 71.41% of the variance. These components were then used as inputs to the KNN, naive Bayes, and binomial logistic regression methods, which produced F1-scores of 100%, 98.19%, and 97.36%, respectively.
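The workflow described in the abstract (reduce 16 predictors to three principal components, then compare the three classifiers by F1-score) can be sketched as follows. This is a minimal illustration with scikit-learn, not the authors' implementation: the synthetic data stands in for the Kaggle dataset, and the helper name `compare_classifiers`, the 70/30 split, the choice of k = 5, the Gaussian variant of naive Bayes, and the standardization step are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_classifiers(X, y, n_components=3, seed=0):
    """Fit PCA + classifier pipelines and return the test F1-score of each.

    Hypothetical helper: split ratio and hyperparameters are assumptions.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    models = {
        "KNN": KNeighborsClassifier(n_neighbors=5),  # k = 5 is assumed
        "Naive Bayes": GaussianNB(),  # Gaussian variant is assumed
        "Binomial logistic regression": LogisticRegression(max_iter=1000),
    }
    scores = {}
    for name, model in models.items():
        # Standardize, project onto the leading principal components, classify.
        # Fitting the scaler and PCA inside the pipeline keeps test data
        # out of the preprocessing step.
        pipeline = make_pipeline(
            StandardScaler(), PCA(n_components=n_components), model
        )
        pipeline.fit(X_train, y_train)
        scores[name] = f1_score(y_test, pipeline.predict(X_test))
    return scores


# Synthetic stand-in for the real data: 16 predictors, imbalanced binary label.
X, y = make_classification(
    n_samples=500,
    n_features=16,
    n_informative=6,
    weights=[0.8, 0.2],
    random_state=0,
)
scores = compare_classifiers(X, y)
for name, score in scores.items():
    print(f"{name}: F1 = {score:.4f}")
```

The scores printed here come from synthetic data and say nothing about the figures reported in the abstract; the sketch only shows how the three methods can share the same PCA-reduced inputs so that their F1-scores are comparable.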

Keywords


Life Expectancy; Countries Classification; KNN; Naive Bayes; Binomial Logistic Regression






DOI: https://doi.org/10.34312/jjom.v5i2.21103



Copyright (c) 2023 N. K. Kutha Ardana, Ruhiyat Ruhiyat, Nurfatimah Amany, Teofilus Kevin Irawan, Raymond Raymond, Rizalius Karunia, Syifa Fauzia

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.





Jambura Journal of Mathematics (e-ISSN: 2656-1344) by Department of Mathematics Universitas Negeri Gorontalo is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.


Editorial Office


Department of Mathematics, Faculty of Mathematics and Natural Science, Universitas Negeri Gorontalo
Jl. Prof. Dr. Ing. B. J. Habibie, Moutong, Tilongkabila, Kabupaten Bone Bolango, Gorontalo, Indonesia
Email: info.jjom@ung.ac.id.