Comparison of Random Forest, XGBoost, and LightGBM Methods for the Human Development Index Classification

Yunna Mentari Indah, Rafika Aristawidya, Anwar Fitrianto, Erfiani Erfiani, L.M. Risman Dwi Jumansyah

Abstract


Machine learning classification categorizes data based on learned patterns, which makes it useful for analyzing the Human Development Index (HDI) in Indonesia. HDI is a key indicator of regional development progress, so classifying HDI categories at the regency/city level supports targeted development planning. This study compares the performance of three ensemble-based classification methods (Random Forest, XGBoost, and LightGBM) in classifying HDI categories in Indonesia. The analysis used 2023 data from the Central Bureau of Statistics (BPS), comprising 514 observations across nine variables. The same models were used to identify the variables that most influence HDI. LightGBM outperformed both Random Forest and XGBoost, achieving an accuracy of 0.937 without outlier handling and 0.944 with outlier handling. Per capita expenditure was identified as the most influential factor in predicting HDI. These findings show how ensemble methods can improve classification accuracy and provide insights for data-driven policymaking, regional development planning, and future HDI-related research.
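As a rough illustration of the two quantities the abstract reports, the sketch below computes classification accuracy and applies one common form of outlier handling. It is not the authors' pipeline: the abstract does not say which outlier treatment was used, so the IQR-based capping here is an assumption, and the function names, labels, and values are hypothetical.

```python
from statistics import quantiles


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def cap_outliers_iqr(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: IQR capping stands in for the unspecified
    outlier handling mentioned in the abstract.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Hypothetical HDI category labels for eight regions (not BPS data).
y_true = ["high", "medium", "high", "very high", "low", "medium", "high", "medium"]
y_pred = ["high", "medium", "medium", "very high", "low", "medium", "high", "high"]
acc = accuracy(y_true, y_pred)  # 6 of 8 correct -> 0.75

# Hypothetical per capita expenditure values with one extreme observation.
expenditure = [9.1, 10.4, 11.0, 11.8, 12.3, 12.9, 13.5, 25.0]
capped = cap_outliers_iqr(expenditure)
```

In a comparison like the one described, the same accuracy function would be applied to each model's predictions twice, once on the raw predictors and once on the treated predictors, so that the two scores are directly comparable.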

Keywords


Classification; Random Forest; XGBoost; LightGBM; Human Development Index





DOI: https://doi.org/10.37905/jjom.v7i1.28290



Copyright (c) 2025 Yunna Mentari Indah, Rafika Aristawidya, Anwar Fitrianto, Erfiani Erfiani, L.M. Risman Dwi Jumansyah

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.


Jambura Journal of Mathematics (e-ISSN: 2656-1344) by Department of Mathematics Universitas Negeri Gorontalo is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.


Editorial Office


Department of Mathematics, Faculty of Mathematics and Natural Science, Universitas Negeri Gorontalo
Jl. Prof. Dr. Ing. B. J. Habibie, Moutong, Tilongkabila, Kabupaten Bone Bolango, Gorontalo, Indonesia
Email: info.jjom@ung.ac.id.

