Synthetic Minority Oversampling Technique in Logit and Probit Models of Educated Unemployment Status

Fatimah Fatimah, Anwar Fitrianto, Indahwati Indahwati, Erfiani Erfiani, Khusnia Nurul Khikmah


Educated unemployment is caused by a misalignment between educational development planning and employment development, resulting in underemployed graduates from various educational institutions. Unemployment data in DKI Jakarta are class-imbalanced. Imbalanced data is a serious problem in modeling because it can cause prediction errors that affect the accuracy of the resulting model. Using the Synthetic Minority Oversampling Technique (SMOTE) to handle imbalanced data is expected to increase the model's accuracy. This study aims to find the best model for identifying the factors that influence educated unemployment status, using logit and probit models and handling imbalanced data with SMOTE. The results showed that the independent variables that affect educated unemployment status are the same in the logit and probit models: age group and participation in training. The independent variables that affect educated unemployment status in the logit and probit models with SMOTE are also the same: age group, marital status, and participation in training. Handling imbalanced data with SMOTE increases the balanced accuracy significantly: the logit and probit models with SMOTE have higher balanced accuracy than the logit and probit models without SMOTE. The logit model with SMOTE is the best model because it has the highest balanced accuracy of all the models compared. According to the logit model with SMOTE, the educated unemployed in DKI Jakarta are young and have never married. The government needs to play a role in improving the quality of educational institutions so that they produce graduates who meet company qualifications and can be hired by employers. People who have attended training, despite having a higher education, may still be unemployed, so the training provided has not been able to reduce the unemployment rate. The government should therefore provide training that improves entrepreneurship skills while also providing capital in the form of business loans to reduce educated unemployment.
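The two technical ideas in the abstract, SMOTE oversampling and balanced accuracy, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `smote` and `balanced_accuracy`, the neighbour count `k`, and the toy data are all assumptions made for illustration, and the features are treated as numeric, whereas the study's predictors (age group, marital status, training) are categorical and would need a suitable variant.

```python
import numpy as np

def smote(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows (illustrative sketch).

    Each synthetic row lies on the segment between a randomly chosen
    minority row and one of its k nearest minority neighbours.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)
    # Pairwise squared Euclidean distances among minority rows.
    d = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)          # a row is not its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)
    pick = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))         # interpolation weight in [0, 1)
    return X_min[base] + gap * (X_min[pick] - X_min[base])

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels 0/1."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    sensitivity = np.mean(y_pred[y_true == 1] == 1)
    specificity = np.mean(y_pred[y_true == 0] == 0)
    return (sensitivity + specificity) / 2

# Toy imbalanced data (hypothetical): 95 majority rows, 5 minority rows.
rng = np.random.default_rng(42)
X_major = rng.normal(0.0, 1.0, size=(95, 2))
X_minor = rng.normal(2.0, 1.0, size=(5, 2))
y = np.array([0] * 95 + [1] * 5)

# Oversample the minority class until both classes have 95 rows.
X_syn = smote(X_minor, n_new=90, k=3)
X_bal = np.vstack([X_major, X_minor, X_syn])
y_bal = np.array([0] * 95 + [1] * 95)

# A classifier that always predicts the majority class looks accurate
# on the imbalanced data, but balanced accuracy exposes it.
always_major = np.zeros(100, dtype=int)
acc = np.mean(always_major == y)
bal = balanced_accuracy(y, always_major)
print(acc, bal)
```

In this sketch the always-majority predictor scores high plain accuracy but only chance-level balanced accuracy, which is why balanced accuracy is the more informative criterion when classes are unequal. In a workflow like the one the abstract describes, oversampling would normally be applied to the training split only, with the logit or probit model then fitted on the balanced data.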


SMOTE; Logit; Probit; Educated Unemployment






Copyright (c) 2023 Fatimah, Anwar Fitrianto, Indahwati, Erfiani, Khusnia Nurul Khikmah

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.


Jambura Journal of Mathematics (e-ISSN: 2656-1344) by Department of Mathematics Universitas Negeri Gorontalo is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
