Ensemble Approach to Sentiment Analysis of Google Play Store App Reviews

Yasin Aril Mustofa, Irma Surya Kumala Idris

Abstract


Sentiment analysis of Google Play Store app reviews has become an important way to understand public opinion on technology products. This study evaluates the effectiveness of ensemble approaches to sentiment analysis compared with individual classification algorithms. The ensemble techniques are Random Forest and Boosting; the individual algorithms are Naive Bayes and Support Vector Machine (SVM). Before classification, the data pass through extensive preprocessing: cleaning, case folding, tokenization, stopword removal, and normalization. The results show that ensemble models, particularly Random Forest, achieve superior performance in sentiment classification of app reviews, with accuracy reaching 94.15% for Zoom app reviews and 80.69% for Shopee app reviews. This performance indicates that ensemble approaches handle the complexity and variability of review data more effectively than individual algorithms. The study gives application developers insight into how to improve their products based on user feedback. There is still room for improvement, however, in optimizing algorithms for highly imbalanced data and in developing methods that can handle more complex language nuances. Recommendations for future research include using Deep Learning techniques and cross-domain testing to assess how well these models perform in other sentiment analysis settings.
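
The preprocessing steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `preprocess`, the tiny `STOPWORDS` set, and the `SLANG_MAP` normalization dictionary are all assumptions made for the example, and the English tokens stand in for whatever language resources the study actually used.

```python
import re

# Hypothetical resources for illustration only; the abstract does not give
# the study's actual stopword list or normalization dictionary.
STOPWORDS = {"the", "is", "a", "and", "to", "this", "it"}
SLANG_MAP = {"gr8": "great", "luv": "love", "u": "you", "thx": "thanks"}


def preprocess(review):
    """Return a list of cleaned tokens, following the step order in the abstract."""
    # 1. Cleaning: remove URLs, then replace every character that is not an
    #    ASCII letter, digit, or whitespace with a space (an assumption; this
    #    would also drop accented letters).
    text = re.sub(r"https?://\S+", " ", review)
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)
    # 2. Case folding: lowercase everything.
    text = text.lower()
    # 3. Tokenization: split on whitespace.
    tokens = text.split()
    # 4. Stopword removal: drop tokens found in the stopword set.
    tokens = [t for t in tokens if t not in STOPWORDS]
    # 5. Normalization: map informal spellings to a standard form.
    return [SLANG_MAP.get(t, t) for t in tokens]


if __name__ == "__main__":
    print(preprocess("This app is GR8!!! Luv it, thx https://example.com"))
    # prints ['app', 'great', 'love', 'thanks']
```

The resulting token lists would then typically be turned into numeric features before being passed to classifiers such as Random Forest, Naive Bayes, or SVM; the abstract does not specify which feature representation was used.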






DOI: https://doi.org/10.37905/jjeee.v6i2.25184



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Published by:
Electrical Engineering Department
Faculty of Engineering
State University of Gorontalo
Jenderal Sudirman Street No.6, Gorontalo City, Gorontalo Province, Indonesia
Telp. 0435-821175; 081340032063
Email: redaksijjeee@ung.ac.id/redaksijjeee@gmail.com
