Performance and Energy Efficiency Analysis of Distributed k-means and Gaussian Mixture Model on Single Board Computer and Personal Computer Clusters with Apache Spark

Deffin Purnama Noer, Muhaza Liebenlito, Taufik Edy Sutanto

Abstract


This study evaluates the performance and energy efficiency of distributed unsupervised learning algorithms on two types of clusters, Single Board Computers (SBC) and Personal Computers (PC), using Apache Spark. Two algorithms, k-means and Gaussian Mixture Model (GMM), were run across varying dataset sizes and numbers of processor cores to observe scalability. The results show that the PC cluster consistently achieved faster execution times, particularly for k-means on large datasets. The SBC cluster, in contrast, was more energy efficient in all scenarios, with energy savings of up to 93% for k-means and 86% for GMM relative to the highest-consumption PC configuration. These findings affirm the potential of SBCs as a low-power, cost-efficient option for green or sustainable computing, particularly for learning, academic experimentation, and small-scale edge computing development, and they are relevant to sustainability efforts through their contribution to the Sustainable Development Goals (SDGs).
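The trade-off described above, where a slower cluster can still consume less energy, can be illustrated with a short calculation. The sketch below assumes energy is estimated as average power multiplied by runtime and that savings are expressed relative to a baseline configuration; the abstract does not state the exact measurement procedure, and every number and function name here is hypothetical rather than taken from the study.

```python
def energy_joules(avg_power_watts: float, runtime_seconds: float) -> float:
    """Estimate energy as average power times runtime (E = P * t)."""
    return avg_power_watts * runtime_seconds


def savings_percent(energy: float, baseline_energy: float) -> float:
    """Relative energy saving against a baseline, in percent."""
    if baseline_energy <= 0:
        raise ValueError("baseline_energy must be positive")
    return 100.0 * (1.0 - energy / baseline_energy)


# Hypothetical measurements, NOT results from the study:
# the PC cluster finishes 4x faster but draws 20x more power.
pc_energy = energy_joules(avg_power_watts=200.0, runtime_seconds=100.0)
sbc_energy = energy_joules(avg_power_watts=10.0, runtime_seconds=400.0)

saving = savings_percent(sbc_energy, pc_energy)
print(f"SBC saving vs. PC baseline: {saving:.1f}%")
```

With these made-up figures the slower SBC run still uses a fraction of the PC's energy, which is the kind of comparison the reported percentages express.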

Keywords


Apache Spark; energy efficiency; Gaussian Mixture Model; k-means; Single Board Computer



DOI: https://doi.org/10.37905/jjom.v8i1.35198



Copyright (c) 2026 Deffin Purnama Noer, Muhaza Liebenlito, Taufik Edy Sutanto

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.



Jambura Journal of Mathematics (e-ISSN: 2656-1344) by Department of Mathematics Universitas Negeri Gorontalo is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

