Development of a Diabetes Mellitus Prediction Model Using the Stochastic Gradient Boosting Method
DOI:
https://doi.org/10.22441/format.2025.v14.i1.002
Keywords:
Data Mining, Diabetes Mellitus, Prediction Model, Machine Learning, Stochastic Gradient Boosting
Abstract
Diabetes mellitus is a global health problem whose prevalence continues to rise. It places a heavy burden on economies and healthcare systems because it often leads to severe complications such as cardiovascular disease and kidney failure. Early prediction and detection are therefore crucial to limiting its adverse effects. Data mining and machine learning make it possible to process complex medical data, extract deeper insights, and support data-driven decision-making. This study develops a diabetes mellitus prediction model using the Stochastic Gradient Boosting (SGB) algorithm. The model is trained on a dataset of clinical variables, including glucose level, blood pressure, body mass index (BMI), and genetic history, to identify diabetes risk. The model performs well across three train/test splitting ratios: 70:30, 80:20, and 90:10. It achieved its highest accuracy, 95.50%, at the 70:30 ratio, with an AUC (Area Under the Curve) of 0.9862, showing that it separates the positive (diabetes) and negative (non-diabetes) classes effectively. At the 80:20 and 90:10 ratios, the model achieved accuracies of 92.75% and 92.31%, with AUC values of 0.9767 and 0.9777, respectively, indicating consistent performance. The high accuracy is attributed to the iterative boosting approach of the SGB algorithm, in which each iteration corrects the prediction errors left by the previous ones. In addition, regularization mechanisms such as the learning rate and subsampling help prevent overfitting, making the algorithm effective for datasets with complex patterns.
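The boosting loop, learning rate, subsampling, and the accuracy/AUC evaluation described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the data are synthetic (the two features named glucose and bmi are hypothetical stand-ins), the weak learner is a one-split regression stump fitted to the logistic-loss residuals, and the function names and hyperparameter values (`fit_sgb`, `n_rounds=30`, `learning_rate=0.1`, `subsample=0.7`) are assumptions chosen only for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_synthetic(n=200, seed=1):
    """Synthetic two-feature data; NOT the dataset used in the study."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        glucose, bmi = rng.gauss(120, 30), rng.gauss(30, 6)
        risk = 0.04 * (glucose - 120) + 0.1 * (bmi - 30) + rng.gauss(0, 0.5)
        X.append([glucose, bmi])
        y.append(1 if risk > 0 else 0)
    return X, y

def fit_stump(X, residuals, rows):
    """Least-squares regression stump fitted on the sampled rows only."""
    mean = sum(residuals[i] for i in rows) / len(rows)
    best = (float("inf"), 0, 0.0, mean, mean)  # fallback: constant prediction
    for j in range(len(X[0])):
        values = sorted({X[i][j] for i in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [residuals[i] for i in rows if X[i][j] <= t]
            right = [residuals[i] for i in rows if X[i][j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]

def fit_sgb(X, y, n_rounds=30, learning_rate=0.1, subsample=0.7, seed=0):
    rng = random.Random(seed)
    n = len(X)
    p = sum(y) / n
    f0 = math.log(p / (1 - p))           # initial score: log-odds of the base rate
    scores = [f0] * n
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the logistic loss: the current prediction error.
        residuals = [y[i] - sigmoid(scores[i]) for i in range(n)]
        # "Stochastic" part: each round sees a random subset drawn without replacement.
        rows = rng.sample(range(n), max(2, int(subsample * n)))
        j, t, lm, rm = fit_stump(X, residuals, rows)
        stumps.append((j, t, lm, rm))
        # Shrinkage: only a fraction (learning_rate) of the correction is applied.
        for i in range(n):
            scores[i] += learning_rate * (lm if X[i][j] <= t else rm)
    return f0, learning_rate, stumps

def predict_proba(model, x):
    f0, lr, stumps = model
    return sigmoid(f0 + sum(lr * (lm if x[j] <= t else rm) for j, t, lm, rm in stumps))

def auc(y_true, probs):
    """Probability that a random positive is ranked above a random negative."""
    pos = [p for p, t in zip(probs, y_true) if t == 1]
    neg = [p for p, t in zip(probs, y_true) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def evaluate(train_fraction, X, y):
    cut = int(train_fraction * len(X))
    model = fit_sgb(X[:cut], y[:cut])
    probs = [predict_proba(model, x) for x in X[cut:]]
    y_test = y[cut:]
    acc = sum((p >= 0.5) == (t == 1) for p, t in zip(probs, y_test)) / len(y_test)
    return acc, auc(y_test, probs)

if __name__ == "__main__":
    X, y = make_synthetic()
    for fraction in (0.7, 0.8, 0.9):     # the 70:30, 80:20, and 90:10 splits
        acc, area = evaluate(fraction, X, y)
        print(f"train fraction {fraction:.1f}: accuracy={acc:.3f}, AUC={area:.3f}")
```

The numbers this sketch prints come from synthetic data and are not comparable to the results reported above. In practice a library implementation would typically be used instead, for example a gradient boosting classifier configured with a subsampling fraction below 1.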
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).