Development of a Diabetes Mellitus Prediction Model Using the Stochastic Gradient Boosting Method

Andrian Sah, Chaeroen Niesa, Amat Damuri, Nur Amalia Hasma

Abstract


Diabetes mellitus is a global health problem whose prevalence continues to rise. Because the disease often leads to severe complications such as cardiovascular disease and kidney failure, it places a substantial economic burden on healthcare systems. Early prediction and detection are therefore crucial to limiting its adverse effects. Data mining and machine learning offer ways to process complex medical data, extract deeper insights, and support data-driven decision-making. This study aims to develop a diabetes mellitus prediction model using the Stochastic Gradient Boosting (SGB) algorithm. The model uses a dataset of clinical variables, including glucose level, blood pressure, body mass index (BMI), and genetic history, to identify diabetes risk. The results indicate that the model performs well across three train-test split ratios: 70:30, 80:20, and 90:10. It achieved its highest accuracy, 95.50%, at the 70:30 ratio, with an AUC (Area Under the Curve) of 0.9862, showing that it separates the positive (diabetes) and negative (non-diabetes) classes effectively. At the 80:20 and 90:10 ratios, it achieved accuracies of 92.75% and 92.31%, with AUC values of 0.9767 and 0.9777, respectively, indicating consistent performance. The high accuracy is attributed to the iterative boosting approach of the SGB algorithm, which corrects the remaining prediction errors at each iteration. In addition, regularization mechanisms such as the learning rate and subsampling help prevent overfitting, making the algorithm effective for datasets with complex patterns.
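The mechanism described in the abstract (each iteration fits a weak learner to the current prediction errors, scaled by a learning rate and trained on a random subsample) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins rather than the Kaggle dataset, the weak learners are one-split stumps, leaf values are plain residual means, and all function names and hyperparameter values are assumptions chosen for illustration. In practice a library implementation such as scikit-learn's GradientBoostingClassifier with subsample below 1.0 would normally be used.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals, rows):
    """Least-squares one-split regression tree fitted to the residuals of `rows`."""
    mean_r = sum(residuals[i] for i in rows) / len(rows)
    best = (float("inf"), 0, float("inf"), mean_r, mean_r)  # fallback: no split
    for j in range(len(X[0])):
        values = sorted(X[i][j] for i in rows)
        step = max(1, len(values) // 10)  # try roughly ten thresholds per feature
        for t in values[step::step]:
            left = [residuals[i] for i in rows if X[i][j] < t]
            right = [residuals[i] for i in rows if X[i][j] >= t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]


def fit_sgb(X, y, n_rounds=30, learning_rate=0.3, subsample=0.7, seed=0):
    """Boosting on log-loss: every round fits a stump to the current errors."""
    rng = random.Random(seed)
    n = len(X)
    pos = sum(y) / n
    f0 = math.log(pos / (1 - pos))  # initial score: log-odds of the positive class
    scores, stumps = [f0] * n, []
    for _ in range(n_rounds):
        # Negative gradient of log-loss = label minus predicted probability.
        residuals = [y[i] - sigmoid(scores[i]) for i in range(n)]
        # The "stochastic" part: each stump sees only a random subset of rows.
        rows = rng.sample(range(n), max(2, int(subsample * n)))
        j, t, lm, rm = fit_stump(X, residuals, rows)
        stumps.append((j, t, lm, rm))
        for i in range(n):  # shrink the correction by the learning rate
            scores[i] += learning_rate * (lm if X[i][j] < t else rm)
    return f0, learning_rate, stumps


def predict_proba(model, row):
    f0, lr, stumps = model
    return sigmoid(f0 + sum(lr * (lm if row[j] < t else rm) for j, t, lm, rm in stumps))


def auc(y_true, probs):
    """Share of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [p for p, t in zip(probs, y_true) if t == 1]
    neg = [p for p, t in zip(probs, y_true) if t == 0]
    return sum((p > q) + 0.5 * (p == q) for p in pos for q in neg) / (len(pos) * len(neg))


def make_synthetic_data(n=300, seed=1):
    """Synthetic stand-in data; NOT the dataset used in the study."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        glucose, bmi, pressure = rng.gauss(120, 30), rng.gauss(30, 6), rng.gauss(70, 12)
        signal = 0.05 * (glucose - 120) + 0.1 * (bmi - 30) + rng.gauss(0, 0.5)
        X.append([glucose, bmi, pressure])
        y.append(1 if signal > 0 else 0)
    return X, y


X, y = make_synthetic_data()
order = list(range(len(X)))
random.Random(0).shuffle(order)
results = {}
for train_share in (0.7, 0.8, 0.9):  # the 70:30, 80:20, and 90:10 splits
    cut = round(train_share * len(order))
    train, test = order[:cut], order[cut:]
    model = fit_sgb([X[i] for i in train], [y[i] for i in train])
    probs = [predict_proba(model, X[i]) for i in test]
    labels = [y[i] for i in test]
    accuracy = sum((p >= 0.5) == (t == 1) for p, t in zip(probs, labels)) / len(labels)
    results[train_share] = (accuracy, auc(labels, probs))
    print(f"{train_share:.0%} train: accuracy={accuracy:.3f}, AUC={results[train_share][1]:.3f}")
```

The numbers this sketch prints come from synthetic data and are not comparable with the accuracies and AUC values reported in the abstract. Lowering learning_rate or subsample makes each correction smaller or noisier, which is the regularizing effect the abstract refers to.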

Keywords


Data Mining; Diabetes Mellitus; Prediction Model; Machine Learning; Stochastic Gradient Boosting

DOI: http://dx.doi.org/10.22441/format.2025.v14.i1.002

Copyright (c) 2025 Format : Jurnal Ilmiah Teknik Informatika

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Format : Jurnal Ilmiah Teknik Informatika
Fakultas Ilmu Komputer Universitas Mercu Buana
Jl. Raya Meruya Selatan, Kembangan, Jakarta 11650
Tel./Fax: +62215840816
http://publikasi.mercubuana.ac.id/index.php/format

p-ISSN: 2089-5615
e-ISSN: 2722-7162

Creative Commons License
This work is distributed under a Creative Commons Attribution-NonCommercial 4.0 International License.