Algoritma Random Forest dan Synthetic Minority Oversampling Technique (SMOTE) untuk Deteksi Diabetes

Authors

  • Nurussakinah Nurussakinah, UIN Maulana Malik Ibrahim Malang
  • Muhammad Faisal, UIN Maulana Malik Ibrahim Malang
  • Irwan Budi Santoso, UIN Maulana Malik Ibrahim Malang

DOI:

https://doi.org/10.14421/jiska.2025.10.2.221-234

Keywords:

Detection, Diabetes, Random Forest, Synthetic Minority Oversampling Technique, Ensemble

Abstract

Diabetes is a major global health challenge, and Indonesia ranks fifth in the world among countries with the highest rates of diabetes. This study applies the Random Forest algorithm to diabetes detection, with the aim of providing accurate and efficient results for the early diagnosis of diabetic patients. The data are secondary data, the "Diabetes Dataset", which contains 952 records with 17 features. Testing follows three data-split scenarios: scenario 1 uses a 90%:10% ratio, scenario 2 a 70%:30% ratio, and scenario 3 a 50%:50% ratio. Each scenario is run both with and without SMOTE. The best performance is obtained in scenario 1 with SMOTE: 97% accuracy, 100% precision, 94% recall, and a 97% F1-score.
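The oversampling step and the reported F1-score can be illustrated with a minimal sketch. It is not the authors' implementation: it does not train a Random Forest, the toy minority points are made up for illustration, and the function names `smote` and `f1_score` and the parameters `n_new`, `k`, and `seed` are assumptions introduced here. The first function shows the core idea of SMOTE, interpolating between a minority-class sample and one of its nearest minority-class neighbours; the second checks that the precision and recall stated in the abstract are consistent with the stated F1-score.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical two-feature minority-class samples (illustration only).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
synthetic = smote(minority, n_new=6, k=2)

# Precision and recall as reported for scenario 1 with SMOTE.
f1 = f1_score(1.00, 0.94)
print(round(f1, 2))  # 0.97
```

Because every synthetic point lies on a segment between two existing minority samples, the oversampled class stays inside the region already covered by the minority data. Oversampling of this kind is normally applied to the training portion only, so that the test portion still reflects the original class balance; the abstract does not state how the authors handled this.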

Published

2025-05-31

How to Cite

Nurussakinah, N., Faisal, M., & Santoso, I. B. (2025). Algoritma Random Forest dan Synthetic Minority Oversampling Technique (SMOTE) untuk Deteksi Diabetes. JISKA (Jurnal Informatika Sunan Kalijaga), 10(2), 221–234. https://doi.org/10.14421/jiska.2025.10.2.221-234

Section

Articles