Random Forest Algorithm and Synthetic Minority Oversampling Technique (SMOTE) for Diabetes Detection
DOI:
https://doi.org/10.14421/jiska.2025.10.2.221-234
Keywords:
Detection, Diabetes, Random Forest, Synthetic Minority Oversampling Technique, Ensemble
Abstract
Diabetes is a major global health challenge, and Indonesia ranks fifth in the world among countries with the highest rate of diabetes. This study applies the Random Forest algorithm to diabetes detection, with the aim of providing accurate and efficient results for the early diagnosis of diabetic patients. The data are secondary data from the "Diabetes Dataset", which contains 952 records and 17 features. Testing covers three data-split scenarios: scenario 1 with a 90%:10% ratio, scenario 2 with a 70%:30% ratio, and scenario 3 with a 50%:50% ratio. In each scenario, results with and without SMOTE are compared. The best performance is obtained in scenario 1 with SMOTE, which yields 97% accuracy, 100% precision, 94% recall, and a 97% F1-score.
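As an illustration of the oversampling idea named in the abstract, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the authors' implementation: the function names `smote` and `f1_score`, the neighbour count `k`, the seed, and the toy data are all assumptions made for this example. The last lines check that the reported precision and recall are consistent with the reported F1-score.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples, each interpolated between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base sample, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy minority class with two features (not the study's data).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, n_new=6, k=2)
print(len(new_points))  # 6

# Consistency check on the reported metrics: precision 100%, recall 94%.
print(round(f1_score(1.00, 0.94), 2))  # 0.97
```

Each synthetic point lies on the line segment between two existing minority samples, so the new samples stay inside the region already occupied by the minority class. In practice, oversampling is usually applied to the training portion only, so that evaluation data contain no synthetic records; the abstract does not state how this was handled in the study.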
License
Copyright (c) 2025 Nurussakinah Nurussakinah, Muhammad Faisal, Irwan Budi Santoso

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Authors who publish with this journal agree to the following terms as stated in http://creativecommons.org/licenses/by-nc/4.0
a. Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
b. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
c. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.