Performance Comparison of Naïve Bayes and Random Forest in Detecting Fake News

William William; Teny Handhayani

doi:10.14421/jiska.2025.10.2.137-144

Authors

William William, Universitas Tarumanagara
Teny Handhayani, Universitas Tarumanagara

Keywords:

Random Forest, Naive Bayes Algorithm, Text Classification, Fake News Detection, Machine Learning

Abstract

Fake news has become a serious problem in the digital era. Its spread can cause misinformation, social unrest, and economic losses. This study compares the performance of the Naïve Bayes and Random Forest classification methods in detecting fake news. Both methods were evaluated on a public news dataset of 44,898 samples obtained from the Kaggle repository. Each sample is described by four features: title, news content, subject, and news date. The data then undergo cleaning, stemming, tokenization, and feature extraction. The results indicate that Random Forest outperforms Naïve Bayes, with an accuracy of 99% compared with 96%. Overall, the study shows that Random Forest is a viable option for detecting fake news.
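
The preprocessing and classification steps named in the abstract can be illustrated with a minimal sketch. The abstract does not state which Naïve Bayes variant or feature extraction method the authors used, so the multinomial model over word counts, the toy suffix-stripping stemmer, and the four example headlines below are all assumptions made for illustration; none of them is the study's implementation or its Kaggle data.

```python
import math
import re
from collections import Counter, defaultdict


def preprocess(text):
    """Clean, tokenize, and crudely stem a text (illustrative only)."""
    # Cleaning: lowercase and replace everything except letters and spaces.
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    # Tokenization: split on whitespace.
    tokens = text.split()
    # Toy stemming (assumption): strip one common English suffix per token.
    stemmed = []
    for tok in tokens:
        for suffix in ("ing", "ed", "s"):
            if tok.endswith(suffix) and len(tok) > len(suffix) + 2:
                tok = tok[: -len(suffix)]
                break
        stemmed.append(tok)
    return stemmed


class MultinomialNB:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            # Feature extraction: bag-of-words counts per class.
            self.word_counts[label].update(preprocess(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        tokens = [t for t in preprocess(doc) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy headlines, not taken from the dataset used in the study.
train_docs = [
    "Senate passes budget bill after lengthy debate",
    "Government announces new trade agreement with partners",
    "Shocking secret cure doctors do not want you to know",
    "Celebrity clone spotted, shocking photos leaked",
]
train_labels = ["real", "real", "fake", "fake"]

model = MultinomialNB().fit(train_docs, train_labels)
fake_guess = model.predict("Shocking leaked photos reveal secret")
real_guess = model.predict("Senate debate on budget bill")
print(fake_guess, real_guess)
```

In practice a comparison like the one reported would more likely rely on an established library, for example scikit-learn's `MultinomialNB` and `RandomForestClassifier`, together with a proper stemmer and a held-out test split for measuring accuracy.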
