COMPARATIVE EVALUATION OF MACHINE LEARNING MODEL PERFORMANCE FOR DIABETES PREDICTION: A CASE STUDY OF XGBOOST, RANDOM FOREST, AND SVM

  • Adil Setiawan, Universitas Potensi Utama
  • Adelina, Universitas Potensi Utama
  • Damri Mulia Hutabalian, Universitas Potensi Utama
  • Ricky Irnanda, Universitas Potensi Utama
  • Heru Fredi, Universitas Potensi Utama
  • Iswanto, Universitas Potensi Utama

Abstract

This study evaluates and compares three widely used machine learning (ML) models, XGBoost, Random Forest, and Support Vector Machine (SVM), for diabetes risk prediction on the Pima Indians Diabetes Dataset. The problem addressed is the need for accurate early detection to mitigate serious complications such as cardiovascular disease and kidney failure. The models were trained and evaluated on a pre-processed version of the dataset using accuracy, precision, recall, F1-score, and the area under the ROC curve (AUC). Random Forest performed best, with the highest accuracy (0.76) and AUC (0.82). Confusion matrix analysis also showed that Random Forest was the strongest at detecting positive (diabetic) cases, which is critical in a medical context. Glucose and BMI were identified as the most important predictive features across the models. The key finding is that Random Forest is the most effective and stable of the three models, offering the best discriminative ability for clinical decision support in early diabetes risk prediction.
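
As a rough illustration of the evaluation metrics named in the abstract, the sketch below computes accuracy, precision, recall, and F1-score from confusion-matrix counts, and AUC as the fraction of positive/negative pairs that a classifier ranks correctly. It is not the authors' code: the function names, the 0.5 decision threshold, and the eight toy labels and scores are assumptions made up for illustration and have no connection to the Pima Indians Diabetes Dataset or to the results reported here.

```python
from __future__ import annotations

from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 marks the positive (diabetic) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, precision, recall, and F1-score derived from the confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs where the positive scores higher.

    Tied scores count as half a correct ranking.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): true labels and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.3]

# Assumed decision threshold of 0.5 to turn probabilities into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Recall is the metric tied to the abstract's point about detecting positive cases: a false negative here corresponds to a diabetic patient the model misses, so two models with similar accuracy can still differ in clinical usefulness.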

Published
2024-12-31
How to Cite
Setiawan, A., Adelina, Mulia Hutabalian, D., Irnanda, R., Fredi, H., & Iswanto. (2024). EVALUASI PERBANDINGAN KINERJA MODEL MACHINE LEARNING UNTUK PREDIKSI DIABETES: STUDI KASUS XGBOOST, RANDOM FOREST, DAN SVM. INFOKOM (Informatika & Komputer), 12(2), 105-115. https://doi.org/10.56689/infokom.v12i2.2350