International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Issue 57
Published: November 2025
Authors: Md. Iqbal Hossain, Najila Alam Porno
DOI: 10.5120/ijca2025925995
Md. Iqbal Hossain, Najila Alam Porno. Comprehensive Benchmarking of several Machine Learning and Bayesian Models for Early-Stage Diabetes Risk Prediction: A Large-Scale Comparative Study. International Journal of Computer Applications. 187, 57 (November 2025), 9-16. DOI=10.5120/ijca2025925995
@article{ 10.5120/ijca2025925995,
author = { Md. Iqbal Hossain and Najila Alam Porno },
title = { Comprehensive Benchmarking of several Machine Learning and Bayesian Models for Early-Stage Diabetes Risk Prediction: A Large-Scale Comparative Study },
journal = { International Journal of Computer Applications },
year = { 2025 },
volume = { 187 },
number = { 57 },
pages = { 9--16 },
doi = { 10.5120/ijca2025925995 },
publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2025
%A Md. Iqbal Hossain
%A Najila Alam Porno
%T Comprehensive Benchmarking of several Machine Learning and Bayesian Models for Early-Stage Diabetes Risk Prediction: A Large-Scale Comparative Study
%J International Journal of Computer Applications
%V 187
%N 57
%P 9-16
%R 10.5120/ijca2025925995
%I Foundation of Computer Science (FCS), NY, USA
Diabetes remains a critical global health challenge, and early detection is crucial for effective management. This study presents a comprehensive benchmarking analysis of 14 diverse machine learning and Bayesian models for early-stage diabetes risk prediction using clinical data [2] from Sylhet, Bangladesh. The models evaluated span traditional methods (Logistic Regression, Decision Trees), ensemble techniques (Random Forest, XGBoost, LightGBM), Bayesian approaches (BART, Bayesian Logistic Regression), and advanced neural architectures (Deep Belief Networks), each assessed with both a 70-30 train-test split and 10-fold cross-validation. The results demonstrate that ensemble methods consistently outperformed other approaches, with Random Forest (RF) achieving the highest cross-validated AUC (0.9951) and accuracy (0.9699). The study provides valuable insights into model selection for clinical decision support systems and highlights the robustness of tree-based ensemble methods for medical diagnosis tasks.

Keywords: Diabetes Prediction, Machine Learning Benchmarking, Cross-Validation, Ensemble Methods, Bayesian Models, Clinical Decision Support
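
The abstract reports cross-validated AUC and accuracy but does not spell out how they were computed. The following is a minimal sketch of one common way to do it with 10-fold cross-validation, written in plain Python with no external libraries. Everything here is an assumption made for illustration rather than the authors' pipeline: the function names (`kfold_indices`, `roc_auc`, `accuracy`, `cross_validate`), the unstratified shuffle, the 0.5 decision threshold, and the choice to pool out-of-fold scores before computing the metrics (averaging per-fold metrics is an equally common alternative).

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle indices once, then yield (train, test) index lists for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def accuracy(y_true, scores, threshold=0.5):
    """Share of samples whose thresholded score matches the label (threshold is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Pool out-of-fold scores over k folds, then return (AUC, accuracy).

    `fit(X_train, y_train)` returns a model object; `predict(model, x)` returns a
    score in [0, 1] for one sample. Both are hypothetical callables supplied by the caller.
    """
    out_of_fold = [None] * len(y)
    for train, test in kfold_indices(len(y), k, seed):
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in test:
            out_of_fold[i] = predict(model, X[i])
    return roc_auc(y, out_of_fold), accuracy(y, out_of_fold)
```

Any of the benchmarked model families could in principle be plugged in through the `fit` and `predict` callables. A 70-30 hold-out evaluation would reuse `roc_auc` and `accuracy` on a single shuffled split instead of the fold loop.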