Predicting Mortality Rates of Foodborne Bacteria Using Machine Learning: A Comparative Study of Regression Models
DOI: https://doi.org/10.5147/jaimlb.vi.258
Keywords: Foodborne bacteria, Machine learning (ML), Mortality prediction, Antimicrobial resistance, Gradient Boosting Regressor (GBR), Random Forest (RF), SHAP analysis, Public health
Abstract
Foodborne bacterial infections remain a major public health concern, contributing to significant morbidity and mortality worldwide. Understanding the genomic and epidemiological factors that influence the mortality rates associated with these pathogens is crucial for developing effective risk assessment strategies. In this study, we applied machine learning (ML) models to predict the mortality rates of 50 foodborne bacterial species using genomic, virulence, antimicrobial resistance (AMR), and epidemiological features. Five regression models were evaluated: Linear Regression (LR), Random Forest (RF), Gradient Boosting Regressor (GBR), Support Vector Regressor (SVR), and K-Nearest Neighbors (KNN). Our results indicate that the ensemble models (RF, GBR) outperform traditional linear regression in capturing the complex relationships between bacterial features and mortality rates. Feature importance analysis revealed that annual reported cases worldwide, genome size, GC content, and virulence gene count are the strongest predictors of mortality. AMR gene count had a lower-than-expected impact, suggesting that antibiotic resistance alone does not strongly determine mortality outcomes. SHapley Additive exPlanations (SHAP) analysis confirmed the significance of genomic and epidemiological factors in shaping model predictions. However, all models exhibited low R² scores and high Mean Absolute Error (MAE), indicating limited predictive accuracy. Residual analysis suggested that outliers and data variability may be limiting model performance. Future research should explore larger datasets, feature engineering, and advanced deep learning approaches to enhance predictive accuracy. Despite these limitations, this study demonstrates the potential of ML in quantifying bacterial pathogenicity and informing food safety and public health decision-making.
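To make the evaluation metrics concrete, the following is a minimal sketch of how two regression models might be compared using R² and MAE under leave-one-out cross-validation. It is not the authors' pipeline: the data are synthetic (50 rows standing in for 50 species, with a single hypothetical feature), only a one-feature linear regression and a KNN regressor are implemented, and all function names are illustrative. It uses only the Python standard library.

```python
import random
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (can be negative)."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mae(y_true, y_pred):
    """Mean Absolute Error, in the same units as the target."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def fit_predict_linear(x_train, y_train, x_new):
    """Ordinary least squares on a single feature, evaluated at x_new."""
    x_bar, y_bar = mean(x_train), mean(y_train)
    sxx = sum((x - x_bar) ** 2 for x in x_train)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_train, y_train))
    return y_bar + (sxy / sxx) * (x_new - x_bar)


def fit_predict_knn(x_train, y_train, x_new, k=5):
    """Average the targets of the k training points closest to x_new."""
    nearest = sorted(zip(x_train, y_train), key=lambda p: abs(p[0] - x_new))[:k]
    return mean(y for _, y in nearest)


def leave_one_out(model, xs, ys):
    """Predict each point from a model fitted on all the other points."""
    preds = []
    for i in range(len(xs)):
        x_train = xs[:i] + xs[i + 1:]
        y_train = ys[:i] + ys[i + 1:]
        preds.append(model(x_train, y_train, xs[i]))
    return preds


# Synthetic stand-in data (assumption): a noisy, nonlinear target.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(50)]
ys = [2.0 + 0.5 * (x - 5) ** 2 + rng.gauss(0, 2.0) for x in xs]

results = {}
for name, model in {"linear": fit_predict_linear, "knn": fit_predict_knn}.items():
    preds = leave_one_out(model, xs, ys)
    results[name] = {"r2": r2_score(ys, preds), "mae": mae(ys, preds)}
    print(f"{name:>6}: R2 = {results[name]['r2']:.3f}, MAE = {results[name]['mae']:.3f}")
```

Because the synthetic target is nonlinear, the KNN model should typically score a higher R² than the linear fit here, which loosely mirrors the abstract's point that linear regression struggles with complex relationships. An out-of-sample R² near or below zero means a model predicts no better than the mean of the target. With only 50 samples, such estimates are themselves noisy.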
Published
Versions
- 07/04/2025 (2)
- 06/06/2025 (1)
License
Copyright (c) 2025 DreShawn Bradford, My Abdelmajid Kassem

This work is licensed under a Creative Commons Attribution 4.0 International License.