Predicting Mortality Rates of Foodborne Bacteria Using Machine Learning: A Comparative Study of Regression Models

DreShawn Bradford; My Abdelmajid Kassem

doi:10.5147/jaimlb.vi.258

Authors

DreShawn Bradford, Plant Genomics and Bioinformatics Lab, Department of Biological and Forensic Sciences, Fayetteville State University, Fayetteville, NC 28301, USA
My Abdelmajid Kassem, Plant Genomics and Bioinformatics Lab, Department of Biological and Forensic Sciences, Fayetteville State University, Fayetteville, NC 28301, USA. ORCID: https://orcid.org/0000-0003-3478-0327

DOI:

https://doi.org/10.5147/jaimlb.vi.258

Keywords:

Foodborne bacteria, Machine learning (ML), Mortality prediction, Antimicrobial resistance, Gradient Boosting Regressor (GBR), Random Forest (RF), SHAP analysis, Public health

Abstract

Foodborne bacterial infections remain a major public health concern, contributing to significant morbidity and mortality worldwide. Understanding the genomic and epidemiological factors that influence bacterial mortality rates is crucial for developing effective risk assessment strategies. In this study, we applied machine learning (ML) models to predict mortality rates of 50 foodborne bacterial species using genomic, virulence, antimicrobial resistance (AMR), and epidemiological features. Five regression models were evaluated: Linear Regression (LR), Random Forest (RF), Gradient Boosting Regressor (GBR), Support Vector Regressor (SVR), and K-Nearest Neighbors (KNN). Our results indicate that ensemble models (RF, GBR) outperform traditional linear regression in capturing the complex relationships between bacterial features and mortality rates. Feature importance analysis revealed that annual reported cases worldwide, genome size, GC content, and virulence gene count are the strongest predictors of mortality. Interestingly, AMR gene count had a lower-than-expected impact, suggesting that antibiotic resistance alone does not strongly determine mortality outcomes. SHapley Additive exPlanation (SHAP) analysis confirmed the significance of genomic and epidemiological factors in shaping model predictions. However, all models exhibited low R² scores and high Mean Absolute Error (MAE), indicating room for improvement. Residual analysis suggests that outliers and data variability may be limiting model performance. Future research should explore larger datasets, feature engineering, and advanced deep learning approaches to enhance predictive accuracy. Despite these limitations, this study demonstrates the potential of ML in quantifying bacterial pathogenicity and informing food safety and public health decision-making.
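As a reading aid for the two evaluation metrics named in the abstract, the following is a minimal sketch of how MAE and R² are conventionally computed. It is not the authors' code: the function names are illustrative, and the numbers are made up for the example rather than taken from the study. It shows why a low R² means a model does little better than always predicting the mean mortality rate.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    A value of 0 matches a baseline that always predicts the mean of y_true;
    values near 1 indicate that most of the variance is explained.
    """
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up mortality rates (%) for illustration only; not data from the study.
observed = [2.0, 4.0, 6.0, 8.0]
baseline = [mean(observed)] * len(observed)  # always predict the mean
model = [3.0, 3.0, 7.0, 7.0]                 # hypothetical model predictions

print("baseline:", mean_absolute_error(observed, baseline), r2_score(observed, baseline))
print("model:   ", mean_absolute_error(observed, model), r2_score(observed, model))
```

Because MAE is expressed in the units of the target, it is best read relative to the spread of the observed mortality rates, whereas R² is unitless and compares the model directly against the mean-prediction baseline.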
