Rapid Serotyping of Salmonella via MALDI-TOF and Machine Learning

Summary: Ren et al. show that combining MALDI-TOF MS with machine learning (XGBoost) can accurately identify Salmonella serotypes. This combined method might soon give diagnostic labs a rapid, cost-effective way to resolve Salmonella serotypes.
Why this matters: 
  • There are over 2,600 Salmonella serotypes. 
  • Different serotypes are associated with distinct disease outcomes, host ranges, and epidemiologic patterns.
  • Rapid, accurate serotyping is important for outbreak tracking, epidemiology, and clinical guidance. Faster diagnostics mean faster response. 
  • Traditional serotyping (agglutination) is slow, laborious, and hard to scale. 
  • Combining MALDI-TOF MS with machine learning may provide improved serotyping in the future.

Key findings: Ren et al. evaluated MALDI-TOF MS combined with machine learning to identify Salmonella serotypes.¹

  • 10 different models were trained using 692 isolates from one hospital to identify eight Salmonella serotypes (B, C1, C2/3, D, E, Not A-F, Salmonella Typhimurium, and Salmonella Enteritidis).
  • From 192 spectral features initially assessed, 16 were selected to distinguish among the eight serotypes.
  • The models were validated internally, then externally on isolates from another hospital.
  • The XGBoost model achieved an AUC of ~0.99 in training and ~0.97 in the validation sets; sensitivity and specificity were high.
  • To enhance usability, the model was deployed as a Streamlit-based Python application that calculates the probability of each Salmonella serotype.
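
For readers less familiar with how an AUC is reported for a multi-class classifier, the sketch below shows one common convention: a one-vs-rest AUC per class, computed from per-isolate class probabilities of the kind the deployed tool outputs. This is an illustration only, not the authors' code. The function names, the toy probabilities, and the three-class label set are all hypothetical, and the study may have used a different averaging scheme.

```python
from typing import Dict, List, Sequence


def binary_auc(scores: Sequence[float], is_positive: Sequence[bool]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half a win."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(
    probabilities: List[Dict[str, float]], true_labels: List[str]
) -> Dict[str, float]:
    """One AUC per class: that class is 'positive', all others 'negative'."""
    classes = sorted(set(true_labels))
    return {
        c: binary_auc(
            [row[c] for row in probabilities],
            [label == c for label in true_labels],
        )
        for c in classes
    }


# Hypothetical toy data (NOT from the study): predicted class probabilities
# for five isolates and their reference labels.
probs = [
    {"B": 0.8, "D": 0.1, "E": 0.1},
    {"B": 0.4, "D": 0.5, "E": 0.1},
    {"B": 0.2, "D": 0.7, "E": 0.1},
    {"B": 0.3, "D": 0.3, "E": 0.4},
    {"B": 0.5, "D": 0.4, "E": 0.1},
]
labels = ["B", "B", "D", "E", "D"]

aucs = one_vs_rest_auc(probs, labels)
macro_auc = sum(aucs.values()) / len(aucs)  # unweighted mean across classes
print(aucs, round(macro_auc, 3))
```

The pairwise loop is quadratic in the number of isolates, which is fine for a few hundred spectra; a rank-based implementation from an established library would be the usual choice at larger scale.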

Bigger picture: 
Salmonella serotyping remains essential for both epidemiological surveillance and clinical management. Between 2016 and 2021, the CDC surveillance system identified 6,110 Salmonella isolates, of which 5,442 (89%) were successfully serotyped.²

Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry offers rapid and cost-effective bacterial identification but typically lacks serotype-level resolution. In their recent study, Ren et al.¹ demonstrate the potential of integrating machine learning algorithms with MALDI-TOF spectral data to enable rapid Salmonella serotyping.

While the results are promising, further work is warranted, including validation across a broader range of serotypes and evaluation using different MALDI-TOF platforms. Additionally, diagnostic laboratories may need to expand or curate spectral databases, standardize high-quality sample preparation, and perform local model validation. The successful deployment of an XGBoost-based classifier as a software tool highlights the feasibility of embedding machine learning pipelines into future clinical microbiology workflows.
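
As a rough illustration of what local model validation could involve, the sketch below compares a model's predicted labels against reference serotyping results and reports per-class sensitivity and specificity. It is a minimal example under stated assumptions: the function name, the label set, and the toy results are hypothetical and do not come from the study, and a real validation would need far more isolates per class.

```python
from typing import Dict, List, Tuple


def sensitivity_specificity(
    reference: List[str], predicted: List[str]
) -> Dict[str, Tuple[float, float]]:
    """Return {class: (sensitivity, specificity)} from one-vs-rest counts."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    results: Dict[str, Tuple[float, float]] = {}
    pairs = list(zip(reference, predicted))
    for c in sorted(set(reference)):
        tp = sum(1 for r, p in pairs if r == c and p == c)  # true positives
        fn = sum(1 for r, p in pairs if r == c and p != c)  # false negatives
        tn = sum(1 for r, p in pairs if r != c and p != c)  # true negatives
        fp = sum(1 for r, p in pairs if r != c and p == c)  # false positives
        sensitivity = tp / (tp + fn)  # tp + fn > 0 because c occurs in reference
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        results[c] = (sensitivity, specificity)
    return results


# Hypothetical local results (NOT from the study): reference serotyping
# versus the classifier's top prediction for six isolates.
reference = ["Typhimurium", "Typhimurium", "Enteritidis",
             "Enteritidis", "Enteritidis", "B"]
predicted = ["Typhimurium", "Enteritidis", "Enteritidis",
             "Enteritidis", "B", "B"]

metrics = sensitivity_specificity(reference, predicted)
for name, (sens, spec) in metrics.items():
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```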

References: 

  1. Ren et al. (2025). Automated identification of Salmonella serotype using MALDI-TOF mass spectrometry and machine learning techniques. Journal of Clinical Microbiology. Vol. 63, Issue 7: e0003725.
  2. Collins et al. (2022). Preliminary Incidence and Trends of Infections Caused by Pathogens Transmitted Commonly Through Food - Foodborne Diseases Active Surveillance Network, 10 U.S. Sites, 2016–2021. Morbidity and Mortality Weekly Report. Vol. 71: 1260–1264.