- There are over 2,600 Salmonella serotypes.
- Different serotypes are associated with distinct disease outcomes, host ranges, and epidemiologic patterns.
- Rapid, accurate serotyping is important for outbreak tracking, epidemiology, and clinical guidance. Faster diagnostics mean faster response.
- Traditional serotyping (agglutination) is slow, laborious, and hard to scale.
- Combining MALDI-TOF MS with machine learning may provide improved serotyping in the future.
Key findings: Ren et al. evaluated MALDI-TOF MS combined with machine learning to identify Salmonella serotypes.¹
- Ten models were trained on 692 isolates from one hospital to identify eight Salmonella serotypes (B, C1, C2/3, D, E, Not A-F, Salmonella Typhimurium, and Salmonella Enteritidis).
- Of the 192 spectral features initially assessed, 16 were selected to distinguish among the eight serotypes.
- The models were validated internally and then externally, the latter on isolates from another hospital.
- The XGBoost model achieved an AUC of ~0.99 in training and ~0.97 in the validation sets, with high sensitivity and specificity.
- To enhance usability, the model was deployed as a Streamlit-based Python application that calculates the probability of each Salmonella serotype.
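For readers less familiar with the metrics reported above, the following minimal Python sketch shows how a one-vs-rest AUC, sensitivity, and specificity can be computed from per-serotype probabilities. It is not the authors' code: the function names, the three-class label set, and the six toy isolates are all hypothetical and chosen only for illustration, and the AUC is computed with the standard rank-based (Mann-Whitney) definition rather than any particular library.

```python
from typing import Dict, List, Sequence, Tuple


def one_vs_rest_auc(labels: Sequence[str], scores: Sequence[float], positive: str) -> float:
    """AUC for one class against all others.

    Equals the probability that a randomly chosen positive isolate receives
    a higher score than a randomly chosen negative one (ties count as 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative isolate")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[str], predicted: Sequence[str], positive: str
) -> Tuple[float, float]:
    """Sensitivity and specificity for one class, treating all others as negative."""
    pairs = list(zip(labels, predicted))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    tn = sum(1 for y, p in pairs if y != positive and p != positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: true labels and model probabilities for six isolates.
# These values are invented for illustration and are NOT from Ren et al.
labels: List[str] = ["B", "B", "D", "D", "E", "E"]
probs: List[Dict[str, float]] = [
    {"B": 0.8, "D": 0.1, "E": 0.1},
    {"B": 0.2, "D": 0.7, "E": 0.1},  # a B isolate the toy model calls D
    {"B": 0.2, "D": 0.7, "E": 0.1},
    {"B": 0.1, "D": 0.6, "E": 0.3},
    {"B": 0.1, "D": 0.2, "E": 0.7},
    {"B": 0.3, "D": 0.1, "E": 0.6},
]

# The predicted serotype is the class with the highest probability.
predicted: List[str] = [max(p, key=p.get) for p in probs]

for serotype in ("B", "D", "E"):
    auc = one_vs_rest_auc(labels, [p[serotype] for p in probs], serotype)
    sens, spec = sensitivity_specificity(labels, predicted, serotype)
    print(f"{serotype}: AUC={auc:.3f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy example a single misclassified isolate lowers the sensitivity for one class and the specificity for another, which is why per-serotype metrics are worth inspecting alongside a single summary AUC.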
Bigger picture:
Salmonella serotyping remains essential for both epidemiological surveillance and clinical management. Between 2016 and 2021, the CDC surveillance system identified 6,110 Salmonella isolates, of which 5,442 (89%) were successfully serotyped.²
Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry offers rapid and cost-effective bacterial identification but typically lacks serotype-level resolution. In their recent study, Ren et al.¹ demonstrate the potential of integrating machine learning algorithms with MALDI-TOF spectral data to enable rapid Salmonella serotyping.
While the results are promising, further work is warranted, including validation across a broader range of serotypes and evaluation using different MALDI-TOF platforms. Additionally, diagnostic laboratories may need to expand or curate spectral databases, standardize high-quality sample preparation, and perform local model validation. The successful deployment of an XGBoost-based classifier as a software tool highlights the feasibility of embedding machine learning pipelines into future clinical microbiology workflows.
References:
- Ren et al. (2025). Automated identification of Salmonella serotype using MALDI-TOF mass spectrometry and machine learning techniques. Journal of Clinical Microbiology. Vol. 63, Issue 7: e0003725.
- Collins et al. (2022). Preliminary Incidence and Trends of Infections Caused by Pathogens Transmitted Commonly Through Food - Foodborne Diseases Active Surveillance Network, 10 U.S. Sites, 2016–2021. Morbidity and Mortality Weekly Report. Vol. 71:1260–1264.