Abstract
Foodborne non-typhoidal Salmonella remains a major public health concern, yet routine surveillance recovers large numbers of isolates from food that are not associated with human illness. Studies have shown that foodborne isolates can be genetically linked to clinical cases, highlighting a critical challenge for risk assessment and outbreak prioritisation. This study aimed to determine whether genomic markers can distinguish foodborne Salmonella strains with an increased likelihood of causing infection. Whole-genome sequencing data from over 900 Salmonella isolates recovered from food and the environment through UK Health Security Agency surveillance were analysed using hierarchical clustering to define genetically related groups. These clusters were expanded using the global EnteroBase database to provide broader epidemiological context. Genome-wide association analyses identified genetic markers associated with clusters containing clinical isolates, including phage-associated regions. A highly conserved 7 kb marker identified in S. Agona demonstrated strong predictive performance at a global scale, with high sensitivity and specificity for infection-associated lineages and strict serovar restriction. Comparative genomic analysis revealed that all markers localised to a shared chromosomal hotspot corresponding to a prophage integration site. The 7 kb risk-associated marker formed part of a larger prophage closely related to the well-characterised S. Typhimurium Fels-2 phage, which encodes a DNA invertase linked to phase variation, a mechanism known to promote phenotypic heterogeneity and host adaptation. As these S. Agona isolates are monophasic, our findings indicate that our genome-wide association approach has rediscovered this DNA invertase, which is known to contribute to infection risk, but in a different serovar and via an alternative regulatory mechanism.
Overall, this work demonstrates the potential to move beyond treating all foodborne Salmonella isolates as equivalent hazards, towards a genomics-informed framework for risk stratification. This approach provides a foundation for improved risk-based decision-making, enhanced outbreak investigations and earlier prioritisation of public health responses during Salmonella surveillance and control.

Author summary
Foodborne Salmonella infections remain a major public health concern, but not all strains pose the same risk to human health. Here we investigated whether genetic differences could explain why some foodborne strains are more likely to cause human infection. We analysed over 900 genomes from food and environmental sources, grouping closely related strains before placing them in a global context using EnteroBase. By combining pangenome and genome-wide association analyses, we identified distinct lineages within several serovars that differed in their association with human cases. In Salmonella Agona, all clinical isolates belonged to a single lineage carrying a highly conserved 7 kb marker that was absent from low-risk strains. This marker demonstrated strong sensitivity and specificity across global datasets and was located within a prophage closely related to the well-characterised Fels-2 phage. This region encodes a DNA invertase previously linked to phase variation, a mechanism that promotes bacterial adaptability. Our findings indicate that infection risk can be structured at the lineage level and influenced by mobile genomic elements, particularly prophages, that enhance environmental persistence and host adaptation. This work advances genomic surveillance from retrospective linkage towards mechanistic and predictive risk assessment, with direct relevance for supporting risk-based decision-making during outbreak investigations.
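
As an illustration of what sensitivity and specificity mean for a presence/absence marker such as the 7 kb region, the following minimal Python sketch scores a marker against lineage membership. It is not the study's pipeline: the `Isolate` class, its field names, and the toy isolates are hypothetical and carry no real data.

```python
from dataclasses import dataclass


@dataclass
class Isolate:
    name: str
    has_marker: bool        # hypothetical call: marker detected in the genome
    clinical_lineage: bool  # isolate sits in a cluster containing clinical isolates


def sensitivity_specificity(isolates):
    """Return (sensitivity, specificity) of the marker for clinical lineages."""
    tp = sum(i.has_marker and i.clinical_lineage for i in isolates)
    fn = sum((not i.has_marker) and i.clinical_lineage for i in isolates)
    tn = sum((not i.has_marker) and (not i.clinical_lineage) for i in isolates)
    fp = sum(i.has_marker and (not i.clinical_lineage) for i in isolates)
    # Guard against empty classes rather than dividing by zero.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy, made-up isolates purely to exercise the function.
toy = [
    Isolate("a1", True, True),
    Isolate("a2", True, True),
    Isolate("a3", True, True),
    Isolate("a4", False, True),
    Isolate("b1", False, False),
    Isolate("b2", False, False),
    Isolate("b3", False, False),
    Isolate("b4", True, False),
]

sens, spec = sensitivity_specificity(toy)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Here sensitivity is the fraction of clinical-lineage isolates that carry the marker, and specificity is the fraction of other isolates that lack it. Real marker calls would come from sequence comparison against each genome, not from hand-set flags.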