Comparative analysis of sequence similarity distributions reveals the evolutionary mechanisms that shape gene families. In Salmonidae, whole-genome duplication (WGD) and rapid speciation make it challenging to model retained homologs and sequence divergence. We introduce a stochastic branching-process framework that models the decay of sequence similarity over evolutionary time and quantifies fractionation rates across successive duplication events. We derive moment-generating functions of pairwise similarity scores and validate them by simulation. Applying our model to multiple salmonid genomes (Atlantic salmon, rainbow trout, Chinook salmon, …), we recapitulate the observed bimodal similarity distributions and quantify gene retention across evolutionary branches. The estimated fractionation rates for both WGDs ($\mu_1, \mu_2 \approx 0.0009$ to $0.0013$ per Myr) remain highly consistent across species and are insensitive to synteny block size, supporting a conserved post-WGD gene loss dynamic. In contrast, lineage-specific differences in duplicate retention arise primarily from the temporal gap between duplication events rather than from differences in instantaneous loss rates. These findings underscore the stability of fractionation dynamics and the critical role of structural genome decay in shaping retention patterns in the evolution of salmonids and sucker fish.
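As a rough illustration of why elapsed time can matter more than the loss rate, consider a deliberately simplified assumption that is not the branching-process model itself: each duplicate pair is lost independently at a constant rate $\mu$, so the expected retained fraction after $t$ Myr is $e^{-\mu t}$. In the sketch below, the function name `retained_fraction` and the elapsed times are hypothetical; only the rate range is taken from the abstract.

```python
import math


def retained_fraction(mu_per_myr: float, elapsed_myr: float) -> float:
    """Expected fraction of duplicate pairs still retained.

    Assumption (not from the paper): independent loss of each pair at a
    constant rate mu, giving simple exponential decay exp(-mu * t).
    """
    return math.exp(-mu_per_myr * elapsed_myr)


if __name__ == "__main__":
    # Rates bracket the range reported in the abstract (per Myr).
    rates = (0.0009, 0.0013)
    # Elapsed times are hypothetical placeholders, not estimates from the paper.
    elapsed_times = (50.0, 100.0)
    for mu in rates:
        for t in elapsed_times:
            frac = retained_fraction(mu, t)
            print(f"mu={mu:.4f}/Myr, t={t:.0f} Myr -> retained fraction {frac:.3f}")
```

Under this toy assumption, lineages sharing the same $\mu$ still differ in retention whenever the time available for loss differs, which matches the qualitative pattern described above.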