Tool Identifies Collinear Genes with Strand and Genome Duplication Awareness

Beijing Zhongke Journal Publishing Co. Ltd.

In quota_Anchor, collinear gene pairs are first identified with a dynamic programming algorithm analogous to those implemented in DAGchainer and MCScanX. The algorithm then selects the highest-scoring block and counts the query and reference genes within that block, subject to the alignment depth constraint. This process repeats until the score of the best remaining collinear block falls below the predefined minimum threshold (default: 3).
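The general idea of chaining followed by iterative block extraction can be sketched as below. This is a simplified illustration, not quota_Anchor's actual implementation: the function names, the `max_gap` and `gap_penalty` parameters, the scoring scheme, and the way the depth quota is enforced are all assumptions made for the example, and only forward-strand chains are handled.

```python
from collections import Counter

def best_chain(anchors, max_gap=5, gap_penalty=1.0):
    """Return (score, anchor indices) of the highest-scoring forward chain.

    anchors: list of (query_pos, ref_pos, score) tuples, where positions are
    gene order indices. All parameters here are illustrative assumptions.
    """
    if not anchors:
        return 0.0, []
    order = sorted(range(len(anchors)), key=lambda i: anchors[i][:2])
    best = {}  # anchor index -> best chain score ending at that anchor
    prev = {}  # anchor index -> predecessor anchor in that chain
    for pos, i in enumerate(order):
        qi, ri, si = anchors[i]
        best[i], prev[i] = float(si), None
        for j in order[:pos]:
            qj, rj, _ = anchors[j]
            dq, dr = qi - qj, ri - rj
            # Extend only if both coordinates increase within the gap limit.
            if 0 < dq <= max_gap and 0 < dr <= max_gap:
                cand = best[j] + si - gap_penalty * (max(dq, dr) - 1)
                if cand > best[i]:
                    best[i], prev[i] = cand, j
    end = max(best, key=best.get)
    chain, node = [], end
    while node is not None:  # trace back from the best end point
        chain.append(node)
        node = prev[node]
    return best[end], chain[::-1]

def extract_blocks(anchors, min_score=3.0, query_quota=1, ref_quota=1):
    """Repeatedly take the best block until its score drops below min_score.

    A gene that has already appeared in `query_quota` (or `ref_quota`) blocks
    is withdrawn from further chaining; this is a hypothetical stand-in for
    the alignment depth constraint described in the text.
    """
    remaining = list(anchors)
    q_used, r_used = Counter(), Counter()
    blocks = []
    while remaining:
        score, idx = best_chain(remaining)
        if score < min_score:
            break
        block = [remaining[i] for i in idx]
        blocks.append((score, block))
        for q, r, _ in block:
            q_used[q] += 1
            r_used[r] += 1
        taken = set(idx)
        remaining = [a for k, a in enumerate(remaining)
                     if k not in taken
                     and q_used[a[0]] < query_quota
                     and r_used[a[1]] < ref_quota]
    return blocks

# Toy anchors (made up): one 4-gene block, one 3-gene duplicate, one singleton.
toy = [(1, 1, 1), (2, 2, 1), (3, 3, 1), (4, 4, 1),
       (1, 10, 1), (2, 11, 1), (3, 12, 1), (20, 30, 1)]
one_to_one = extract_blocks(toy, query_quota=1, ref_quota=1)
two_to_one = extract_blocks(toy, query_quota=2, ref_quota=1)
```

With a 1:1 quota only the longest block survives in this toy input, while allowing each query gene to be used twice also recovers the duplicated block, which is the kind of behavior a depth constraint is meant to control.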

The authors hypothesized that inversion alters the regulatory context of collinear genes and that relatively inverted gene pairs exhibit less similar expression patterns. To test this hypothesis, they used transcriptome data from a consistent developmental stage (the four-leaf stage) in three species (wheat, maize, and sorghum) and performed pairwise comparisons among them, using quota_Anchor to control alignment depth so that only collinear orthologous gene pairs were considered. These gene pairs, which arise from species divergence, were classified into two categories: those that underwent relative inversion with respect to their collinear blocks (relatively inverted gene pairs, RIGPs) and those that did not (non-relatively inverted gene pairs, NRIGPs). Correlation analysis of the two sets revealed that NRIGPs exhibited more conserved expression patterns than RIGPs.
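One way such a comparison could be set up is sketched below. The article does not state which correlation measure or normalization the authors used, so the Pearson coefficient, the log2(x + 1) transform, the function names, and the toy expression values are all assumptions chosen for illustration only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def expression_conservation(pairs):
    """Correlate expression between the two members of each gene pair.

    pairs: list of (expr_in_species_a, expr_in_species_b), e.g. TPM values.
    Values are log2(x + 1)-transformed first (an assumed normalization).
    """
    xs = [math.log2(a + 1) for a, _ in pairs]
    ys = [math.log2(b + 1) for _, b in pairs]
    return pearson(xs, ys)

# Made-up toy values, not data from the study.
nrigp_pairs = [(1, 1), (3, 3), (7, 7)]
rigp_pairs = [(1, 7), (3, 3), (7, 1)]
r_nrigp = expression_conservation(nrigp_pairs)
r_rigp = expression_conservation(rigp_pairs)
```

A higher coefficient for one set would indicate that expression levels of its paired genes track each other more closely across the two species.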

Similarly, the authors calculated synonymous substitution rates (Ks) and nonsynonymous-to-synonymous substitution ratios (Ka/Ks) for gene pairs in the two sets, NRIGPs and RIGPs. These calculations were based on collinear orthologous gene pairs of 27 Poaceae species relative to the outgroup Joinvillea ascendens. The results indicate that RIGPs exhibit significantly lower Ka/Ks ratios than NRIGPs, suggesting that relatively inverted genes are under stronger purifying selection. In addition, RIGPs show higher Ks values and a greater standard deviation than NRIGPs. Given that Ks is widely used to estimate the timing of species divergence and whole-genome duplication (WGD) events, the increased variability in RIGP Ks values implies reduced reliability for dating such events. Therefore, when analyzing WGD and species divergence events based on Ks values, it might be more representative to use only NRIGPs.
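The kind of per-set summary behind this argument can be sketched as follows. The function name and the numbers are hypothetical and chosen only to show the computation; the article does not describe the authors' statistical procedure, and no significance test is attempted here.

```python
import statistics

def summarize(values):
    """Basic summary statistics for a list of Ks (or Ka/Ks) values."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

# Made-up Ks values for illustration, not results from the study.
ks_nrigp = [0.50, 0.52, 0.48, 0.50]
ks_rigp = [0.40, 0.90, 0.55, 0.75]
nrigp_stats = summarize(ks_nrigp)
rigp_stats = summarize(ks_rigp)
```

A tighter Ks distribution gives a sharper peak when dating a divergence or WGD event, which is why a set with a smaller standard deviation would be preferred for that purpose.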

Finally, the authors present two application examples and summarize the tool's limitations. In summary, the authors developed a collinear gene identification tool with strict control over alignment depth and explored the differences between inverted and non-inverted genes within collinear blocks.

See the article:

quota_Anchor: a strand and whole genome duplication–aware collinear gene identification tool

https://www.sciencedirect.com/science/article/pii/S2662173825002152
