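Centromere
Pairwise comparison
Sequence (biology)
Biology
Multiple sequence alignment
Sequence alignment
Computer science
Tandem repeat
Smith-Waterman algorithm
Affine transformation
Computational biology
Genome
Artificial intelligence
Chromosome
Genetics
Mathematics
Gene
Peptide sequence
Pure mathematics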
Authors
Pinglu Zhang,Yanming Wei,Qinzhong Tian,Quan Zou,Yansu Wang
Source
Journal: Genome Research
[Cold Spring Harbor Laboratory Press]
Date: 2025-02-12
Volume/Issue: gr.279763.124-gr.279763.124
Identifier
DOI:10.1101/gr.279763.124
Abstract
The release of the first draft of the human pangenome has revolutionized genomic research by enabling access to complex regions like centromeres, which are composed of extra-long tandem repeats (ETRs). However, a significant gap remains: current methodologies are inadequate for producing sequence alignments that effectively capture genetic events within ETRs, highlighting a pressing need for improved alignment tools. Inspired by UniAligner, we develop Rare Match Aligner (RaMA), which uses rare matches as anchors and a 2-piece affine gap cost to generate complete pairwise alignments that better capture genetic evolution. RaMA also employs parallel computing and the wavefront algorithm to accelerate anchor discovery and sequence alignment, achieving up to 13.66 times faster processing and using only 11% of UniAligner's memory. Downstream analysis of simulated data and of the CHM13 and CHM1 Higher Order Repeat (HOR) arrays demonstrates that RaMA achieves more accurate alignment, effectively capturing true HOR structures. RaMA also introduces two methods for defining reliable alignment regions, further refining the accuracy of centromeric alignment statistics.
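As a rough illustration of the 2-piece affine gap cost mentioned in the abstract, the sketch below scores a gap as the cheaper of two affine functions: one with a low opening penalty and high extension penalty (favoring short gaps), and one with a high opening penalty and low extension penalty (favoring long gaps). The function name and all four penalty values are placeholders chosen for illustration; they are assumptions, not RaMA's actual parameters or code.

```python
def two_piece_gap_cost(length, open1=4, ext1=2, open2=24, ext2=1):
    """Cost of a gap of `length` bases under a 2-piece affine model.

    The penalty values are hypothetical defaults for illustration only.
    """
    if length <= 0:
        return 0
    short_piece = open1 + ext1 * length  # cheaper for short gaps
    long_piece = open2 + ext2 * length   # cheaper for long gaps
    return min(short_piece, long_piece)


# Short gaps are priced by the first piece, long gaps by the second.
for gap_length in (1, 20, 100):
    print(gap_length, two_piece_gap_cost(gap_length))
```

With these placeholder values the two pieces cross at a gap length of 20, so longer insertions or deletions, such as the gain or loss of whole repeat units, are penalized less steeply than they would be under a single affine cost.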