Genome
Biology
Genetics
Tandem repeat
Microsatellite
Population
Computational biology
DNA
Evolutionary biology
Gene
Sociology
Demography
Allele
Authors
Jan Mrázek, Xiangxue Guo, Apurva S. Shah
Identifier
DOI:10.1073/pnas.0702412104
Abstract
Simple sequence repeats (SSRs) in DNA sequences are composed of tandem iterations of short oligonucleotides and may have functional and/or structural properties that distinguish them from general DNA sequences. They are variable in length because of slip-strand mutations and may also affect local structure of the DNA molecule or the encoded proteins. Long SSRs (LSSRs) are common in eukaryotes but rare in most prokaryotes. In pathogens, SSRs can enhance antigenic variance of the pathogen population in a strategy that counteracts the host immune response. We analyze representations of SSRs in >300 prokaryotic genomes and report significant differences among different prokaryotes as well as among different types of SSRs. LSSRs composed of short oligonucleotides (1-4 bp length, designated LSSR(1-4)) are often found in host-adapted pathogens with reduced genomes that are not known to readily survive in a natural environment outside the host. In contrast, LSSRs composed of longer oligonucleotides (5-11 bp length, designated LSSR(5-11)) are found mostly in nonpathogens and opportunistic pathogens with large genomes. Comparisons among SSRs of different lengths suggest that LSSR(1-4) are likely maintained by selection. This is consistent with the established role of some LSSR(1-4) in enhancing antigenic variance. By contrast, abundance of LSSR(5-11) in some genomes may reflect the SSRs' general tendency to expand rather than their specific role in the organisms' physiology. Differences among genomes in terms of SSR representations and their possible interpretations are discussed.
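To make the definition above concrete, here is a minimal Python sketch of how exact tandem repeats with 1-11 bp units might be located and sorted into the two unit-length classes the abstract distinguishes. It is an illustration only, not the authors' method: the function names `find_ssrs` and `unit_class` and the `min_copies` cutoff are hypothetical, and the abstract does not state the length threshold at which an SSR counts as "long".

```python
def is_primitive(unit):
    """True if `unit` is not itself a repetition of a shorter motif."""
    return unit not in (unit + unit)[1:-1]


def unit_class(unit):
    """Map a repeat unit to the abstract's two unit-length classes."""
    return "1-4" if len(unit) <= 4 else "5-11"


def find_ssrs(seq, max_unit=11, min_copies=3):
    """Return (start, unit, copies) for exact tandem repeats.

    min_copies is an assumed, illustrative cutoff; it is not the
    threshold used in the paper.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_unit + 1):
        i = 0
        while i + k <= len(seq):
            unit = seq[i:i + k]
            j = i + k
            # Extend while the next k bases repeat the unit exactly.
            while seq[j:j + k] == unit:
                j += k
            copies = (j - i) // k
            if copies >= min_copies and is_primitive(unit):
                hits.append((i, unit, copies))
                i = j  # skip past the repeat just recorded
            else:
                i += 1
    return hits


# Toy sequence: six copies of AT and four copies of CTTGA.
example = "GGC" + "AT" * 6 + "CCG" + "CTTGA" * 4 + "T"
for start, unit, copies in find_ssrs(example):
    print(start, unit, copies, "unit class", unit_class(unit))
```

The primitive-unit check keeps a run such as ATATATATATAT from being reported a second time as three copies of ATAT. Real genome scans would also need to handle both strands, ambiguous bases, and imperfect repeats, none of which this sketch attempts.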