Computer science
Peptide
Information retrieval
Computational biology
Data mining
Chemistry
Data science
Biochemistry
Biology
Authors
Andrew Dickson White,Andrew J. Keefe,Ann K. Nowinski,Qing Shao,K. B. Caldwell,Shaoyi Jiang
Abstract
Peptide libraries allow researchers to quickly find hundreds of peptide sequences with a desired property. Currently, the large amount of data generated from peptide libraries is analyzed by hand: researchers search for repeating patterns in the peptide sequences. Such patterns are called motifs. In this work, we describe a set of algorithms that allow quick, efficient, and standardized analysis of peptide libraries. Four main techniques are described: (1) choice of the number of motifs present in a peptide library; (2) separation of the peptides into groups of similar sequences; (3) fitting of a model to the peptides to extract motifs; (4) analysis of the library using quantitative structure-property relationships if no clear motifs are present. Application to five previously published data sets shows that these techniques can quickly and automatically reproduce the work of experts while allowing much more flexibility in analysis. A new way of visually presenting peptide libraries is also described, which allows visual inspection of the grouping and spread of sequences. The algorithms have been implemented in an open-source plug-in called "peplib" and an online web application.
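
As a rough illustration of techniques (2) and (3), the sketch below groups equal-length peptides by Hamming distance and then reads a consensus motif off each group. It is not the algorithm or the API of "peplib"; the greedy grouping, the function names (hamming, group_peptides, consensus_motif), the 0.6 threshold, and the toy sequences are all assumptions made for illustration only.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def group_peptides(peptides, max_dist):
    """Greedy single-pass grouping (an illustrative stand-in, not peplib's method).

    Each peptide joins the first group whose seed (first member) is within
    max_dist mismatches; otherwise it starts a new group.
    """
    groups = []
    for p in peptides:
        for g in groups:
            if hamming(p, g[0]) <= max_dist:
                g.append(p)
                break
        else:
            groups.append([p])
    return groups


def consensus_motif(group, min_fraction=0.6):
    """Per position, keep the most common residue if it covers at least
    min_fraction of the group; otherwise emit 'x' (no clear preference)."""
    motif = []
    for column in zip(*group):
        residue, count = Counter(column).most_common(1)[0]
        motif.append(residue if count / len(group) >= min_fraction else "x")
    return "".join(motif)


# Made-up toy library, not data from the paper.
library = ["RGDSPK", "RGDTPK", "RGDAPK", "KLVFFA", "KLVFFG", "RGDSPR"]
groups = group_peptides(library, max_dist=2)
motifs = [consensus_motif(g) for g in groups]
print(motifs)
```

With this toy input the peptides fall into two groups, and positions without a dominant residue appear as "x" in the motif. A real analysis would also need to handle peptides of different lengths and to choose the number of groups, which is what technique (1) in the abstract addresses.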