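Interpretability
Sequence (biology)
Computational biology
Perplexity
Computer science
Peptide
Biology
Artificial intelligence
Genetics
Biochemistry
Language model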
Authors
Zhongshan Luo,Aoyun Geng,Leyi Wei,Q. Zou,Feifei Cui,Zilong Zhang
Source
Journal: Advanced Science
[Wiley]
Date: 2025-04-15
Volume (Issue): 12 (20): e2412926-e2412926
Citations: 6
Identifier
DOI:10.1002/advs.202412926
Abstract
Peptides are recognized as next‐generation therapeutic drugs due to their unique properties and are essential for treating human diseases. In recent years, a number of deep generation models for generating peptides have been proposed and have shown great potential. However, these models cannot well control the length of the generated sequence, while the sequence length has a very important impact on the physical and chemical properties and therapeutic effects of peptides. Here, a diffusion model is introduced, capable of controlling the length of generated functional peptide sequences, named CPL‐Diff. CPL‐Diff can control the length of generated polypeptide sequences using only attention masking. Additionally, CPL‐Diff can generate single‐functional polypeptide sequences based on given conditional information. Experiments demonstrate that the peptides generated by CPL‐Diff exhibit lower perplexity and similarity compared to those produced by the current state‐of‐the‐art models, and further exhibit relevant physicochemical properties similar to real sequences. The interpretability analysis is also performed on CPL‐Diff to understand how it controls the length of generated sequences and the decision‐making process involved in generating polypeptide sequences, with the aim of providing important theoretical guidance for polypeptide design. The code for CPL‐Diff is available at https://github.com/luozhenjie1997/CPL‐Diff .
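The abstract states that length is controlled "using only attention masking" but does not describe the mechanism. As a rough illustration of the general idea, and not the CPL‐Diff implementation, the sketch below builds a padding mask for a chosen target length and applies it inside a softmax so that positions beyond that length receive zero attention weight. The function names `length_mask` and `masked_softmax` and the example scores are hypothetical.

```python
import math


def length_mask(target_len, max_len):
    """Return 1 for positions allowed to hold residues, 0 for padding.

    Hypothetical helper: the real model may build its mask differently.
    """
    if not 0 < target_len <= max_len:
        raise ValueError("target_len must be in 1..max_len")
    return [1] * target_len + [0] * (max_len - target_len)


def masked_softmax(scores, mask):
    """Softmax over attention scores; masked positions get zero weight."""
    # Masked scores are set to -inf so that exp() maps them to exactly 0.
    masked = [s if m else -math.inf for s, m in zip(scores, mask)]
    peak = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed example: a 6-position window restricted to a 3-residue peptide.
mask = length_mask(3, 6)
weights = masked_softmax([0.5, 1.0, -0.2, 2.0, 0.1, 0.3], mask)
print(mask)
print(weights)
```

In this toy setup the fourth position has the highest raw score, yet it contributes nothing once masked, which is the sense in which a mask alone can fix the effective sequence length.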