3DSMILES-GPT: 3D Molecular Pocket-based Generation with Token-only Large Language Model

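Keywords: Security token, Computer science, Drug discovery, Molecule, Reinforcement learning, First generation, Biological system, Artificial intelligence, Chemistry, Biology, Computer security, Biochemistry, Sociology, Demography, Organic chemistry, Population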
Authors
Jike Wang,Hao Luo,Rui Qin,Mingyang Wang,Xiaozhe Wan,Meijing Fang,Odin Zhang,Qiaolin Gou,Qun Su,Chao Shen,Ziyi You,Liwei Liu,Chang‐Yu Hsieh,Tingjun Hou,Yu Kang
Source
Journal: Chemical Science [Royal Society of Chemistry]
Citations: 5
Identifier
DOI:10.1039/d4sc06864e
Abstract

Generating three-dimensional (3D) molecules conditioned on target structures is a cutting-edge challenge in drug discovery. Many existing approaches produce molecules with invalid configurations, unphysical conformations, suboptimal drug-likeness, and limited synthesizability, and they require long generation times. To address these challenges, we present 3DSMILES-GPT, a fully language-model-driven framework for 3D molecular generation that uses tokens exclusively. We treat both two-dimensional (2D) and 3D molecular representations as linguistic expressions, combine them into a single full-dimensional representation, and pre-train the model on a dataset of tens of millions of drug-like molecules. This token-only approach lets the model learn the 2D and 3D characteristics of molecules at large scale. We then fine-tune the model on paired structural data of protein pockets and molecules, and apply reinforcement learning to further optimize the biophysical and chemical properties of the generated molecules. Experimental results show that molecules generated by 3DSMILES-GPT outperform those of existing methods in binding affinity, drug-likeness (QED), and synthetic accessibility score (SAS). Notably, the model achieves a 33% improvement in the quantitative estimate of drug-likeness (QED) while maintaining state-of-the-art binding affinity as estimated by Vina docking. Generation is also fast, averaging approximately 0.45 seconds per molecule, a threefold speedup over the fastest existing methods. 3DSMILES-GPT therefore has the potential to positively impact the generation of 3D molecules in drug discovery.
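To make the token-only idea concrete, the following minimal sketch shows one way a 2D SMILES string and its 3D atomic coordinates could be flattened into a single token sequence for a language model. It is an illustration under stated assumptions, not the tokenization used by 3DSMILES-GPT: the regular expression, the marker tokens `<2D>`, `<3D>`, and `<EOS>`, the one-decimal coordinate precision, and all function names are hypothetical choices.

```python
import re

# Hypothetical SMILES pattern (an assumption, not the paper's vocabulary):
# bracket atoms, two-letter halogens, two-digit ring closures, then any single character.
SMILES_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize_smiles(smiles):
    """Split a SMILES string into atom, bond, and ring-closure tokens."""
    return SMILES_PATTERN.findall(smiles)


def tokenize_coords(coords, decimals=1):
    """Render each x, y, z value as a fixed-precision text token."""
    tokens = []
    for xyz in coords:
        tokens.extend(f"{value:.{decimals}f}" for value in xyz)
    return tokens


def build_sequence(smiles, coords):
    """Concatenate 2D (SMILES) and 3D (coordinate) tokens into one sequence.

    The <2D>, <3D>, and <EOS> markers are illustrative placeholders.
    """
    return (
        ["<2D>"]
        + tokenize_smiles(smiles)
        + ["<3D>"]
        + tokenize_coords(coords)
        + ["<EOS>"]
    )


# Ethanol heavy atoms with made-up coordinates, for illustration only.
sequence = build_sequence(
    "CCO",
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0)],
)
print(sequence)
```

Rounding coordinates to a fixed precision keeps the vocabulary finite, at the cost of some geometric resolution. The precision an actual model would need is not specified in the abstract.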
