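Modal verb
Property (philosophy)
Chain (unit)
Computer science
Fusion
Natural language processing
Artificial intelligence
Chemistry
Linguistics
Philosophy
Physics
Epistemology
Astronomy
Polymer chemistry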
Authors
Chang Jin, Siyuan Guo, Shuigeng Zhou, Jihong Guan
Identifier
DOI:10.1021/acs.jcim.5c00577
Abstract
Molecular property prediction (MPP) plays a critical role in drug design and discovery. Because molecular data are multimodal (e.g., 1D SMILES strings and 2D molecular graphs), multimodal information fusion generally achieves better performance than using a single modality. With the rise of large language models (LLMs), increasing efforts have also been made to leverage LLMs for molecular property prediction. However, existing works usually use only one or two modalities of molecular data and integrate them with simple techniques such as straightforward concatenation and summation, so they cannot comprehensively exploit the rich complementary information across modalities. Furthermore, these models are typically designed for general chemical tasks, making their performance in molecular property prediction suboptimal. Worse still, they cannot provide explainable results, which are important for drug design and discovery-related tasks including MPP. In response to these limitations, this paper presents LLM-MPP, a new, effective, and explainable LLM-driven multimodal method for drug molecular property prediction, which leverages 1D SMILES strings, 2D molecular graph structures, and textual descriptions of molecular properties as training data. By incorporating the chain-of-thought (CoT) technique, we enhance the interpretability and transparency of the proposed method, while promoting alignment and feature extraction across multiple modalities. Cross-attention and contrastive learning are adopted to effectively fuse multimodal molecular representations for property prediction. Experiments on nine benchmark data sets for molecular property prediction demonstrate that our method achieves state-of-the-art performance on five of the nine data sets and ranks second on one, surpassing 22 existing baselines. Ablation experiments validate the effectiveness of the proposed modules in addressing the limitations of existing models.
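The abstract names cross-attention and contrastive learning as the fusion mechanisms but gives no implementation details. As a rough illustration only, the sketch below shows what these two generic techniques look like in their simplest form: single-head scaled dot-product cross-attention without learned projections, and an InfoNCE-style contrastive loss. All function names, the toy embeddings, and the temperature value are assumptions made for this example; this is not the LLM-MPP architecture.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention(queries, keys, values):
    # Hypothetical helper: each query row (e.g. a SMILES token embedding)
    # attends over key/value rows (e.g. graph node embeddings).
    # Simplification: one head, no learned Q/K/V projection matrices.
    scale = math.sqrt(len(keys[0]))
    fused = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        fused.append([sum(w * v[j] for w, v in zip(weights, values))
                      for j in range(len(values[0]))])
    return fused

def normalize(v):
    # L2-normalize a vector (assumes it is non-zero).
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def info_nce(a_batch, b_batch, temperature=0.1):
    # InfoNCE-style loss: a_batch[i] and b_batch[i] are two views of
    # molecule i; every other pairing in the batch acts as a negative.
    a = [normalize(v) for v in a_batch]
    b = [normalize(v) for v in b_batch]
    loss = 0.0
    for i, ai in enumerate(a):
        probs = softmax([dot(ai, bj) / temperature for bj in b])
        loss -= math.log(probs[i])
    return loss / len(a)

# Toy embeddings (made up for illustration).
smiles_tokens = [[1.0, 0.0], [0.0, 1.0]]
graph_nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(smiles_tokens, graph_nodes, graph_nodes)

aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setup the loss for correctly paired views (`aligned`) is lower than for mismatched pairs (`shuffled`), which is the signal contrastive training uses to pull representations of the same molecule together across modalities. A real model would add learned projections, multiple heads, and batched tensor operations.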