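Benchmark (surveying)
Task (project management)
Computer science
Glycan
Machine learning
Artificial intelligence
Multi-task learning
Chemistry
Engineering
Geography
Cartography
Systems engineering
Glycoprotein
Biochemistry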
Authors
Minghao Xu,Yunteng Geng,Yihang Zhang,Ling Yang,J. R. Tang,Wentao Zhang
Source
Journal: Cornell University - arXiv
Date: 2024-05-25
Citations: 1
Identifier
DOI:10.48550/arxiv.2405.16206
Abstract
Glycans are basic biomolecules that perform essential functions within living organisms. The rapid growth of functional glycan data provides a good opportunity for machine learning approaches to glycan understanding. However, a standard machine learning benchmark for glycan property and function prediction is still lacking. In this work, we fill this gap by building a comprehensive benchmark for Glycan Machine Learning (GlycanML). The GlycanML benchmark consists of diverse types of tasks, including glycan taxonomy prediction, glycan immunogenicity prediction, glycosylation type prediction, and protein-glycan interaction prediction. Glycans can be represented as both sequences and graphs in GlycanML, which enables us to extensively evaluate sequence-based models and graph neural networks (GNNs) on the benchmark tasks. Furthermore, by concurrently performing eight glycan taxonomy prediction tasks, we introduce the GlycanML-MTL testbed for multi-task learning (MTL) algorithms. We also evaluate how taxonomy prediction can boost the other three function prediction tasks through MTL. Experimental results show the superiority of modeling glycans with multi-relational GNNs, and suitable MTL methods can further boost model performance. We provide all datasets and source code at https://github.com/GlycanML/GlycanML and maintain a leaderboard at https://GlycanML.github.io/project.