Machine-Learning-Based Data Analysis Method for Cell-Based Selection of DNA-Encoded Libraries

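Keywords: Computer science, Selection (genetic algorithm), Probabilistic logic, Function (biology), Artificial intelligence, Metric (unit), Machine learning, Data mining, Computational biology, Biology, Engineering, Operations management, Evolutionary biology
Authors
Rui Hou, Changsheng Xie, Yuhan Gui, Gang Li, Xiaoyu Li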
Source
Journal: ACS Omega [American Chemical Society]
Volume (Issue): 8 (21): 19057-19071 Citations: 2
Identifier
DOI: 10.1021/acsomega.3c02152
Abstract

DNA-encoded library (DEL) is a powerful ligand discovery technology that has been widely adopted in the pharmaceutical industry. DEL selections are typically performed with a purified protein target immobilized on a matrix or in solution phase. Recently, DELs have also been used to interrogate the targets in the complex biological environment, such as membrane proteins on live cells. However, due to the complex landscape of the cell surface, the selection inevitably involves significant nonspecific interactions, and the selection data are much noisier than the ones with purified proteins, making reliable hit identification highly challenging. Researchers have developed several approaches to denoise DEL datasets, but it remains unclear whether they are suitable for cell-based DEL selections. Here, we report the proof-of-principle of a new machine-learning (ML)-based approach to process cell-based DEL selection datasets by using a Maximum A Posteriori (MAP) estimation loss function, a probabilistic framework that can account for and quantify uncertainties of noisy data. We applied the approach to a DEL selection dataset, where a library of 7,721,415 compounds was selected against a purified carbonic anhydrase 2 (CA-2) and a cell line expressing the membrane protein carbonic anhydrase 12 (CA-12). The extended-connectivity fingerprint (ECFP)-based regression model using the MAP loss function was able to identify true binders and also reliable structure-activity relationship (SAR) from the noisy cell-based selection datasets. In addition, the regularized enrichment metric (known as MAP enrichment) could also be calculated directly without involving the specific machine-learning model, effectively suppressing low-confidence outliers and enhancing the signal-to-noise ratio. Future applications of this method will focus on de novo ligand discovery from cell-based DEL selections.
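
The abstract does not give the formula behind the MAP enrichment metric, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a Poisson model for the observed sequencing count and a Gamma prior on the enrichment factor; the function names, the `expected` argument, and the default `alpha` and `beta` values are all hypothetical choices made for this example. Under those assumptions the posterior is again a Gamma distribution, and its mode (the MAP estimate) pulls low-count compounds toward the prior while leaving well-sampled compounds close to their raw ratio.

```python
def naive_enrichment(count: float, expected: float) -> float:
    """Raw enrichment: observed count divided by the count expected without enrichment."""
    return count / expected


def map_enrichment(count: float, expected: float,
                   alpha: float = 2.0, beta: float = 1.0) -> float:
    """MAP estimate of an enrichment factor (illustrative assumption, not from the paper).

    Assumed model:
        count ~ Poisson(enrichment * expected)
        enrichment ~ Gamma(shape=alpha, rate=beta)
    The posterior is Gamma(alpha + count, beta + expected), whose mode is
    (alpha + count - 1) / (beta + expected) when alpha + count >= 1.
    With the defaults alpha=2, beta=1 the prior mode is 1, i.e. "no enrichment".
    """
    if count < 0 or expected < 0:
        raise ValueError("count and expected must be non-negative")
    shape = alpha + count
    rate = beta + expected
    # The mode of a Gamma distribution is 0 when its shape is below 1.
    return max(shape - 1.0, 0.0) / rate


if __name__ == "__main__":
    # Two hypothetical compounds with the same raw enrichment of 6,
    # one supported by very few reads and one by many.
    for label, count, expected in [("low-count", 3, 0.5), ("high-count", 60, 10.0)]:
        print(label,
              "naive:", naive_enrichment(count, expected),
              "MAP:", round(map_enrichment(count, expected), 2))
```

Both hypothetical compounds have the same raw ratio, but the MAP value of the low-count compound is shrunk much more strongly toward 1, which is one way a regularized metric can suppress low-confidence outliers as the abstract describes.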