CRISPR
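Computer science
Bioinformatics
Artificial intelligence
Interpretation (philosophy)
Prioritization
Class (philosophy)
Machine learning
Computational biology
Language model
Leverage
Synthetic biology
Synthetic data
Set (abstract data type)
Genetic model
Vulnerability (computing)
Epistasis
Biology
Genetic algorithm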
Authors
Aurél Prósz, Zsófia Sztupinszki, Miklós Dióssy, Bogumil Zimon, Istvan Gyorgy Csabai, Zoltán Szállási
Identifier
DOI:10.64898/2026.01.28.702211
Abstract
Identifying clinically relevant synthetic lethal interactions has great potential for uncovering novel therapeutic vulnerabilities in cancer. Current approaches rely on machine learning models that estimate the probabilities of synthetic lethal interactions without supplying explicit knowledge of the underlying biology, and they lack a human-readable interpretation of the reasoning that leads to each prediction. Large Language Models (LLMs) represent a new class of tools capable of reasoning and of leveraging the extensive biological knowledge acquired from relevant literature during pretraining. Here, we tested multiple open-weight LLMs for their ability to predict known and novel synthetic lethal interactions. We found that most of the tested models reconstructed the results of three known genome-wide CRISPR knockout screens better than random chance. We also observed that performance was related to the parameter count of the model and that, on average, the models benefited little from additional pathway and genetic information beyond what they already possess when estimating the likelihood of a synthetic lethal relationship. After selecting the best-performing and most computationally efficient model for our use case (Qwen2.5-32B-Instruct, 0.715 AUROC), we performed an in silico screen of 398,277 gene pairs from 893 clinically relevant genes. Our goal was to highlight the potential of open-weight LLMs as scalable, context-aware prioritization tools for synthetic lethal interactions, and to lay the groundwork for predicting higher-order genetic interactions.
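The abstract does not describe the evaluation code, so the following is only a minimal sketch of the two generic steps it mentions: enumerating unordered gene pairs from a gene list, and scoring predictions against binary screen labels with AUROC. The function names `unordered_pairs` and `auroc`, the placeholder gene names, and the toy scores and labels are all assumptions made for illustration; none of them come from the paper.

```python
from itertools import combinations


def unordered_pairs(genes):
    """Return every unordered pair of distinct genes, in sorted order."""
    return list(combinations(sorted(set(genes)), 2))


def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example,
    with ties counted as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical placeholder genes, not taken from the paper.
toy_genes = ["G1", "G2", "G3", "G4"]
pairs = unordered_pairs(toy_genes)

# Hypothetical model scores and screen labels (1 = synthetic lethal).
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
print(len(pairs), auroc(toy_scores, toy_labels))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.715 should be read. The pairwise form is quadratic in the number of examples; for a screen of several hundred thousand pairs a rank-based implementation would be the more practical choice.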