Simplicity (philosophy)
Perturbation (astronomy)
Computer science
Artificial intelligence
Econometrics
Computational biology
Mathematics
Statistical physics
Machine learning
Algorithm
Biology
Physics
Philosophy
Epistemology
Quantum mechanics
Authors
Constantin Ahlmann-Eltze, Wolfgang Huber, Simon Anders
Identifiers
DOI: 10.1101/2024.09.16.613342
Abstract
Advanced deep-learning methods, such as foundation models, promise to learn representations of biology that can be employed to predict in silico the outcome of unseen experiments, such as the effect of genetic perturbations on the transcriptomes of human cells. To see whether current models already reach this goal, we benchmarked five foundation models and two other deep learning models against deliberately simplistic linear baselines. For combinatorial perturbations of two genes for which only the individual single perturbations had been seen, we find that the deep learning-based approaches did not perform better than a simple additive model. For perturbations of genes that had not yet been seen, the deep learning-based approaches did not outperform the baseline of predicting the mean across the training perturbations. We hypothesize that the poor performance is partially because the pre-training data is observational; we show that a simple linear model reliably outperforms all other models when pre-trained on another perturbation dataset. While the promise of deep neural networks for the representation of biological systems and prediction of experimental outcomes is plausible, our work highlights the need for clear setting of objectives and for critical benchmarking to direct research efforts.
Contact: constantin.ahlmann@embl.de
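To make the two linear baselines from the abstract concrete, the following is a minimal sketch, assuming each perturbation is summarized as a vector of mean expression values per gene; the function and variable names are illustrative and not taken from the authors' implementation.

import numpy as np

def additive_baseline(ctrl, single_a, single_b):
    # Predict a double perturbation as control plus the sum of the two
    # single-perturbation effects, each measured relative to control:
    # ctrl + (single_a - ctrl) + (single_b - ctrl)
    return single_a + single_b - ctrl

def mean_baseline(training_profiles):
    # For a perturbation of a gene never seen in training, predict the
    # mean expression profile across all training perturbations.
    return np.mean(training_profiles, axis=0)

# Toy usage with hypothetical mean expression profiles over three genes:
ctrl = np.array([1.0, 2.0, 3.0])            # unperturbed control
ko_a = np.array([0.5, 2.0, 3.5])            # single perturbation of gene A
ko_b = np.array([1.0, 1.0, 3.0])            # single perturbation of gene B
print(additive_baseline(ctrl, ko_a, ko_b))  # predicted A+B double perturbation
print(mean_baseline(np.stack([ko_a, ko_b])))

The point of such baselines is that they encode almost no biology, so any model that fails to beat them has not demonstrably learned a useful representation of perturbation effects.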