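Causal inference
Inference
Computer science
Leverage (statistics)
Counterfactual thinking
Machine learning
Artificial intelligence
Context (archaeology)
Attribution
Confounding
Feature (linguistics)
Data mining
Data science
Econometrics
Biology
Psychology
Mathematics
Social psychology
Paleontology
Linguistics
Statistics
Philosophy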
Authors
Payam Dibaeinia, Abhishek Ojha, Saurabh Sinha
Source
Journal: Science Advances
[American Association for the Advancement of Science]
Date: 2025-02-14
Volume (issue): 11 (7)
Identifier
DOI: 10.1126/sciadv.adk0837
Abstract
The discovery of molecular relationships from high-dimensional data is a major open problem in bioinformatics. Machine learning and feature attribution models have shown great promise in this context but lack causal interpretation. Here, we show that a popular feature attribution model, under certain assumptions, estimates an average of a causal quantity reflecting the direct influence of one variable on another. We leverage this insight to propose a precise definition of a gene regulatory relationship and implement a new tool, CIMLA (Counterfactual Inference by Machine Learning and Attribution Models), to identify differences in gene regulatory networks between biological conditions, a problem that has received great attention in recent years. Using extensive benchmarking on simulated data, we show that CIMLA is more robust to confounding variables and is more accurate than leading methods. Last, we use CIMLA to analyze a previously published single-cell RNA sequencing dataset from subjects with and without Alzheimer’s disease (AD), discovering several potential regulators of AD.