Keywords
Interpretability
Computer science
Convolutional neural network
Robustness
Artificial intelligence
Exponential growth
Sequence motif
Pattern recognition
Machine learning
Mathematics
Biology
DNA
Gene
Genetics
Mathematical analysis
Authors
Peter K. Koo, Matt Ploenzke
Identifier
DOI: 10.1038/s42256-020-00291-x
Abstract
Deep convolutional neural networks (CNNs) trained on regulatory genomic sequences tend to build representations in a distributed manner, making it a challenge to extract learned features that are biologically meaningful, such as sequence motifs. Here we perform a comprehensive analysis of synthetic sequences to investigate the role that CNN activations play in model interpretability. We show that employing an exponential activation in the first-layer filters consistently leads to interpretable and robust representations of motifs compared with other commonly used activations. Strikingly, we demonstrate that CNNs with better test performance do not necessarily yield more interpretable representations with attribution methods. We find that CNNs with exponential activations significantly improve the efficacy of recovering biologically meaningful representations with attribution methods. We demonstrate that these results generalize to real DNA sequences across several in vivo datasets. Together, this work demonstrates how a small modification to existing CNNs (that is, setting exponential activations in the first layer) can substantially improve the robustness and interpretability of learned representations, directly in convolutional filters and indirectly with attribution methods.

Model interpretability is important in genomics. Koo and Ploenzke show that exponential activations in the first layer of convolutional neural networks lead to interpretable and robust representations of genomic sequence motifs.
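The modification the abstract describes is a one-line architectural change: replace the usual ReLU on the first convolutional layer with an exponential activation. As a rough illustration, here is a minimal sketch in TensorFlow/Keras; the framework choice, filter count, filter width, pooling size, and dense head are illustrative assumptions, not the authors' published architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_cnn(seq_len=200, num_filters=32, filter_width=19):
    # One-hot-encoded DNA input: seq_len positions x 4 channels (A, C, G, T).
    inputs = layers.Input(shape=(seq_len, 4))
    # The modification highlighted in the abstract: first-layer filters with an
    # exponential activation. Keras ships a built-in "exponential" activation,
    # so this is a one-word change from the usual activation="relu".
    x = layers.Conv1D(num_filters, filter_width, padding="same",
                      activation="exponential")(inputs)
    # Illustrative downstream head: pool, flatten, and classify.
    x = layers.MaxPooling1D(pool_size=25)(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation="relu")(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)  # e.g. motif present/absent
    return Model(inputs, outputs)

model = build_cnn()
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
```

The intuition suggested by the abstract is that an exponential nonlinearity sharply amplifies strong filter responses relative to weak ones, encouraging individual first-layer filters to capture whole motifs rather than distributing them across many filters.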