Occam
Computer science
Artificial neural network
Generalization
Channel (broadcasting)
Regularization (linguistics)
Norm (philosophy)
Artificial intelligence
Algorithm
Matrix norm
Channel capacity
Matrix (chemical analysis)
Mathematical optimization
Theoretical computer science
Mathematics
Eigenvector
Telecommunications
Law
Programming language
Materials science
Physics
Composite material
Mathematical analysis
Quantum mechanics
Political science
Identifier
DOI:10.1007/978-3-030-86380-7_21
Abstract
Occam's Razor suggests a preference for simpler models and raises an enduring question: what is the proper definition of the complexity of a model? In this work, we regard neural networks as communication channels and measure the complexity of a neural network by its channel capacity—the maximum information retained in the output of the network. Furthermore, we show a connection between the L2 norm of the weight matrix of a linear model and its channel capacity through the singular values of the weight matrix. On image classification problems, we find that regularizing different neural networks by constraining their channel capacity effectively boosts generalization performance and outperforms other information-theoretic regularization methods.
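The abstract ties the L2 norm of a linear model's weight matrix to channel capacity through its singular values and proposes regularizing networks by constraining that capacity. The sketch below is only an illustration of this general idea, not the paper's exact formulation: it penalizes a Gaussian-channel-style capacity term, 0.5 * sum_i log(1 + sigma_i^2), computed from the singular values of a linear layer's weights. The penalty form, the layer choice, and the coefficient `lam` are assumptions made for the example.

```python
# Illustrative capacity-style regularizer built from the singular values of a
# linear layer's weight matrix. The 0.5 * log(1 + sigma^2) form is borrowed from
# the Gaussian channel capacity formula and is an assumption, not the paper's
# stated penalty.
import torch
import torch.nn as nn

def capacity_penalty(weight: torch.Tensor) -> torch.Tensor:
    """Sum of 0.5 * log(1 + sigma_i^2) over the singular values of `weight`."""
    sigma = torch.linalg.svdvals(weight)      # singular values of W (differentiable)
    return 0.5 * torch.log1p(sigma ** 2).sum()

# Toy usage: add the penalty to a standard classification loss.
model = nn.Linear(784, 10)                    # hypothetical linear classifier
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(32, 784)                      # dummy batch of flattened images
y = torch.randint(0, 10, (32,))
lam = 1e-3                                    # regularization strength (assumed)

loss = criterion(model(x), y) + lam * capacity_penalty(model.weight)
loss.backward()
optimizer.step()
```

In this sketch the capacity term grows with the singular values, so minimizing it pushes the weight spectrum down, which is one way the claimed link between the L2 norm, the singular values, and channel capacity could be exploited as a regularizer.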