An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits

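Speech recognition, Computer science, Sensory system, Thalamus, Artificial intelligence, Selective auditory attention, Artificial neural network, Neuroscience, Selective attention, Psychology, Cognition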
Authors
Kai Li,Fenghua Xie,Hang Chen,Kexin Yuan,Xiaolin Hu
Source
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence [IEEE Computer Society]
Volume (Issue): 46 (10): 6637-6651 Citations: 7
Identifier
DOI:10.1109/tpami.2024.3384034
Abstract

Audio-visual approaches involving visual inputs have laid the foundation for recent progress in speech separation. However, the optimization of the concurrent usage of auditory and visual inputs is still an active research area. Inspired by the cortico-thalamo-cortical circuit, in which the sensory processing mechanisms of different modalities modulate one another via the non-lemniscal sensory thalamus, we propose a novel cortico-thalamo-cortical neural network (CTCNet) for audio-visual speech separation (AVSS). First, the CTCNet learns hierarchical auditory and visual representations in a bottom-up manner in separate auditory and visual subnetworks, mimicking the functions of the auditory and visual cortical areas. Then, inspired by the large number of connections between cortical regions and the thalamus, the model fuses the auditory and visual information in a thalamic subnetwork through top-down connections. Finally, the model transmits this fused information back to the auditory and visual subnetworks, and the above process is repeated several times. The results of experiments on three speech separation benchmark datasets show that CTCNet remarkably outperforms existing AVSS methods with considerably fewer parameters. These results suggest that mimicking the anatomical connectome of the mammalian brain has great potential for advancing the development of deep neural networks.
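
The loop described in the abstract (separate auditory and visual processing, fusion in a thalamic stage, feedback of the fused signal to both streams, repeated several times) can be illustrated with a toy sketch. This is not the authors' CTCNet: the class name `ToyCTCLoop`, the single-layer subnetworks, the random untrained weights, the concatenate-then-project fusion, and the additive feedback are all assumptions made for illustration, and the hierarchical (multi-scale) processing mentioned in the abstract is collapsed into one layer per stream.

```python
import math
import random


def random_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights (untrained, illustrative)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def linear(weights, x):
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def tanh_vec(x):
    """Element-wise tanh nonlinearity."""
    return [math.tanh(v) for v in x]


class ToyCTCLoop:
    """Hypothetical toy model of the audio -> fuse -> feed back cycle."""

    def __init__(self, audio_dim, visual_dim, fused_dim, cycles=3, seed=0):
        rng = random.Random(seed)
        self.cycles = cycles
        # Stand-ins for the auditory and visual subnetworks (one layer each).
        self.w_audio = random_matrix(audio_dim, audio_dim, rng)
        self.w_visual = random_matrix(visual_dim, visual_dim, rng)
        # Stand-in for the thalamic subnetwork: projects the concatenated streams.
        self.w_fuse = random_matrix(fused_dim, audio_dim + visual_dim, rng)
        # Feedback projections from the fused features to each stream.
        self.w_back_audio = random_matrix(audio_dim, fused_dim, rng)
        self.w_back_visual = random_matrix(visual_dim, fused_dim, rng)

    def forward(self, audio, visual):
        a, v = list(audio), list(visual)
        for _ in range(self.cycles):
            # 1. Each modality is processed in its own subnetwork.
            a = tanh_vec(linear(self.w_audio, a))
            v = tanh_vec(linear(self.w_visual, v))
            # 2. The thalamic stage fuses both modalities.
            fused = tanh_vec(linear(self.w_fuse, a + v))
            # 3. The fused signal is sent back to both streams (additive, by assumption).
            a = [ai + bi for ai, bi in zip(a, linear(self.w_back_audio, fused))]
            v = [vi + bi for vi, bi in zip(v, linear(self.w_back_visual, fused))]
        return a, v


if __name__ == "__main__":
    model = ToyCTCLoop(audio_dim=4, visual_dim=3, fused_dim=5, cycles=3)
    audio_out, visual_out = model.forward([0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7])
    print(len(audio_out), len(visual_out))
```

The point of the sketch is only the control flow: both streams keep their own dimensionality across cycles, while the fused vector is recomputed on every pass and injected back into each stream.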