Keywords
Computer science, Artificial intelligence, Bottleneck, Voxel, Feature extraction, Event (particle physics), Pattern recognition (psychology), Transformer, Source code, Data mining, Physics, Quantum mechanics, Voltage, Embedded system, Operating system
Authors
Chunrong Yuan, Yu J, Zongzhen Wu, Fangjie Wei, Yangzirui Wang, Li Hui Chen, Xiao Wang
Identifier
DOI:10.1007/978-981-99-8429-9_1
Abstract
Recognizing target objects with an event-based camera has drawn increasing attention in recent years. Existing works usually represent event streams as point clouds, voxels, images, etc., and learn feature representations with various deep neural networks. Their final results may be limited by two factors: monotonous modal representations and the design of the network structure. To address these challenges, this paper proposes a novel dual-stream framework for event representation, extraction, and fusion. The framework simultaneously models two common representations: event images and event voxels. Using Transformer and structured Graph Neural Network (GNN) architectures, spatial information and three-dimensional stereo information are learned separately. Additionally, a bottleneck Transformer is introduced to fuse the dual-stream information. Extensive experiments demonstrate that the proposed framework achieves state-of-the-art performance on two widely used event-based classification datasets. The source code of this work is available at: https://github.com/Event-AHU/EFV_event_classification
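The bottleneck-Transformer fusion described in the abstract can be illustrated with a minimal sketch: a small set of shared "bottleneck" tokens attends jointly with each stream's tokens, so cross-stream information can only flow through that narrow channel. The function names, token counts, and single shared projection matrices below are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, Wq, Wk, Wv):
    # Plain single-head scaled dot-product attention over a token set.
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return scores @ v

def bottleneck_fusion(img_tokens, vox_tokens, bottleneck, Wq, Wk, Wv, steps=2):
    """Hypothetical sketch: bottleneck tokens shuttle information between
    the event-image stream and the event-voxel stream. Each stream only
    attends to itself plus the shared bottleneck tokens."""
    n_img, n_vox = len(img_tokens), len(vox_tokens)
    for _ in range(steps):
        # Image stream + bottleneck attend jointly; bottleneck absorbs image info.
        a = self_attention(np.vstack([img_tokens, bottleneck]), Wq, Wk, Wv)
        img_tokens, bottleneck = a[:n_img], a[n_img:]
        # Voxel stream + (updated) bottleneck attend jointly.
        b = self_attention(np.vstack([vox_tokens, bottleneck]), Wq, Wk, Wv)
        vox_tokens, bottleneck = b[:n_vox], b[n_vox:]
    return img_tokens, vox_tokens, bottleneck

# Toy usage: 4 image tokens, 5 voxel tokens, 2 bottleneck tokens, dim 8.
rng = np.random.default_rng(0)
d = 8
img, vox, btl, out = bottleneck_fusion(
    rng.normal(size=(4, d)), rng.normal(size=(5, d)), rng.normal(size=(2, d)),
    rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=(d, d)),
)[0], None, None, None  # placeholder unpack replaced below
img, vox, btl = bottleneck_fusion(
    rng.normal(size=(4, d)), rng.normal(size=(5, d)), rng.normal(size=(2, d)),
    rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=(d, d)),
)
```

Restricting cross-modal exchange to a few bottleneck tokens keeps attention cost low compared with full concatenation of both token sets, which is the usual motivation for bottleneck-style fusion.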