Lv1
30 points, joined 2024-01-10
Retaining Temporal Semantics and Relation Topologies for Continual Weakly-Supervised Audio-Visual Video Parsing
2 hours ago
Request open
GRiT: A Generative Region-to-Text Transformer for Object Understanding
13 days ago
Closed
Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization
1 month ago
Completed
MEDMCN: a novel multi-modal EfficientDet with multi-scale CapsNet for object detection
1 month ago
Completed
Audio-Visual Event Localization with Cross Co-Attention and Dynamic Audio-Object Semantic Alignment
1 month ago
Completed
Segment-level event perception with semantic dictionary for weakly supervised audio-visual video parsing
2 months ago
Completed
Toward a perceptive pretraining framework for Audio-Visual Video Parsing
2 months ago
Completed
Conv1D-LSTM: Autonomous Breast Cancer Detection Using a One-Dimensional Convolutional Neural Network with Long Short-Term Memory
2 months ago
Completed
LAVCap: LLM-based Audio-Visual Captioning using Optimal Transport
2 months ago
Completed
Toward a perceptive pretraining framework for Audio-Visual Video Parsing
2 months ago
Completed