Keywords: computer science, field (mathematics), receptive field, artificial intelligence, human–computer interaction, computer vision, mathematics, pure mathematics
DOI: 10.1016/j.eswa.2024.123359
Abstract
Long-range dependency modeling has been demonstrated to be an effective technique for boosting the performance of channel-wise attention. As a state-of-the-art, lightweight attention mechanism, coordinate attention (CA) captures long-range dependencies in its context-modeling phase. However, its subsequent transformation phase captures only inter-channel dependencies, so the receptive field of the overall dependency modeling remains a line (a single row or column). To expand the receptive field of CA, this paper proposes a novel attention mechanism called patch-enhanced attention (PEA). The transformation phase is redesigned by embedding a pair of unfolding and folding operations into the thin waist of the hourglass structure, which expands the receptive field along each coordinate orientation from a line to a patch. Extensive experiments on the ImageNet and Pascal VOC benchmarks validate the effectiveness of the proposed PEA mechanism: compared with CA, PEA achieves state-of-the-art performance with fewer parameters.
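The abstract does not spell out the implementation, but the mechanism it describes can be illustrated in code. Below is a minimal PyTorch sketch of a PEA-style block under stated assumptions: it follows the standard coordinate-attention structure (directional pooling, a shared 1x1 squeeze into the thin waist, per-direction excitation), and realizes the unfold/fold pair as an F.unfold over a k-sized neighborhood along each coordinate axis followed by a 1x1 convolution that folds the gathered neighbors back. The class name PatchEnhancedAttention, the patch size k, and the conv_patch layer are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchEnhancedAttention(nn.Module):
    """Sketch of a PEA-style block: coordinate attention whose transformation
    phase mixes a k-sized neighborhood via unfold/fold, expanding the
    receptive field along each coordinate orientation from a line to a patch.
    (Hypothetical realization; only the high-level idea is from the paper.)"""

    def __init__(self, channels, reduction=32, k=3):
        super().__init__()
        mid = max(8, channels // reduction)  # width of the thin waist
        self.k = k
        # Context-modeling phase (as in CA): a shared 1x1 conv squeezes the
        # concatenated directional descriptors into the thin waist.
        self.conv_squeeze = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        # Transformation phase: 1x1 conv applied to unfolded k-neighborhoods
        # ("fold" step of the assumed unfold/fold pair).
        self.conv_patch = nn.Conv2d(mid * k, mid, kernel_size=1)
        # Per-direction 1x1 convs restore the channel dimension.
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def _patch_mix(self, y):
        # y: (N, mid, L, 1), a directional descriptor. Unfold gathers each
        # position's k neighbors along the coordinate axis; the 1x1 conv
        # folds them back, so each position now sees a patch, not a point.
        n, c, l, _ = y.shape
        cols = F.unfold(y, kernel_size=(self.k, 1), padding=(self.k // 2, 0))
        cols = cols.view(n, c * self.k, l, 1)
        return self.conv_patch(cols)

    def forward(self, x):
        n, c, h, w = x.shape
        # Directional pooling: (N, C, H, 1) and (N, C, W, 1).
        x_h = x.mean(dim=3, keepdim=True)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)
        y = torch.cat([x_h, x_w], dim=2)             # (N, C, H+W, 1)
        y = self.act(self.bn(self.conv_squeeze(y)))  # thin waist
        y_h, y_w = torch.split(y, [h, w], dim=2)
        # Patch enhancement inside the waist.
        y_h = self._patch_mix(y_h)
        y_w = self._patch_mix(y_w)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (N, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w)).permute(0, 1, 3, 2)  # (N, C, 1, W)
        return x * a_h * a_w

# Usage: drop-in channel/spatial reweighting, output shape equals input shape.
x = torch.randn(2, 64, 32, 32)
print(PatchEnhancedAttention(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```

Because the neighborhood mixing happens inside the reduced-width waist, the extra cost of conv_patch scales with mid * k rather than with the full channel count, which is consistent with the abstract's claim that PEA stays lighter than CA in parameters.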