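Computer science
Distraction
Artificial intelligence
Machine learning
Computer vision
Pattern recognition (psychology)
Cognitive psychology
Psychology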
Authors
Yingzhi Zhang,Taiguo Li,Chao Li,Xinghong Zhou
Identifier
DOI:10.1109/jiot.2023.3308921
Abstract
Driver distraction causes a significant number of traffic accidents every year, resulting in economic losses and casualties. Commercial vehicles are still far from fully unmanned operation, and drivers continue to play an important role in operating and controlling the vehicle. Driver distraction behavior detection is therefore crucial for road safety. At present, driver distraction detection relies primarily on traditional convolutional neural networks (CNNs) and supervised learning methods. Challenges remain, however, including the high cost of labeled data sets, a limited ability to capture high-level semantic information, and weak generalization performance. To address these problems, this article proposes a new self-supervised learning method based on masked image modeling (MIM) for driver distraction behavior detection. First, a self-supervised MIM framework is introduced to reduce the heavy labor and material cost of data set labeling. Second, the Swin Transformer is employed as the encoder. Performance is enhanced by reconfiguring the Swin Transformer block and adjusting how the window multihead self-attention (W-MSA) and shifted W-MSA (SW-MSA) detection heads are distributed across all stages, which also makes the model more lightweight. Finally, various data augmentation strategies are combined with the best-performing random masking strategy to strengthen the model's recognition and generalization ability. Test results on a large-scale driver distraction behavior data set show that the proposed self-supervised learning method achieves an accuracy of 99.60%, approaching the performance of advanced supervised learning methods. Our code is publicly available at github.com/Rocky1salady-killer/SL-DDBD.
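The abstract does not specify how the random masking is carried out, so the following is only a minimal sketch of patch-level random masking as it is commonly done in masked image modeling, not the authors' implementation. The function names (`random_patch_mask`, `masked_l1_loss`), the 224-pixel image size, the 32-pixel mask patch size, and the 0.6 mask ratio are all illustrative assumptions.

```python
import numpy as np


def random_patch_mask(img_size=224, patch_size=32, mask_ratio=0.6, seed=None):
    """Return a boolean (img_size, img_size) array; True marks masked pixels.

    All default values are illustrative assumptions, not taken from the paper.
    """
    if img_size % patch_size != 0:
        raise ValueError("img_size must be divisible by patch_size")
    rng = np.random.default_rng(seed)
    grid = img_size // patch_size              # patches per side
    num_patches = grid * grid
    num_masked = int(round(mask_ratio * num_patches))

    # Pick the masked patches uniformly at random, without replacement.
    flat = np.zeros(num_patches, dtype=bool)
    flat[rng.choice(num_patches, size=num_masked, replace=False)] = True
    patch_mask = flat.reshape(grid, grid)

    # Expand each patch decision to pixel resolution.
    return patch_mask.repeat(patch_size, axis=0).repeat(patch_size, axis=1)


def masked_l1_loss(prediction, target, pixel_mask):
    """Mean absolute error computed over masked pixels only.

    prediction, target: (H, W, C) arrays; pixel_mask: (H, W) boolean array.
    """
    return float(np.abs(prediction - target)[pixel_mask].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((224, 224, 3))          # stand-in for a driver image
    mask = random_patch_mask(seed=0)

    corrupted = image.copy()
    corrupted[mask] = 0.0                      # hide the masked patches

    # An encoder-decoder would predict the hidden pixels from `corrupted`;
    # here the corrupted image itself stands in for that prediction.
    print("masked fraction:", mask.mean())
    print("loss:", masked_l1_loss(corrupted, image, mask))
```

In a self-supervised setup of this kind, the encoder sees only the corrupted image and is trained to reconstruct the hidden patches, so no distraction labels are needed during pre-training. Labels are used only when the pre-trained encoder is fine-tuned for classification.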