Computer science
Artificial intelligence
Fault (geology)
Computer vision
Geology
Seismology
Authors
Jiangran Liu, Rujiang Hao, Jianwei Liang, Haifeng Su
Identifier
DOI:10.1088/2631-8695/adf0c4
Abstract
Convolutional Neural Networks (CNNs) struggle to capture long-distance dependencies because of their limited receptive fields, and the Vision Transformer (ViT) has high computational complexity. To address both problems, this paper proposes a gearbox fault diagnosis method that combines the Continuous Wavelet Transform (CWT) with Vision Mamba (Vim). First, the vibration signal is converted into a time-frequency image by CWT to characterize the fault signal in the time-frequency domain. Next, the time-frequency image is segmented into image patches, which are linearly projected and fed into the Vim encoder. The encoder is based on the bidirectional scanning mechanism of the state space model (SSM) and achieves efficient parallel computation through local window attention, capturing global visual context while avoiding the high computational complexity of ViT's global self-attention. Finally, a softmax classifier outputs the fault pattern recognition results. Comparison experiments show that the proposed method achieves an average fault recognition rate of 99.86%, significantly better than that of traditional CNN and ViT models, verifying its strong generalization ability and efficiency advantage in complex industrial scenarios.
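The front end of the pipeline described in the abstract (signal to time-frequency image, image to patches, patches to projected tokens) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the complex Morlet wavelet, the sampling rate, the scale range, the patch size, the embedding width, and the random projection weights are all assumptions chosen for illustration, and the Vim encoder and softmax classifier are omitted.

```python
import numpy as np

def morlet_cwt(signal, scales, w0=6.0):
    """Magnitude scalogram via direct convolution with a complex Morlet wavelet.

    Assumption: a plain Morlet wavelet (no admissibility correction term);
    the paper does not state which mother wavelet it uses.
    """
    n = len(signal)
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        half = int(np.ceil(4 * s))            # support of +/- 4 envelope widths
        t = np.arange(-half, half + 1) / s    # time axis rescaled by the scale
        wavelet = np.exp(1j * w0 * t) * np.exp(-0.5 * t**2) / np.sqrt(s)
        full = np.convolve(signal, wavelet, mode="full")
        out[i] = np.abs(full[half:half + n])  # keep the centered n samples
    return out

def patchify(image, patch):
    """Split an (H, W) image into non-overlapping, flattened patch x patch tiles."""
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    tiles = image.reshape(h // patch, patch, w // patch, patch)
    return tiles.transpose(0, 2, 1, 3).reshape(-1, patch * patch)

# Hypothetical vibration signal: two tones plus noise, sampled at 1024 Hz.
rng = np.random.default_rng(0)
fs = 1024.0
t = np.arange(256) / fs
signal = (np.sin(2 * np.pi * 100 * t)
          + 0.5 * np.sin(2 * np.pi * 300 * t)
          + 0.1 * rng.standard_normal(t.size))

# Step 1: CWT -> time-frequency image with 32 scales x 256 time samples.
scales = np.linspace(2, 16, 32)
scalogram = morlet_cwt(signal, scales)

# Step 2: segment the image into 16 x 16 patches (32 patches of 256 values).
patches = patchify(scalogram, 16)

# Step 3: linear projection to token embeddings (random weights stand in
# for a learned projection; embed_dim = 64 is an arbitrary choice).
embed_dim = 64
W = rng.standard_normal((patches.shape[1], embed_dim)) * 0.02
b = np.zeros(embed_dim)
tokens = patches @ W + b

print(scalogram.shape, patches.shape, tokens.shape)
```

In a full model the resulting token sequence would be passed to the Vim encoder, whose output feeds the softmax classifier. In practice a library routine such as PyWavelets' `pywt.cwt` would likely replace the hand-written transform.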