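Computer science
Term (time)
Feature (linguistics)
Fusion
Artificial intelligence
Pattern recognition (psychology)
Anomaly detection
Computer vision
Physics
Linguistics
Quantum mechanics
Philosophy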
Authors
Sareer Ul Amin, Bumsoo Kim, Yong-Hoon Jung, Sanghyun Seo, Sangoh Park
Identifier
DOI:10.1002/aisy.202300706
Abstract
Surveillance cameras produce vast amounts of video data, posing a challenge for analysts due to the infrequent occurrence of unusual events. To address this, intelligent surveillance systems leverage AI and computer vision to automatically detect anomalies. This study proposes an innovative method combining 3D convolutions and long short‐term memory (LSTM) modules to capture spatiotemporal features in video data. Notably, a structured coarse‐level feature fusion mechanism enhances generalization and mitigates the issue of vanishing gradients. Unlike traditional convolutional neural networks, the approach employs depth‐wise feature stacking, reducing computational complexity and enhancing the architecture. Additionally, it integrates microautoencoder blocks for downsampling, eliminates the computational load of ConvLSTM2D layers, and employs frequent feature concatenation blocks during upsampling to preserve temporal information. Integrating a Conv‐LSTM module at the down‐ and upsampling stages enhances the model's ability to capture short‐ and long‐term temporal features, resulting in a 42‐layer network while maintaining robust performance. Experimental results demonstrate significant reductions in false alarms and improved accuracy compared to contemporary methods, with enhancements of 2.7%, 0.6%, and 3.4% on the UCSDPed1, UCSDPed2, and Avenue datasets, respectively.
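The abstract mentions feature concatenation during upsampling but gives no layer-level detail, so the following is only a minimal NumPy sketch of the general idea, not the authors' network. It downsamples a video clip, upsamples it again, and concatenates the result with the earlier full-resolution features along the channel axis. The function names (`downsample`, `upsample`, `fuse`), the 2x2 average pooling, the nearest-neighbour upsampling, and the tensor sizes are all assumptions made for illustration.

```python
import numpy as np


def downsample(x):
    """Halve height and width with 2x2 average pooling (assumed operator).

    x has shape (frames, height, width, channels); height and width must be even.
    """
    t, h, w, c = x.shape
    return x.reshape(t, h // 2, 2, w // 2, 2, c).mean(axis=(2, 4))


def upsample(x):
    """Double height and width by nearest-neighbour repetition (assumed operator)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def fuse(decoder_feat, encoder_feat):
    """Concatenate two feature maps of equal size along the channel axis."""
    return np.concatenate([decoder_feat, encoder_feat], axis=-1)


# Hypothetical clip: 8 frames of 32x32 features with 4 channels.
clip = np.random.default_rng(0).random((8, 32, 32, 4))

coarse = downsample(clip)     # shape (8, 16, 16, 4)
restored = upsample(coarse)   # shape (8, 32, 32, 4)
fused = fuse(restored, clip)  # shape (8, 32, 32, 8)
```

In this sketch the second half of the fused channels is the untouched input, which illustrates how concatenation can carry fine-grained information past a lossy downsampling step; the frame axis is left unchanged throughout.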