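Crowd
Extractor
Benchmark (surveying)
Computer science
Artificial intelligence
Feature (linguistics)
Feature extraction
Deep learning
Artificial neural network
Pattern recognition (psychology)
Machine learning
Engineering
Computer security
Philosophy
Geodesy
Linguistics
Geography
Process engineering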
Authors
Mohamed Mostafa Soliman,Mohamed Hussein Kamal,Mina Abd El-Massih Nashed,Youssef Mohamed Mostafa,Bassel S. Chawky,Dina Khattab
Identifier
DOI:10.1109/icicis46948.2019.9014714
Abstract
Automatic recognition of violence between individuals or crowds in videos is of broad interest. This work proposes an end-to-end deep neural network model for recognizing violence in videos. The model uses a VGG-16 pre-trained on ImageNet as a spatial feature extractor, followed by a Long Short-Term Memory (LSTM) network as a temporal feature extractor and a sequence of fully connected layers for classification. The achieved accuracy is near state-of-the-art. We also introduce a new benchmark, Real-Life Violence Situations, which contains 2000 short videos: 1000 violent and 1000 non-violent. The new benchmark is used to fine-tune the proposed models, achieving a best accuracy of 88.2%.
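The data flow described in the abstract (per-frame spatial features, then an LSTM over time, then a classifier) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the `SpatialExtractor` class is a hypothetical stand-in for the pre-trained VGG-16 (a single random linear layer with ReLU), a single fully connected layer replaces the paper's sequence of layers, biases are omitted, all weights are untrained, and every class name and dimension is an assumption chosen for illustration.

```python
import math
import random

def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained; for illustration only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class SpatialExtractor:
    """Hypothetical stand-in for VGG-16: one linear layer + ReLU per frame."""
    def __init__(self, frame_dim, feat_dim, rng):
        self.w = make_matrix(feat_dim, frame_dim, rng)

    def __call__(self, frame):
        return [max(0.0, z) for z in matvec(self.w, frame)]

class LSTMCell:
    """Standard LSTM cell (input, forget, output gates plus candidate), no biases."""
    def __init__(self, in_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate, applied to the concatenation [x; h_prev].
        self.w = {g: make_matrix(hidden_dim, in_dim + hidden_dim, rng) for g in "ifog"}

    def step(self, x, h, c):
        xh = x + h  # list concatenation
        i = [sigmoid(z) for z in matvec(self.w["i"], xh)]
        f = [sigmoid(z) for z in matvec(self.w["f"], xh)]
        o = [sigmoid(z) for z in matvec(self.w["o"], xh)]
        g = [math.tanh(z) for z in matvec(self.w["g"], xh)]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

class ViolenceClassifier:
    """Spatial extractor -> LSTM over frames -> fully connected layer -> sigmoid."""
    def __init__(self, frame_dim=12, feat_dim=8, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.spatial = SpatialExtractor(frame_dim, feat_dim, rng)
        self.lstm = LSTMCell(feat_dim, hidden_dim, rng)
        self.fc = make_matrix(1, hidden_dim, rng)

    def predict_proba(self, video):
        """video: list of frames, each a flat list of floats; returns P(violence)."""
        h = [0.0] * self.lstm.hidden_dim
        c = [0.0] * self.lstm.hidden_dim
        for frame in video:
            h, c = self.lstm.step(self.spatial(frame), h, c)
        # Classify from the final hidden state of the sequence.
        return sigmoid(matvec(self.fc, h)[0])
```

Calling `ViolenceClassifier().predict_proba(video)` on a list of flattened frames returns a value between 0 and 1. With untrained weights this value is meaningless; in practice the spatial stage would be a real pre-trained convolutional network and the whole model would be fine-tuned on labeled videos.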