TY - GEN
T1 - An Efficient Violence Detection Approach for Smart Cities Surveillance System
AU - Khan, Mustaqeem
AU - Gueaieb, Wail
AU - Saddik, Abdulmotaleb El
AU - De Masi, Giulia
AU - Karray, Fakhri
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Detecting violence in surveillance videos is a crucial task in activity recognition, with wide-ranging applications in unmanned aerial vehicles (UAVs), internet video filtering, and related domains. This study proposes a highly effective deep learning architecture that employs a two-stream approach, combining a 3D convolution network with a merging module for violence detection. One stream analyzes RGB frames with suppressed background, while the other focuses on the optical flow between corresponding frames. These inputs are valuable for identifying violent actions, which are often characterized by distinctive body movements. To ensure robust long-range feature extraction with fewer parameters, we replace the conventional 3D convolution at each layer with a 3D depth-wise convolution. Our model outperforms existing methods on challenging datasets such as RWF2000, Real-Life Violence Situation (RLVS), and Movie Fight, securing state-of-the-art results. Our experiments demonstrate that the proposed model is well-suited for edge devices, offering computational efficiency and precise detection capabilities.
AB - Detecting violence in surveillance videos is a crucial task in activity recognition, with wide-ranging applications in unmanned aerial vehicles (UAVs), internet video filtering, and related domains. This study proposes a highly effective deep learning architecture that employs a two-stream approach, combining a 3D convolution network with a merging module for violence detection. One stream analyzes RGB frames with suppressed background, while the other focuses on the optical flow between corresponding frames. These inputs are valuable for identifying violent actions, which are often characterized by distinctive body movements. To ensure robust long-range feature extraction with fewer parameters, we replace the conventional 3D convolution at each layer with a 3D depth-wise convolution. Our model outperforms existing methods on challenging datasets such as RWF2000, Real-Life Violence Situation (RLVS), and Movie Fight, securing state-of-the-art results. Our experiments demonstrate that the proposed model is well-suited for edge devices, offering computational efficiency and precise detection capabilities.
KW - Deep Learning
KW - Smart City
KW - Surveillance System
KW - Video Analysis
KW - Violence Detection
UR - http://www.scopus.com/inward/record.url?scp=85178340192&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85178340192&partnerID=8YFLogxK
U2 - 10.1109/ISC257844.2023.10293696
DO - 10.1109/ISC257844.2023.10293696
M3 - Conference contribution
AN - SCOPUS:85178340192
T3 - Proceedings of 2023 IEEE International Smart Cities Conference, ISC2 2023
BT - Proceedings of 2023 IEEE International Smart Cities Conference, ISC2 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 9th IEEE International Smart Cities Conference, ISC2 2023
Y2 - 24 September 2023 through 27 September 2023
ER -