Authors
Listed:
- Wimolsree Getsopon
(Multi-Agent Intelligent Simulation Laboratory (MISL) Research Unit, Department of Information Technology, Faculty of Informatics, Mahasarakham University, Mahasarakham 44150, Thailand)
- Sirawan Phiphitphatphaisit
(Department of Information System, Faculty of Business Administration and Information Technology, Rajamangala University of Technology Isan Khon Kaen Campus, Khon Kaen 40000, Thailand)
- Emmanuel Okafor
(SDAIA-KFUPM Joint Research Center for Artificial Intelligence, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia)
- Olarik Surinta
(Multi-Agent Intelligent Simulation Laboratory (MISL) Research Unit, Department of Information Technology, Faculty of Informatics, Mahasarakham University, Mahasarakham 44150, Thailand)
Abstract
Intelligent video analysis tools have advanced significantly, and cameras are now installed in many locations to enhance security and monitor unusual events. However, detecting and monitoring violent incidents still often depends on manual, time-consuming review of recorded footage, which can delay timely intervention. Deep learning has emerged as a powerful approach for extracting the critical features needed to identify and classify violent behavior, enabling accurate and scalable models across diverse domains. This study presents the Int.2D-3D-CNN architecture, which integrates two-dimensional (2D) and three-dimensional (3D) convolutional neural networks (CNNs) for video-based violence recognition. Compared with traditional 2D-CNN and 3D-CNN models, the proposed Int.2D-3D-CNN achieves improved performance on the Hockey Fight, Movie, and Violent Flows datasets. The architecture captures both static and dynamic characteristics of violent scenes by integrating spatial and temporal information: the 2D-CNN component employs lightweight MobileNetV1 and MobileNetV2 backbones to extract spatial features from individual frames, while a simplified 3D-CNN module with a single 3D convolution layer captures motion and temporal dependencies across frame sequences. Evaluation results highlight the robustness of the proposed model in distinguishing violent from non-violent videos under diverse conditions. The Int.2D-3D-CNN model achieved accuracies of 98%, 100%, and 98% on the Hockey Fight, Movie, and Violent Flows datasets, respectively, indicating strong potential for violence recognition applications.
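The two-stream design described in the abstract (per-frame spatial features from a 2D backbone, clip-level temporal features from a 3D module, then a joint classifier) can be sketched in NumPy. This is only an illustration of the data flow: the pooling-plus-random-projection stand-ins, the feature dimensions, and the fusion-by-concatenation step are assumptions for the sketch, not the authors' trained MobileNet and 3D-convolution components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative clip: 16 frames of 112x112 RGB (sizes are assumptions).
T, H, W, C = 16, 112, 112, 3
clip = rng.standard_normal((T, H, W, C))

def spatial_features(frames, dim=64):
    """Stand-in for a 2D-CNN backbone (e.g. MobileNet): one vector per frame."""
    pooled = frames.mean(axis=(1, 2))                 # global average pool -> (T, C)
    proj = rng.standard_normal((frames.shape[-1], dim))
    return pooled @ proj                              # (T, dim)

def temporal_features(frames, dim=64):
    """Stand-in for the single 3D-conv module: one vector for the whole clip."""
    pooled = frames.mean(axis=(0, 1, 2))              # pool over time + space -> (C,)
    proj = rng.standard_normal((frames.shape[-1], dim))
    return pooled @ proj                              # (dim,)

spat = spatial_features(clip).mean(axis=0)            # average frame features -> (64,)
temp = temporal_features(clip)                        # clip motion summary   -> (64,)
fused = np.concatenate([spat, temp])                  # joint descriptor      -> (128,)

# Binary head: violent vs. non-violent (weights random here, trained in practice).
W_cls = rng.standard_normal((fused.shape[0], 2))
logits = fused @ W_cls
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(fused.shape, probs)
```

In the actual model both streams are learned end to end; the sketch only shows why the fused descriptor carries both a static (appearance) and a dynamic (motion) summary of the clip.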
Suggested Citation
Wimolsree Getsopon & Sirawan Phiphitphatphaisit & Emmanuel Okafor & Olarik Surinta, 2025.
"Int.2D-3D-CNN: Integrated 2D and 3D Convolutional Neural Networks for Video Violence Recognition,"
Mathematics, MDPI, vol. 13(16), pages 1-35, August.
Handle:
RePEc:gam:jmathe:v:13:y:2025:i:16:p:2665-:d:1727782
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:gam:jmathe:v:13:y:2025:i:16:p:2665-:d:1727782. See general information about how to correct material in RePEc.
If you have authored this item and are not yet registered with RePEc, we encourage you to register here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.
We have no bibliographic references for this item. You can help add them by using this form.
If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: MDPI Indexing Manager (email available below). General contact details of provider: https://www.mdpi.com .
Please note that corrections may take a couple of weeks to filter through the various RePEc services.