Abstract
The detection and classification of moving objects are fundamental tasks in computer vision. However, current solutions typically treat them as two isolated processes: first, all objects within the scene are detected; then, a separate algorithm determines which of those objects are in motion. Furthermore, many solutions rely on complex networks that demand substantial computational resources, whereas lightweight solutions could enable widespread use. We present an enhanced version and an extended explanation of TRG-Net, a unified model that can run on computationally limited devices to detect and classify only moving objects. The model builds on the Faster R-CNN architecture, with MobileNetV3 as the feature extractor and an improved Gaussian mixture model (GMM)-based method for fast and flexible search of regions of interest. TRG-Net reduces inference time by unifying the moving object detection and image classification tasks and by limiting the region proposals to a configurable, fixed number of potential moving objects. Experiments on heterogeneous surveillance videos and the KITTI dataset for 2D object detection show that our approach reduces the inference time of Faster R-CNN from 0.176 s to 0.149 s and the parameter count from 18.91 M to 18.30 M while maintaining average precision (AP = 0.423). The enhanced TRG-Net therefore achieves a more favorable trade-off between precision and speed and could be applied to real-world problems.
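The idea of capping region proposals at a configurable number of potential moving objects can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binary foreground mask has already been produced by some GMM-style background subtractor, and the function `propose_regions` and its parameters `max_proposals` and `min_area` are hypothetical names introduced here for illustration only.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def propose_regions(mask: List[List[int]],
                    max_proposals: int = 3,
                    min_area: int = 2) -> List[Box]:
    """Return at most max_proposals boxes around the largest foreground blobs.

    Hypothetical helper: `mask` is assumed to be a binary foreground mask
    (1 = moving pixel) coming from a GMM-style background model.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    blobs = []  # list of (area, box)

    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over 4-connected foreground pixels.
            queue = deque([(x, y)])
            seen[y][x] = True
            area = 0
            x0, y0, x1, y1 = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                area += 1
                x0, y0 = min(x0, cx), min(y0, cy)
                x1, y1 = max(x1, cx), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Discard tiny blobs, which are treated here as noise.
            if area >= min_area:
                blobs.append((area, (x0, y0, x1, y1)))

    # Keep only the largest blobs so later stages see a bounded number of regions.
    blobs.sort(key=lambda item: item[0], reverse=True)
    return [box for _, box in blobs[:max_proposals]]


if __name__ == "__main__":
    toy_mask = [
        [0, 1, 1, 0, 0, 0],
        [0, 1, 1, 0, 0, 1],
        [0, 0, 0, 0, 0, 0],
        [1, 1, 1, 0, 0, 0],
        [1, 1, 1, 0, 0, 0],
    ]
    print(propose_regions(toy_mask, max_proposals=2))
```

Under these assumptions, the cost of any downstream classification stage is bounded by `max_proposals` rather than by the number of objects in the scene, which is the kind of saving the abstract attributes to limiting region proposals.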
Original language | English |
---|---|
Title of host publication | Computer Vision, Imaging and Computer Graphics Theory and Applications - 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics, VISIGRAPP 2023, Revised Selected Papers |
Editors | A. Augusto de Sousa, Thomas Bashford-Rogers, Alexis Paljic, Mounia Ziat, Christophe Hurter, Helen Purchase, Petia Radeva, Giovanni Maria Farinella, Kadi Bouatouch |
Publisher | Springer |
Pages | 161-180 |
Number of pages | 20 |
Volume | 2103 CCIS |
ISBN (Electronic) | 9783031667435 |
ISBN (Print) | 9783031667428 |
Publication status | Published - 1 Jan 2024 |
Event | 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, Lisbon, Portugal; Duration: 19 Feb 2023 → 21 Feb 2023; Conference number: 18; https://visapp.scitevents.org/?y=2023; https://visigrapp.scitevents.org/?y=2023 |
Publication series
Series | Communications in Computer and Information Science |
---|---|
Volume | 2103 CCIS |
ISSN | 1865-0929 |
Conference
Conference | 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications |
---|---|
Abbreviated title | VISIGRAPP 2023 |
Country/Territory | Portugal |
City | Lisbon |
Period | 19/02/23 → 21/02/23 |
Keywords
- Classification
- Detection
- Gaussian mixture
- Lightweight model
- Moving objects