12 May 2022 Research on memory-enhanced mobile video object detection
Shi Lei, Shiqiang Zhang, Xiaodong Cheng
Proceedings Volume 12173, International Conference on Optics and Machine Vision (ICOMV 2022); 1217307 (2022) https://doi.org/10.1117/12.2634397
Event: International Conference on Optics and Machine Vision (ICOMV 2022), 2022, Guangzhou, China
Abstract

In recent years, deep learning for mobile object detection has received increasing attention, and hand-designed neural networks for mobile object detection have made remarkable progress. Even so, there is still wide scope for exploring how to preserve accuracy and speed while designing networks with fewer parameters and less computation, i.e., networks that are small and fast. Biological science offers an inspiration: from a single glance, the human visual system can form a rich perception of a complex environment and a rough understanding of the targets in it, and it does so by relying on the corresponding prior knowledge. Motivated by this, we propose a feature-extractor crossover strategy that adjusts the detection results obtained by neural architecture search (NAS) with hyperparameters, in order to improve detection performance on mobile devices.

In this paper, we propose an efficient method that combines the high-speed MobileNetV3 with an improved LSTM that has temporal memory; the resulting network can effectively capture the key parts of the scene. Experiments show that we improve accuracy while securing the speedup that matters most on mobile devices. We reduce the number of parameters from 4.9 million to 3.96 million while reaching 90+ FPS on a Samsung Note10+ phone.
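
To put the reported figures in perspective, the short sketch below works out the relative parameter reduction and the per-frame time budget implied by the frame rate. It uses only the numbers quoted in the abstract; the function and constant names are illustrative, and 90 FPS is treated as a lower bound because the abstract reports "90+".

```python
# Figures quoted in the abstract.
PARAMS_BEFORE = 4.90e6   # parameter count before the reduction
PARAMS_AFTER = 3.96e6    # parameter count after the reduction
TARGET_FPS = 90          # lower bound implied by "90+ FPS"


def relative_reduction(before: float, after: float) -> float:
    """Fraction of parameters removed, relative to the starting count."""
    return (before - after) / before


def frame_budget_ms(fps: float) -> float:
    """Maximum time per frame, in milliseconds, that sustains `fps`."""
    return 1000.0 / fps


reduction = relative_reduction(PARAMS_BEFORE, PARAMS_AFTER)
budget = frame_budget_ms(TARGET_FPS)
print(f"parameter reduction: {reduction:.1%}")                  # 19.2%
print(f"per-frame budget at {TARGET_FPS} FPS: {budget:.1f} ms")  # 11.1 ms
```

In other words, the reported model is roughly one fifth smaller, and each frame must be processed end to end in about 11 ms or less to sustain the reported frame rate.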
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Shi Lei, Shiqiang Zhang, and Xiaodong Cheng "Research on memory-enhanced mobile video object detection", Proc. SPIE 12173, International Conference on Optics and Machine Vision (ICOMV 2022), 1217307 (12 May 2022); https://doi.org/10.1117/12.2634397
KEYWORDS
Video
Data modeling
Feature extraction
Target detection
Video processing
Visual system
Detection theory
