In recent years, deep learning for mobile object detection has received increasing attention, and hand-designed neural networks for mobile detection have made remarkable progress. Nevertheless, there remains wide scope for designing networks that are small and fast, i.e., that use fewer parameters and less computation while preserving accuracy and speed. Biology offers inspiration: with only a glance, the human visual system forms a rich perception of a complex environment and a rough understanding sufficient for target detection, relying on prior knowledge. Motivated by this, we propose a feature extractor crossover strategy that adjusts the detection architectures obtained by hyperparameter-driven NAS to improve detection performance on mobile devices. Specifically, our method combines the high speed of MobileNetV3 with an improved LSTM that provides temporal memory, so the network can effectively capture the key parts of the scene. Experiments show that we improve accuracy while preserving the speed that matters most on mobile: the number of parameters drops from 4.9 million to 3.96 million, and the model runs at over 90 FPS on a Samsung Note10+ phone.
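The architecture the abstract describes pairs a fast per-frame feature extractor with a recurrent temporal memory. The minimal NumPy sketch below illustrates that wiring only: a fixed linear projection stands in for the MobileNetV3 backbone, and a plain LSTM cell stands in for the paper's improved LSTM; all layer sizes, weights, and frame shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT_DIM, HID_DIM = 16, 8

# Stand-in for the MobileNetV3 backbone: one fixed linear projection
# from a flattened 3x32x32 frame to a FEAT_DIM feature vector.
W_feat = rng.standard_normal((FEAT_DIM, 3 * 32 * 32)) * 0.01

def extract_features(frame):
    """Per-frame feature extraction (backbone stand-in)."""
    return W_feat @ frame.ravel()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Gate weights of a standard LSTM cell acting on [features, prev hidden].
W = rng.standard_normal((4 * HID_DIM, FEAT_DIM + HID_DIM)) * 0.1
b = np.zeros(4 * HID_DIM)

def lstm_step(x, h, c):
    """One LSTM update: the cell state c carries memory across frames."""
    z = W @ np.concatenate([x, h]) + b
    i, f, o, g = np.split(z, 4)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

# Run the temporal memory over a short clip of random frames.
h, c = np.zeros(HID_DIM), np.zeros(HID_DIM)
clip = rng.standard_normal((5, 3, 32, 32))
for frame in clip:
    h, c = lstm_step(extract_features(frame), h, c)

print(h.shape)  # hidden state summarizing the clip so far
```

In the full system the hidden state would feed a detection head, so predictions on the current frame can reuse evidence from earlier frames instead of re-detecting from scratch.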
Keywords: Video, Data modeling, Feature extraction, Target detection, Video processing, Visual system, Detection theory