In recent years, deep learning for mobile object detection has received increasing attention, and hand-designed neural networks for mobile detection have made remarkable progress. Nevertheless, there remains wide scope for designing networks that are small and fast, i.e., that use fewer parameters and less computation while preserving accuracy and speed. Biology offers inspiration: with only a glance, the human visual system forms a rich perception of a complex environment and a rough understanding sufficient for target detection, relying on prior knowledge. Motivated by this, we propose a feature extractor crossover strategy that adjusts the detection architectures obtained by hyperparameter-driven NAS to improve detection performance on mobile devices. Specifically, our method combines the high speed of MobileNetV3 with an improved LSTM that provides temporal memory, so the network can effectively capture the key parts of the scene. Experiments show that we improve accuracy while preserving the speed that matters most on mobile: the number of parameters drops from 4.9 million to 3.96 million, and the model runs at over 90 FPS on a Samsung Note10+ phone.
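The architecture the abstract describes pairs a fast per-frame feature extractor with a recurrent temporal memory. The minimal NumPy sketch below illustrates that wiring only: a fixed linear projection stands in for the MobileNetV3 backbone, and a plain LSTM cell stands in for the paper's improved LSTM; all layer sizes, weights, and frame shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT_DIM, HID_DIM = 16, 8

# Stand-in for the MobileNetV3 backbone: one fixed linear projection
# from a flattened 3x32x32 frame to a FEAT_DIM feature vector.
W_feat = rng.standard_normal((FEAT_DIM, 3 * 32 * 32)) * 0.01

def extract_features(frame):
    """Per-frame feature extraction (backbone stand-in)."""
    return W_feat @ frame.ravel()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Gate weights of a standard LSTM cell acting on [features, prev hidden].
W = rng.standard_normal((4 * HID_DIM, FEAT_DIM + HID_DIM)) * 0.1
b = np.zeros(4 * HID_DIM)

def lstm_step(x, h, c):
    """One LSTM update: the cell state c carries memory across frames."""
    z = W @ np.concatenate([x, h]) + b
    i, f, o, g = np.split(z, 4)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

# Run the temporal memory over a short clip of random frames.
h, c = np.zeros(HID_DIM), np.zeros(HID_DIM)
clip = rng.standard_normal((5, 3, 32, 32))
for frame in clip:
    h, c = lstm_step(extract_features(frame), h, c)

print(h.shape)  # hidden state summarizing the clip so far
```

In the full system the hidden state would feed a detection head, so predictions on the current frame can reuse evidence from earlier frames instead of re-detecting from scratch.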
Keywords: Video, Data modeling, Feature extraction, Target detection, Video processing, Visual system, Detection theory