To address the high miss rates for distant small objects and the detection performance degraded by haze when autonomous vehicles operate on mountain highways, we propose SHTDet, a framework for small-object vehicle detection in hazy traffic environments. Specifically, to restore the clarity of hazy images, we design an image enhancement (IE) module whose parameters are predicted by a convolutional neural network, the filter parameter estimation (FPE) module. In addition, to improve the detection accuracy of small objects, we introduce a cascaded sparse query (CSQ) mechanism, which effectively exploits high-resolution features while maintaining fast detection speed. We jointly optimize the IE module and the detection network (CSQ-FCOS) in an end-to-end manner, ensuring that the FPE module learns a suitable enhancement. The proposed SHTDet adaptively handles both sunny and hazy conditions. Extensive experiments demonstrate the efficacy of SHTDet in detecting small objects on hazy sections of mountain highways.
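The abstract does not give the internals of the IE/FPE pair, but the idea of a differentiable enhancement filter whose parameters are predicted from the image itself can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the choice of gamma/contrast filters, and the one-layer linear "network" standing in for the FPE CNN are all assumptions for exposition.

```python
import numpy as np

def estimate_filter_params(image, weights):
    """Toy stand-in for the FPE network: predicts filter parameters
    (here a gamma and a contrast gain) from global image statistics.
    The real FPE would be a small CNN trained jointly with the detector."""
    feats = np.array([image.mean(), image.std()])  # global statistics
    raw = weights @ feats                          # linear "network"
    gamma = 1.0 + np.tanh(raw[0])                  # gamma kept in (0, 2)
    gain = 1.0 + 0.5 * np.tanh(raw[1])             # gain kept in (0.5, 1.5)
    return gamma, gain

def enhance(image, gamma, gain):
    """Differentiable enhancement filter: gamma correction followed by
    contrast stretching around the mean. Both operations are smooth, so a
    downstream detection loss could back-propagate to the estimator."""
    out = np.clip(image, 1e-6, 1.0) ** gamma
    return np.clip(gain * (out - out.mean()) + out.mean(), 0.0, 1.0)

# Hazy images are typically bright and low-contrast; the predicted filter
# should darken them and stretch their contrast.
hazy = 0.6 + 0.1 * np.random.default_rng(0).random((32, 32))
gamma, gain = estimate_filter_params(hazy, np.ones((2, 2)))
restored = enhance(hazy, gamma, gain)
```

End-to-end training would replace the fixed `weights` with CNN parameters updated by the detection loss, which is what lets the enhancement adapt to sunny versus hazy inputs.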
Autonomous driving systems now achieve strong object detection performance in good weather. However, the environment-sensing capability of autonomous vehicles is severely degraded in rainy traffic environments. Although deep-learning-based image deraining algorithms have made significant progress, integrating them with high-level vision tasks such as object detection remains challenging because the two classes of algorithms differ substantially. In addition, detection accuracy in real rainy environments drops sharply due to the domain shift between the training dataset and the actual rain conditions. To address this domain-shift problem, we propose ARODNet, an adaptive rain image enhancement object detection network for autonomous driving in adverse weather. The architecture consists of an image adaptive enhancement module, an image deraining module, and an object detection module. The baseline detector, CBAM-YOLOv7, is built by incorporating a convolutional block attention module (CBAM) into the YOLOv7 object detection network. For low-quality images captured under heavy rainfall, we propose a domain-adaptive rain image enhancement module, DRIP, which enhances rainy-day images by adaptively learning the weights of multiple preprocessing operations. To remove the effects of rain streaks and fog on detection, DRIP-enhanced images are fed into the depth estimation deraining module (DeRain) so that rain and fog do not obscure the objects to be detected. Finally, a multistage joint training strategy is adopted to improve training efficiency, and object detection is performed while the image is derained.
The efficacy of ARODNet for object detection in rainy traffic environments has been demonstrated through extensive quantitative and qualitative experiments.
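CBAM, the attention module folded into the baseline detector above, applies channel attention followed by spatial attention to a feature map. The sketch below is a deliberately simplified NumPy version: the shared MLP is reduced to a single toy weight matrix and the learned 7×7 convolution of the spatial branch is replaced by a fixed elementwise average, so it illustrates the data flow rather than the trained module.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, mlp_w):
    """x: (C, H, W). Average- and max-pool over space, pass both through a
    shared (here single-layer) MLP, sum, squash, and rescale channels."""
    avg = x.mean(axis=(1, 2))
    mx = x.max(axis=(1, 2))
    att = sigmoid(mlp_w @ avg + mlp_w @ mx)
    return x * att[:, None, None]

def spatial_attention(x):
    """Pool across channels to a 2-map descriptor; a fixed average stands
    in for the learned convolution that produces the spatial mask."""
    avg = x.mean(axis=0)
    mx = x.max(axis=0)
    att = sigmoid((avg + mx) / 2.0)
    return x * att[None, :, :]

x = np.random.default_rng(1).random((8, 16, 16)).astype(np.float32)
w = np.eye(8, dtype=np.float32)  # toy stand-in for the MLP weights
out = spatial_attention(channel_attention(x, w))
```

Because both masks lie in (0, 1), the module can only re-weight features, never amplify them past their input values; in the detector this biases later layers toward rain-robust channels and regions.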
With the development of surveillance technology and growing public security awareness, the demand for intelligent recognition of abnormal human actions is increasing. In most cases, an abnormal human action differs little in appearance from normal behavior, so visual rhythm (tempo) information becomes an important factor in action recognition; yet attention is usually paid to appearance while rhythm information is ignored. In this paper, we introduce a temporal pyramid module to process visual tempo information. However, the local history-information transfer of a traditional LSTM easily loses context, which hinders the grasp of global information and thus greatly degrades the temporal pyramid's effectiveness. We therefore introduce a non-local neural network module to complement the temporal pyramid module, strengthening the network's grasp of global information and its long-range modeling capability. Finally, we evaluate the network on the mainstream anomaly dataset UCF-Crime; the improved model reaches an AUC of 0.82, outperforming other state-of-the-art methods.
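The non-local operation mentioned above lets every time step attend to every other, in contrast to the step-by-step state transfer of an LSTM. A minimal NumPy sketch of the embedded-Gaussian form is below; for brevity the learned embedding projections are replaced by the identity, which is an assumption of this sketch, not the paper's configuration.

```python
import numpy as np

def nonlocal_block(x):
    """Embedded-Gaussian non-local operation over a sequence of frame
    features x of shape (T, C): each position aggregates all others via a
    softmax-normalized similarity, then a residual connection is added.
    Learned theta/phi/g projections are reduced to the identity here."""
    theta, phi, g = x, x, x
    logits = theta @ phi.T / np.sqrt(x.shape[1])          # (T, T) similarities
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)         # softmax per position
    return x + weights @ g                                # residual output

seq = np.random.default_rng(2).random((10, 32))  # 10 frames, 32-dim features
out = nonlocal_block(seq)
```

Since the attention weights couple all time steps in a single matrix product, global context reaches every frame in one layer, which is what supplements the multi-rate views produced by the temporal pyramid.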