Object detection in unmanned aerial vehicle (UAV) images has recently attracted widespread attention and found numerous applications, but the high flight altitude of UAVs and the subtle inter-class differences make it challenging to localize objects accurately and classify them correctly. Existing methods often rely on multi-scale feature fusion and attention mechanisms to address these problems; however, they tend to significantly increase computational overhead and introduce unwanted background noise into shallow features. To address these issues, we propose DCENet, a tiny object detection network for aerial imagery based on a deformable cross-attention module and an enhanced classifier module. Specifically, a self-designed deformable cross-attention module lets shallow feature maps adaptively focus on regions of interest in deep feature maps, improving localization without introducing additional background noise. Meanwhile, a crop-images super-resolution module (CSRM) enlarges the cropped images, alleviating the problem that tiny aerial objects have lower resolution and appear more blurred than ground-view objects. In addition, an enhanced classifier module, which integrates the CSRM with a ResNet-34 classifier, strengthens the model's ability to distinguish similar categories and thereby improves overall network performance. Experimental results show that DCENet achieves state-of-the-art performance, with mean average precision values of 36.0 and 27.5 on the VisDrone-2019 and UAVDT datasets, respectively, suggesting that DCENet is well suited to aerial image detection.
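To make the idea of shallow features querying deep features more concrete, the following is a minimal PyTorch sketch of a cross-attention layer in which each location of a high-resolution shallow feature map samples a low-resolution deep feature map at a few learned offset positions. The module name, the number of sampling points, and all hyper-parameters here are illustrative assumptions for exposition only, not the paper's exact design.

```python
# Minimal sketch (assumed, not the paper's implementation): shallow features predict
# sampling offsets and attention weights, then gather deep features via grid_sample.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DeformableCrossAttention(nn.Module):
    def __init__(self, channels: int, num_points: int = 4):
        super().__init__()
        self.num_points = num_points
        # Per-pixel sampling offsets (x, y per point) and attention weights are
        # predicted from the shallow map, so each shallow location chooses where
        # to look in the deep map.
        self.offset_head = nn.Conv2d(channels, 2 * num_points, 3, padding=1)
        self.weight_head = nn.Conv2d(channels, num_points, 3, padding=1)
        self.value_proj = nn.Conv2d(channels, channels, 1)
        self.out_proj = nn.Conv2d(channels, channels, 1)

    def forward(self, shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
        b, c, h, w = shallow.shape
        value = self.value_proj(deep)

        # Reference grid in normalized [-1, 1] coordinates at the shallow resolution.
        ys = torch.linspace(-1.0, 1.0, h, device=shallow.device)
        xs = torch.linspace(-1.0, 1.0, w, device=shallow.device)
        grid_y, grid_x = torch.meshgrid(ys, xs, indexing="ij")
        ref = torch.stack((grid_x, grid_y), dim=-1)                    # (h, w, 2)

        offsets = self.offset_head(shallow)                            # (b, 2P, h, w)
        offsets = offsets.view(b, self.num_points, 2, h, w).permute(0, 1, 3, 4, 2)
        weights = self.weight_head(shallow).softmax(dim=1)             # (b, P, h, w)

        sampled = shallow.new_zeros(b, c, h, w)
        for p in range(self.num_points):
            # Sample deep features at the offset locations for this point.
            grid = (ref.unsqueeze(0) + offsets[:, p]).clamp(-1.0, 1.0) # (b, h, w, 2)
            feat = F.grid_sample(value, grid, align_corners=True)      # (b, c, h, w)
            sampled = sampled + weights[:, p:p + 1] * feat

        # Residual connection keeps the original shallow detail.
        return shallow + self.out_proj(sampled)


if __name__ == "__main__":
    attn = DeformableCrossAttention(channels=64)
    shallow = torch.randn(1, 64, 80, 80)   # high-resolution shallow feature map
    deep = torch.randn(1, 64, 20, 20)      # low-resolution deep feature map
    print(attn(shallow, deep).shape)       # torch.Size([1, 64, 80, 80])
```

Because the sampling locations are predicted from the shallow map rather than taken from a fixed neighborhood, each shallow position can attend only to relevant deep-feature regions, which is the behavior the abstract attributes to the deformable cross-attention module.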