13 June 2024 Optimized object detection with score attention and spatially modulated co-attention
Zhenjun Wu, Zhong Qu, Shuheng Zhao
Proceedings Volume 13180, International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2024); 131805U (2024) https://doi.org/10.1117/12.3033638
Event: International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2024), 2024, Guangzhou, China
Abstract
Detection Transformer (DETR) is a recent end-to-end object detection paradigm built on the Transformer architecture. It eliminates hand-designed components such as Non-Maximum Suppression (NMS) while achieving accuracy comparable to traditional detection methods. Nevertheless, it has limitations, notably low detection accuracy on small objects and slow model convergence, which hinder its effectiveness across detection scenarios. In this paper, we propose an enhanced DETR encoder and introduce a spatially modulated co-attention module in the decoder to address these issues. The enhanced encoder uses a new score attention mechanism with low computational complexity and improves the detection accuracy of small objects through multi-scale feature fusion. The spatially modulated co-attention module maintains query stability in the decoder stage by incorporating spatial priors. Experimental results show that the proposed algorithm achieves an mAP@0.5 of 45.2% on the MSCOCO 2017 dataset.
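The abstract does not give the formulation of the spatially modulated co-attention module. As a rough illustration of what "incorporating spatial priors" into decoder attention can look like, the sketch below assumes one common form: the log of a Gaussian-like weight map, centered on a per-query object center, is added to the scaled dot-product logits before the softmax. The function name, the `center` and `sigma` parameters, and the isotropic Gaussian are all assumptions made for illustration, not the authors' implementation.

```python
import math


def spatially_modulated_attention(query, keys, center, sigma):
    """Attention weights of one query over an H x W grid of keys.

    query:  list of d floats
    keys:   H x W grid, each cell a list of d floats
    center: (row, col) object center for this query (assumed to be given)
    sigma:  spread of the Gaussian-like prior (assumed to be given)
    """
    d = len(query)
    cy, cx = center
    logits = []
    for i, row in enumerate(keys):
        for j, key in enumerate(row):
            # Standard scaled dot-product logit (content term).
            dot = sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            # Log of a Gaussian-like prior centered on (cy, cx) (spatial term).
            log_prior = -((i - cy) ** 2 + (j - cx) ** 2) / (2.0 * sigma ** 2)
            logits.append(dot + log_prior)

    # Numerically stable softmax over all grid positions.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    flat = [e / total for e in exps]

    # Reshape the flat weights back into an H x W grid.
    width = len(keys[0])
    return [flat[r * width:(r + 1) * width] for r in range(len(keys))]


# With identical keys everywhere the content term is constant, so the
# resulting weights are shaped entirely by the spatial prior.
keys = [[[1.0, 0.0] for _ in range(5)] for _ in range(5)]
weights = spatially_modulated_attention([0.5, 0.5], keys, center=(2, 2), sigma=1.0)
```

In this toy setup the attention mass concentrates around the assumed center, which suggests how a spatial prior could keep each query focused on a stable region during decoding.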
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Zhenjun Wu, Zhong Qu, and Shuheng Zhao "Optimized object detection with score attention and spatially modulated co-attention", Proc. SPIE 13180, International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2024), 131805U (13 June 2024); https://doi.org/10.1117/12.3033638
KEYWORDS
Object detection
Data modeling
Modulation
Education and training
Performance modeling
Transformers
Deformation