Vehicle re-identification is the task of recognizing the same vehicle across non-overlapping camera views, and it is important in areas such as intelligent security and smart transportation. Vehicle images pose two challenges: high intra-class variation, since the same vehicle looks different across viewpoints, and low inter-class variation, since different vehicles of the same model look alike. Existing methods often rely on additional cues and auxiliary inputs to address these challenges, but they require large amounts of data and are computationally expensive. To overcome these challenges, this paper introduces a vehicle re-identification method based on CSWin Transformer and linear attention that effectively extracts global contextual information suited to vehicle images. By replacing the original attention mechanism in CSWin Transformer with linear attention, we retain strong modeling capability while limiting computational cost. Specifically, the linear attention employs a simple yet effective mapping function and a rank restoration module to focus on specific local regions and enhance the interaction of local features. Experimental results demonstrate that this method achieves a better balance between computational efficiency and model performance.
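To illustrate why linear attention limits computational cost, the following is a minimal sketch of generic kernelized linear attention in plain Python. It is not the method proposed here: the feature map `elu(x) + 1` is an assumed stand-in for the mapping function, the rank restoration module is omitted, and all function names are hypothetical. The sketch shows only that reordering the product as φ(Q)(φ(K)ᵀV) gives the same output as forming the full N × N weight matrix, while scaling linearly rather than quadratically in the number of tokens N.

```python
import math


def feature_map(x):
    # elu(x) + 1, a strictly positive feature map.
    # Assumption for illustration; not the paper's mapping function.
    return x + 1.0 if x > 0 else math.exp(x)


def _phi(M):
    # Apply the feature map element-wise to a matrix stored as a list of rows.
    return [[feature_map(x) for x in row] for row in M]


def linear_attention(Q, K, V):
    """Q, K: N x d, V: N x dv. Cost grows as O(N * d * dv)."""
    d, dv = len(Q[0]), len(V[0])
    phi_q, phi_k = _phi(Q), _phi(K)
    # Summaries over all keys, computed once and shared by every query:
    # kv[i][j] = sum_n phi_k[n][i] * V[n][j], k_sum[i] = sum_n phi_k[n][i]
    kv = [[sum(k[i] * v[j] for k, v in zip(phi_k, V)) for j in range(dv)]
          for i in range(d)]
    k_sum = [sum(k[i] for k in phi_k) for i in range(d)]
    out = []
    for q in phi_q:
        norm = sum(q[i] * k_sum[i] for i in range(d))
        out.append([sum(q[i] * kv[i][j] for i in range(d)) / norm
                    for j in range(dv)])
    return out


def quadratic_attention(Q, K, V):
    """Same kernel, but builds every query-key weight: O(N^2 * d)."""
    dv = len(V[0])
    phi_q, phi_k = _phi(Q), _phi(K)
    out = []
    for q in phi_q:
        weights = [sum(a * b for a, b in zip(q, k)) for k in phi_k]
        total = sum(weights)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) / total
                    for j in range(dv)])
    return out
```

Both functions return the same values up to floating-point rounding; only the order of multiplication, and therefore the cost, differs.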