For the synthetic aperture radar (SAR) target recognition task, the accuracy of the feature representation determines recognition performance. Previous mainstream deep learning–based SAR target recognition methods required large amounts of labeled samples to learn feature representations and achieved superior performance. However, solving SAR target recognition using only unlabeled samples remains a challenging task. To address this problem, we propose an end-to-end multi-view feature enhancement-based contrastive clustering framework (MvECC) for unsupervised SAR target recognition. It exploits both the information within each view and the complementary information among adjacent views of multi-view SAR images to learn discriminative target features. MvECC first augments the multi-view image sequences to build sequence pairs and feeds them into a pair of weight-sharing multi-view Vision Transformers (ViT) to extract features. The designed multi-view ViT contains an intra-view and an inter-view Transformer layer, so it can capture and fuse features both within each view and across different views of the multi-view image sequences. Contrastive learning is then performed at the sequence and class levels, optimizing pairwise similarity to learn feature representations and produce clustering assignments. The clustering accuracy of our method outperforms the state-of-the-art method by 7.87% on the moving and stationary target acquisition and recognition dataset. Further experimental results on the synthetic and measured paired labeled experiment dataset show that MvECC is robust.
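The two-level contrastive objective described above can be sketched numerically. The following is a minimal NumPy illustration, not the authors' implementation: it assumes a standard NT-Xent formulation in which sequence-level features from the two weight-sharing branches form positive pairs, and the columns of the soft cluster-assignment matrices are contrasted at the class level. All function names, shapes, and the temperature value are illustrative assumptions.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Unit-normalize rows so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def nt_xent(a, b, tau=0.5):
    """NT-Xent contrastive loss between two aligned sets of vectors.

    a, b: (n, d) arrays from the two augmented branches; row i of `a`
    and row i of `b` form a positive pair, all other rows are negatives.
    """
    n = a.shape[0]
    z = l2_normalize(np.concatenate([a, b], axis=0))           # (2n, d)
    sim = z @ z.T / tau                                        # scaled similarities
    np.fill_diagonal(sim, -np.inf)                             # exclude self-pairs
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # positive index per row
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    # Loss is non-negative: the positive term is part of the log-sum-exp.
    return float(np.mean(logsumexp - sim[np.arange(2 * n), pos]))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)

# Sequence level: stand-in features from the two weight-sharing
# multi-view ViT branches (8 sequences, 16-dim features).
h1, h2 = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
seq_loss = nt_xent(h1, h2)

# Class level: soft cluster assignments (8 sequences, 4 clusters);
# contrast the 4 cluster columns across the two branches.
p1, p2 = softmax(rng.normal(size=(8, 4))), softmax(rng.normal(size=(8, 4)))
cls_loss = nt_xent(p1.T, p2.T)

total = seq_loss + cls_loss
```

In this formulation the two losses are simply summed; the paper's actual weighting between the sequence-level and class-level terms is not stated in the abstract.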
Keywords: Synthetic aperture radar, Target recognition, Transformers, Feature extraction, Education and training, Machine learning, Matrices