Unsupervised visible-infrared person re-identification (USVI-ReID) plays a crucial role in computer vision. Its key challenge is to learn discriminative image features and establish cross-modal correspondence without class labels. We propose a two-stage contrastive learning method for USVI-ReID. The first stage performs instance-wise contrastive learning to obtain a discriminative model. This model is transferred to the second stage for clustering, which yields category-level supervision and enables cluster-wise contrastive learning. In addition, a progressive training strategy gradually shifts the model's attention from instances to clusters. Extensive experiments on two public datasets, SYSU-MM01 and RegDB, demonstrate the effectiveness of the proposed method.
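The two-stage scheme above can be sketched as an InfoNCE-style contrastive loss combined with a progressive weighting between the instance-wise and cluster-wise terms. This is a minimal illustration, not the paper's implementation; the linear schedule and the function names are assumptions (the abstract only says the attention shifts "gradually").

```python
import numpy as np

def info_nce(anchor, positives, negatives, tau=0.1):
    """InfoNCE contrastive loss for one L2-normalised anchor vector."""
    pos = np.exp(positives @ anchor / tau).sum()
    neg = np.exp(negatives @ anchor / tau).sum()
    return -np.log(pos / (pos + neg))

def progressive_weights(epoch, total_epochs):
    """Shift the objective from instance-wise to cluster-wise terms.
    A linear schedule is assumed here for illustration."""
    lam = epoch / total_epochs
    return 1.0 - lam, lam  # (instance weight, cluster weight)
```

In use, the overall objective at a given epoch would be `w_inst * L_instance + w_clus * L_cluster`, where `L_instance` contrasts augmented views of single images and `L_cluster` contrasts samples against cluster centroids produced in the second stage.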
Unsupervised embedding learning aims to learn highly discriminative image features without using class labels. Existing instance-wise softmax embedding methods treat each instance as a distinct class and explore the underlying instance-to-instance visual similarity relationships. However, overfitting to instance features leads to insufficient discriminability and poor generalizability. To tackle this issue, we introduce an instance-wise softmax embedding with cosine margin (SEwCM), which, for the first time, adds a cosine margin to the unsupervised instance-wise softmax classification function. The cosine margin separates the classification decision boundaries between instances. SEwCM explicitly optimizes the network's feature mapping by maximizing the cosine similarity between instances, thus learning a highly discriminative model. Exhaustive experiments on three fine-grained image datasets demonstrate the effectiveness of the proposed method over existing methods.
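A cosine-margin instance softmax can be sketched as follows: the target instance's logit is computed from its cosine similarity minus a margin, tightening the decision boundary. This is an illustrative sketch only; the `margin` and `scale` values are assumptions, not the paper's settings.

```python
import numpy as np

def sewcm_loss(query, memory, index, margin=0.35, scale=16.0):
    """Instance-wise softmax with cosine margin (illustrative sketch).

    query:  L2-normalised embedding of an augmented view of instance `index`
    memory: (N, D) L2-normalised per-instance features acting as class weights
    """
    cos = memory @ query                            # cosine similarity to each instance
    logits = scale * cos
    logits[index] = scale * (cos[index] - margin)   # margin on the true instance
    logits = logits - logits.max()                  # numerical stability
    p = np.exp(logits)
    p /= p.sum()
    return -np.log(p[index])
```

Because the margin is subtracted only from the target logit, the loss with a positive margin is strictly larger than without it, which forces the learned embedding to place each instance further inside its own decision region.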
Sample specificity learning treats every sample as a separate class and mines the underlying class-to-class visual similarity relationships, thus learning discriminative feature embeddings without using category labels. We introduce a correlational instance feature embedding approach to improve the representation ability of deep neural networks. It exploits the self-correlation and cross-correlation of instances in each training batch by learning a feature embedding with intrainstance variation and interinstance interpolation, resulting in stronger discriminability and better generalizability. Exhaustive experiments on several benchmarks show the performance advantages of our method over existing methods.
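The two ingredients named above can be sketched in feature space: intrainstance variation pairs two augmented views of the same image (self-correlation), while interinstance interpolation blends features of different images (cross-correlation). The blending rule and function names here are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def correlational_embedding(f_view1, f_view2, f_other, alpha=0.5):
    """Sketch of self- and cross-correlation on unit-norm features.

    f_view1, f_view2: L2-normalised features of two views of one instance
    f_other:          L2-normalised feature of a different instance
    """
    # self-correlation: views of the same instance should agree
    self_sim = float(f_view1 @ f_view2)
    # cross-correlation: convex interpolation between instances,
    # re-normalised back onto the unit hypersphere
    mixed = alpha * f_view1 + (1 - alpha) * f_other
    mixed /= np.linalg.norm(mixed)
    return self_sim, mixed
```

A training objective would then pull `self_sim` toward 1 while using the interpolated features to regularise the space between instances.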
Can we automatically learn discriminative embedding features from images when human-annotated labels are absent? Unsupervised embedding learning remains a significant open challenge in the image and vision community. We propose a joint online deep embedded clustering and hard-sample mining framework to improve the representation ability of embedding learning. In addition, to enhance the discriminability of feature representations, a structure-level pair-based loss is introduced that takes full advantage of the structural correlation between all mined hard samples. Finally, quantitative results of exhaustive experiments on three benchmarks show that the proposed method outperforms existing state-of-the-art methods.
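Hard-sample mining against online cluster assignments can be sketched as follows: for each anchor, take the least similar sample in its own cluster as the hardest positive and the most similar sample in any other cluster as the hardest negative. The mining criteria here are assumptions for illustration; the paper's structure-level pair-based loss would then operate over the mined pairs.

```python
import numpy as np

def mine_hard_samples(features, cluster_ids):
    """Per-anchor hard positive/negative mining on cosine similarity.

    features:    (N, D) L2-normalised batch features
    cluster_ids: length-N list of online cluster assignments
    Returns a list of (hard_positive_idx, hard_negative_idx) per anchor
    (None where the batch has no candidate).
    """
    sims = features @ features.T
    mined = []
    for i, c in enumerate(cluster_ids):
        pos = [j for j, cj in enumerate(cluster_ids) if cj == c and j != i]
        neg = [j for j, cj in enumerate(cluster_ids) if cj != c]
        hp = min(pos, key=lambda j: sims[i, j]) if pos else None
        hn = max(neg, key=lambda j: sims[i, j]) if neg else None
        mined.append((hp, hn))
    return mined
```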