Few-shot learning (FSL) aims to recognize new visual categories from a limited number of labeled examples. Maximizing mutual information improves the discriminability of feature embeddings, which in turn makes metric-based classification more accurate. In this context, we propose Dense Masked Deep InfoMax Networks (DMDIN), built on the Masked Image Modeling (MIM) framework with convolutional neural networks (CNNs). DMDIN incorporates a mutual information maximization strategy into self-supervised pretraining, improving its performance in few-shot learning. More specifically, DMDIN first generates two augmented views of each image: one through spatial transformations and color jittering, the other through random RGB-mean patching. Second, by maximizing mutual information between the dense features of these two views, DMDIN learns more generalized representations, diverging from traditional patch-reconstruction-based methods. Our method is evaluated on the miniImageNet, CIFAR-FS, and CUB datasets, and the results validate its effectiveness in the few-shot learning scenario.
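The second augmented view described above replaces random image patches with the image's per-channel RGB mean. A minimal sketch of that masking step, assuming images are NumPy arrays of shape (H, W, 3) and that patches are masked on a non-overlapping grid (the function name, patch size, and mask ratio here are illustrative, not from the paper):

```python
import numpy as np

def rgb_mean_patching(img, patch=16, mask_ratio=0.5, rng=None):
    """Replace a random subset of non-overlapping patches with the
    image's per-channel RGB mean. Hypothetical sketch of the
    'random RGB mean patching' view; hyperparameters are assumptions."""
    if rng is None:
        rng = np.random.default_rng()
    out = img.astype(np.float32).copy()
    mean = out.reshape(-1, 3).mean(axis=0)  # per-channel RGB mean
    h, w = out.shape[:2]
    gh, gw = h // patch, w // patch         # patch grid dimensions
    n_mask = int(gh * gw * mask_ratio)      # number of patches to mask
    idx = rng.choice(gh * gw, size=n_mask, replace=False)
    for i in idx:
        r, c = divmod(i, gw)
        out[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch] = mean
    return out

# Example: mask half of the 16x16 patches of a random 64x64 image.
img = np.random.default_rng(0).random((64, 64, 3)).astype(np.float32)
masked_view = rgb_mean_patching(img, patch=16, mask_ratio=0.5)
```

The other view (spatial transformations plus color jittering) would be produced by a standard augmentation pipeline; the two views then feed the dense mutual information objective.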