Paper
12 September 2024
Fine-tuned Segment Anything Model based on dual-stage adapter for remote sensing rotate object detection
Kang Song, Qing Zhu, Hui Yang, Biao Wang, Yanlan Wu
Proceedings Volume 13256, Fourth International Conference on Computer Vision and Pattern Analysis (ICCPA 2024); 1325603 (2024) https://doi.org/10.1117/12.3037831
Event: Fourth International Conference on Computer Vision and Pattern Analysis (ICCPA 2024), 2024, Anshan, China
Abstract
In recent years, extensive research in remote sensing object detection has addressed various components of the model, such as the backbone network, neck, detection head, and loss functions, leading to significant advances. However, a crucial factor influencing detection results is the model's capability to extract features from images, particularly for remote sensing images with complex characteristics. The Segment Anything Model (SAM), a prominent large-scale model in computer vision, has recently attracted significant attention: it offers powerful feature extraction and strong generalization, yielding impressive results across various image types. Nonetheless, it is designed primarily for semantic segmentation and cannot be applied directly to remote sensing rotated object detection. In this paper, we propose FSAMDA, a remote sensing rotated object detection method that fine-tunes the Segment Anything Model with a dual-stage adapter. A standard Adapter learns knowledge specific to the remote sensing object detection task, while a Mona Adapter simultaneously enhances the model's ability to process visual signals and learns the same task-specific knowledge, thereby fully exploiting the feature extraction capability of the SAM image encoder. Extensive experiments validate FSAMDA, which achieves state-of-the-art performance on two widely used standard benchmarks: 81.35 mAP on DOTA-v1.0 and 48.52 mAP on FAIR1M-v1.0, demonstrating the effectiveness of our approach.
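The abstract does not detail the adapter architecture. As a minimal sketch, assuming the Adapter follows the common bottleneck design used in adapter-based fine-tuning (down-projection, nonlinearity, up-projection, residual connection around the frozen encoder features), with illustrative dimensions and a zero-initialized up-projection, the idea can be shown as:

```python
# Hypothetical illustration of a bottleneck adapter; names, dimensions, the
# GELU choice, and zero initialization are assumptions, not from the paper.
import math

def gelu(x):
    # Tanh approximation of GELU, a common adapter nonlinearity.
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi)
                                      * (x + 0.044715 * x ** 3)))

def matvec(W, x):
    # Plain matrix-vector product (W is a list of rows).
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def adapter(x, W_down, W_up):
    """Bottleneck adapter: project the frozen feature x to a small hidden
    size, apply a nonlinearity, project back, and add the residual."""
    h = [gelu(v) for v in matvec(W_down, x)]    # d -> r (bottleneck)
    up = matvec(W_up, h)                        # r -> d
    return [xi + ui for xi, ui in zip(x, up)]   # residual: frozen path + adapter

# Toy example: feature dim d = 4, bottleneck r = 2.
x = [1.0, -0.5, 0.25, 2.0]
W_down = [[0.1, 0.1, 0.1, 0.1], [0.0, 0.2, 0.0, 0.1]]  # 2x4
W_up = [[0.0, 0.0] for _ in range(4)]                   # 4x2, zero-initialized
print(adapter(x, W_down, W_up))  # equals x: zero-init W_up makes the adapter an identity at start
```

Zero-initializing the up-projection is a standard trick in adapter tuning: at the start of fine-tuning the adapted encoder behaves exactly like the frozen SAM encoder, and the adapter's contribution grows only as its weights are trained.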
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Kang Song, Qing Zhu, Hui Yang, Biao Wang, and Yanlan Wu "Fine-tuned Segment Anything Model based on dual-stage adapter for remote sensing rotate object detection", Proc. SPIE 13256, Fourth International Conference on Computer Vision and Pattern Analysis (ICCPA 2024), 1325603 (12 September 2024); https://doi.org/10.1117/12.3037831
KEYWORDS
Object detection
Remote sensing
Feature extraction
Education and training
Image segmentation
Visualization
Feature fusion
