Presentation + Paper
7 June 2024
Multi-objective optimization with homotopy-based strategies for enhanced multimodal automatic target recognition models
Abstract
Automatic Target Recognition (ATR) often confronts intricate visual scenes, necessitating models capable of discerning subtle nuances. Real-world datasets such as the Defense Systems Information Analysis Center (DSIAC) ATR database are unimodal, which hinders performance, and lack contextual information for each frame. To address these limitations, we enrich the DSIAC dataset by algorithmically generating captions and proposing new train/test splits, thereby creating a rich multimodal training landscape. To leverage these captions effectively, we explore the integration of a vision-language model, specifically Contrastive Language-Image Pre-training (CLIP), which combines visual perception with linguistic descriptors. At the core of our methodology lies a homotopy-based multi-objective optimization technique, designed to balance model precision, generalizability, and interpretability. Our framework, developed using PyTorch Lightning and Ray Tune for distributed hyperparameter optimization, enhances models to meet the demands of practical ATR applications. All code and data are available at https://github.com/sabraha2/ATR-CLIP-Multi-Objective-Homotopy-Optimization.
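As a rough illustration of the homotopy idea mentioned in the abstract, the sketch below blends two objectives with a parameter t that moves from 0 to 1 and follows the minimizer along the way. It is not the authors' implementation: the two quadratic objectives, the function names homotopy_blend and track_minimizer, and all step sizes are assumptions chosen only to keep the example small and checkable. The abstract does not specify how the actual objectives are combined.

```python
def homotopy_blend(a, b, t):
    """Convex combination: returns a at t=0 and b at t=1 (hypothetical helper)."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return (1.0 - t) * a + t * b


def track_minimizer(steps=50, inner_iters=200, lr=0.1):
    """Follow the minimizer of H(w, t) = (1 - t) * (w - 1)^2 + t * (w + 2)^2.

    The toy objectives (w - 1)^2 and (w + 2)^2 are assumptions for this
    sketch. Because the blend is linear, the gradient of the blended
    objective is the blend of the two gradients.
    """
    w = 0.0
    path = []
    for k in range(steps + 1):
        t = k / steps
        # Warm start: each stage begins from the previous stage's solution.
        for _ in range(inner_iters):
            grad = homotopy_blend(2.0 * (w - 1.0), 2.0 * (w + 2.0), t)
            w -= lr * grad
        path.append((t, w))
    return path


if __name__ == "__main__":
    for t, w in track_minimizer()[::10]:
        print(f"t={t:.1f}  w={w:.4f}")
```

For these toy objectives the minimizer at each stage is w = 1 - 3t, so the tracked solution moves continuously from the first objective's optimum to the second's. In a training setting the same schedule could weight two loss terms, but which terms and which schedule to use would depend on the paper's actual method.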
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Sophia Abraham, Steve Cruz, Suya You, Jonathan D. Hauenstein, and Walter Scheirer "Multi-objective optimization with homotopy-based strategies for enhanced multimodal automatic target recognition models", Proc. SPIE 13039, Automatic Target Recognition XXXIV, 1303903 (7 June 2024); https://doi.org/10.1117/12.3012394
KEYWORDS
Education and training
Automatic target recognition
Data modeling
Visual process modeling
Machine learning
Visualization
Performance modeling