This PDF file contains the front matter associated with SPIE Proceedings Volume 11729, including the Title Page, Copyright information, and Table of Contents.
Digital Elevation Model (DEM) production is one of the most time-consuming tasks in digital photogrammetry. By applying machine learning to digital photogrammetry, our Intelligent Photogrammetry can significantly reduce the cost of DEM production from Digital Surface Models (DSMs), which are generated from satellite images, aerial images, Unmanned Aerial Vehicle (UAV) images, sparse LiDAR 3-D point clouds, and dense LiDAR 3-D point clouds. There are various types of DSM, each with different post spacing and accuracy. The following 3-D models have been trained for the different DSM types: 1. 3DLargeBuildingModel, 2. 3DBuildingModel, 3. 3DHouseModel, 4. 3DTreeModel, and 5. 3DGroundPointModel. The first four models detect above-ground 3-D objects and remove them from the DSM to generate the DEM. The last model classifies 3-D points into thirteen categories, which are then used to generate the DEM in difficult terrain such as dense forestry areas, where the ground is mostly unseen. The main cost of DEM production using DSMs generated from satellite images in difficult terrain is the transformation from DSM to DEM. Traditional handcrafted bare-earth algorithms for DSM-to-DEM transformation cannot handle the many different cases encountered in general-purpose applications and big data. Intelligent Photogrammetry, based on machine learning, can handle different cases by adding training samples. In this case study, the city of San Diego was used: the DEM generated by Intelligent Photogrammetry from stereo satellite images achieved a Root Mean Square Error (RMSE) of 0.95 meters. The case study indicates that Intelligent Photogrammetry can reduce DEM production cost by more than 50%. The most time-consuming component of DEM production is dense forestry areas; in this case study the forest height reaches 19 meters, leaving the ground nearly invisible. This issue was resolved with the 3DGroundPointModel, achieving an RMSE of 2.40 meters and meeting the desired DEM accuracy requirement of 2 to 3 meters using stereo satellite images. DEM production from UAV images using our Intelligent Photogrammetry can achieve state-of-the-art accuracy: Intelligent Photogrammetry can identify and correct errors in DSMs generated from UAV images, providing a very competitive DEM generation capability for UAV imagery.
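As a rough illustration of the object-removal step described above (a minimal NumPy/SciPy sketch, not the trained 3-D models themselves; the mask, grid, and parameters are hypothetical), cells flagged as above-ground objects can be removed from the DSM and refilled by interpolating the surrounding ground cells:

    # Remove masked above-ground cells from a DSM and fill the gaps from ground cells.
    import numpy as np
    from scipy.interpolate import griddata

    def dsm_to_dem(dsm, object_mask):
        """dsm: 2-D elevation grid; object_mask: True where an above-ground object was detected."""
        rows, cols = np.indices(dsm.shape)
        ground = ~object_mask
        pts = np.column_stack([rows[ground], cols[ground]])
        # Interpolate ground elevations into the removed (object) cells.
        dem = griddata(points=pts, values=dsm[ground], xi=(rows, cols), method="linear")
        # Fall back to nearest-neighbour fill near the grid border.
        nearest = griddata(pts, dsm[ground], (rows, cols), method="nearest")
        return np.where(np.isnan(dem), nearest, dem)

    # Example: a flat 10 m plain with a 25 m "building" block removed.
    dsm = np.full((50, 50), 10.0)
    dsm[20:30, 20:30] = 25.0
    dem = dsm_to_dem(dsm, dsm > 15.0)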
In recent years, the field of artificial intelligence has shown significant progress in solving problems that seemed impossible just a few years ago: from the real-time autonomous control of super-pressure balloons in the stratosphere to reducing the energy required for cooling a data centre, and from training robots to acquire skills that generalize effectively to diverse real-world objects and situations to improving mobile phone battery use and screen brightness for millions of people. Underlying all these real-world success stories is a collection of methods known as deep reinforcement learning, which has proven superior at finding near-optimal policies of behavior. We envision that combining deep reinforcement learning with automatic target recognition, which extracts actionable military information from sensor data, will have a meaningful impact on the future of warfare.
Standard ATR algorithms suffer from a lack of transparency into why the algorithm recognized a particular object as a target. We present an enhanced Explainable ATR (XATR) algorithm that utilizes super-resolution networks to provide increased robustness. XATR is a two-level network: the lower level uses Region-based Convolutional Neural Networks (R-CNNs) to recognize major parts of the target, known as the vocabulary, while the upper level employs Markov Logic Networks (MLNs) and structure learning to learn the geometric and spatial relationships between the parts in the vocabulary that best describe the objects. Image degradation due to noise, blurring, decimation, etc., can severely impact XATR performance, as feature content is irrevocably lost. We address this by introducing a novel super-resolution network based on a dynamic u-net design: a ResNet serves as the encoder path, while the imagery is reconstructed with dynamically linked upsampling heads in the decoder path. The network is trained on pairs of high-resolution and degraded imagery to super-resolve the degraded imagery. The trained dynamic u-net then super-resolves unseen degraded imagery, recovering XATR performance that would otherwise be lost on the degraded inputs. In this paper, we perform experiments to 1) determine the sensitivity of XATR to image corruption, 2) improve XATR performance with super-resolution, and 3) demonstrate XATR robustness to image degradation and occlusion. Our experiments demonstrate improved recall (+40%) and accuracy (+20%) on degraded images when super-resolution is applied.
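A minimal PyTorch sketch of a u-net-style super-resolution network with a ResNet encoder and learned upsampling heads follows; this is an illustrative stand-in, not the paper's dynamic u-net, and the layer choices and sizes are assumptions (torchvision >= 0.13 is assumed for the weights argument):

    import torch
    import torch.nn as nn
    from torchvision.models import resnet18

    class TinySRUNet(nn.Module):
        def __init__(self):
            super().__init__()
            backbone = resnet18(weights=None)
            self.stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu)  # /2, 64 ch
            self.enc1 = nn.Sequential(backbone.maxpool, backbone.layer1)            # /4, 64 ch
            self.enc2 = backbone.layer2                                              # /8, 128 ch
            self.up1 = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)          # back to /4
            self.up2 = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)          # back to /2
            self.up3 = nn.ConvTranspose2d(128, 32, kernel_size=2, stride=2)          # full resolution
            self.head = nn.Conv2d(32, 3, kernel_size=3, padding=1)

        def forward(self, x):
            s0 = self.stem(x)                      # (B, 64, H/2, W/2)
            s1 = self.enc1(s0)                     # (B, 64, H/4, W/4)
            s2 = self.enc2(s1)                     # (B, 128, H/8, W/8)
            d1 = torch.cat([self.up1(s2), s1], 1)  # skip connection at /4
            d2 = torch.cat([self.up2(d1), s0], 1)  # skip connection at /2
            return self.head(self.up3(d2))         # reconstructed full-resolution image

    degraded = torch.randn(1, 3, 128, 128)
    restored = TinySRUNet()(degraded)              # (1, 3, 128, 128)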
Deep learning-based classification of objects in overhead imagery is a difficult problem due to the low to moderate available resolution as well as the wide range of scale between objects. Traditional machine learning object classification techniques yield sub-optimal results in this scenario, and new techniques have been developed to optimize performance. Our Lockheed Martin team has developed data pre-processing techniques, such as context masking and uniform rotation, which improve classifier performance in this application. Additionally, we have demonstrated that shallow classifier models perform at least as well as deeper models in this paradigm, allowing for fast training and inference times.
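The two pre-processing ideas can be illustrated with a short sketch, assuming each training chip comes with an object bounding box (the box, window sizes, and angle count below are hypothetical, not the values used by the team):

    # "Context masking" suppresses pixels outside the object box; "uniform rotation"
    # produces rotated copies of the chip at evenly spaced angles.
    import numpy as np
    from scipy.ndimage import rotate

    def context_mask(chip, box):
        """box = (row0, col0, row1, col1); pixels outside the box are zeroed."""
        masked = np.zeros_like(chip)
        r0, c0, r1, c1 = box
        masked[r0:r1, c0:c1] = chip[r0:r1, c0:c1]
        return masked

    def uniform_rotations(chip, n_angles=8):
        angles = np.linspace(0.0, 360.0, n_angles, endpoint=False)
        return [rotate(chip, angle, reshape=False, order=1) for angle in angles]

    chip = np.random.rand(64, 64)
    augmented = uniform_rotations(context_mask(chip, (16, 16, 48, 48)))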
In this paper, we present preliminary results of infrared target detection using a target-to-clutter-based deep learning network (TCRnet). We augment this network with a separate processing path to render a new Directed Acyclic Graph network (TCRDAG) amenable to transfer learning, which is used to adapt the network to new observations. The ROC curve shows significant improvement, particularly on its right side. We further explore a boosting paradigm to improve the left side of the ROC curve. We then present results on a publicly available MWIR dataset released by NVESD.
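A generic PyTorch transfer-learning sketch illustrates the adaptation step in spirit; the backbone and head below are stand-ins, not the TCRnet/TCRDAG architecture, and all sizes are assumptions:

    # Freeze a pretrained feature extractor and fine-tune only a new scoring head
    # on chips from the new observations.
    import torch
    import torch.nn as nn

    backbone = nn.Sequential(                      # stand-in for a pretrained feature extractor
        nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )
    for p in backbone.parameters():
        p.requires_grad = False                    # keep features learned on the source data

    head = nn.Linear(32, 2)                        # new target / clutter scoring head
    optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    new_chips = torch.randn(16, 1, 40, 40)         # chips from the new sensor / collection
    new_labels = torch.randint(0, 2, (16,))
    for _ in range(5):                             # adapt only the head to the new observations
        optimizer.zero_grad()
        loss = loss_fn(head(backbone(new_chips)), new_labels)
        loss.backward()
        optimizer.step()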
Neural network approaches have periodically been explored in the pursuit of high-performing SAR ATR solutions. With deep neural networks (DNNs) now offering many state-of-the-art solutions to computer vision tasks, neural networks are once again being revisited for ATR processing. Here, we characterize and explore a suite of neural network architectural topologies. In doing so, we assess how different architectural approaches impact performance and consider the associated computational costs. This includes characterizing network depth, width, scale, and connectivity patterns, as well as convolution layer optimizations. We have explored a suite of architectural topologies applied to both the canonical MSTAR dataset and the more operationally realistic Synthetic and Measured Paired and Labeled Experiment (SAMPLE) dataset. The latter pairs high-fidelity computational models of targets with actual measured SAR data. Effectively, this dataset offers the ability to train a DNN on simulated data and test the network performance on measured data. Not only does our in-depth architecture topology analysis offer insight into how different architectural approaches impact performance, but we have also trained DNNs attaining state-of-the-art performance on both datasets. Furthermore, beyond accuracy, we also assess how efficiently an accelerator architecture executes these neural networks. Specifically, using an analytical assessment tool, we forecast energy and latency for an edge-TPU-like architecture. Taken together, this tradespace exploration offers insight into the interplay of accuracy, energy, and latency for executing these networks.
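A minimal sketch of the kind of depth/width sweep described above, using a small parameterized PyTorch CNN; the specific depths, widths, and the parameter count used as a cost proxy are illustrative assumptions:

    import torch
    import torch.nn as nn

    def make_cnn(depth, width, num_classes=10):
        layers, in_ch = [], 1  # single-channel SAR chips assumed
        for i in range(depth):
            out_ch = width * (2 ** i)
            layers += [nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)]
            in_ch = out_ch
        layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(in_ch, num_classes)]
        return nn.Sequential(*layers)

    for depth, width in [(2, 16), (4, 16), (4, 32)]:           # points in the tradespace
        model = make_cnn(depth, width)
        n_params = sum(p.numel() for p in model.parameters())  # crude proxy for compute/energy cost
        logits = model(torch.randn(8, 1, 64, 64))              # (8, 10)
        print(depth, width, n_params, logits.shape)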
T-distributed Stochastic Neighbor Embedding (t-SNE) has become an extremely popular algorithm for low-dimensional visualization of high-dimensional data. While it is acknowledged that it is highly sensitive to its parameters, it continues to be used extensively by the machine learning community, with 'intuition' an accepted basis for embedding selection. In this paper, we illustrate and explain why t-SNE is not a distance-preserving algorithm but rather an order-preserving one, with the cardinality of the order proportional to the perplexity parameter. We compare and contrast t-SNE with Sammon nonlinear mappings locally using Kruskal stress and Spearman rank correlation measures.
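The two comparison measures can be sketched compactly with scikit-learn and SciPy; the data, sample size, and perplexity below are arbitrary placeholders rather than the paper's experimental settings:

    # Kruskal stress measures how well embedded distances reproduce original distances;
    # Spearman rank correlation measures how well their ordering is preserved.
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr
    from sklearn.manifold import TSNE

    def kruskal_stress(d_high, d_low):
        return float(np.sqrt(np.sum((d_high - d_low) ** 2) / np.sum(d_high ** 2)))

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 50))                          # high-dimensional data
    Y = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

    d_high, d_low = pdist(X), pdist(Y)
    print("stress-1:", kruskal_stress(d_high, d_low))       # distance preservation
    print("spearman:", spearmanr(d_high, d_low)[0])         # order preservation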
Deep learning has become the leading approach to assisted target recognition. While these methods typically require large amounts of labeled training data, domain adaptation (DA) or transfer learning (TL) enables these algorithms to transfer knowledge from a labeled (source) data set to an unlabeled but related (target) data set of interest. DA enables networks to overcome the distribution mismatch between source and target that leads to poor generalization in the target domain. DA techniques align these distributions by minimizing a divergence measure between source and target, making the transfer of knowledge from source to target possible. While these algorithms have advanced significantly in recent years, most do not explicitly leverage global data manifold structure when aligning the source and target. We propose to leverage global data structure by applying a topological data analysis (TDA) technique called persistent homology to TL. In this paper, we examine the use of persistent homology in a domain adversarial (DAd) convolutional neural network (CNN) architecture. The experiments show that aligning persistence alone is insufficient for transfer; it must be considered along with the lifetimes of the topological singularities. In addition, we found that longer lifetimes indicate robust discriminative features and more favorable structure in the data. We found that existing divergence-minimization-based approaches to DA improve the topological structure relative to a baseline without these regularization techniques. We hope these experiments highlight how topological structure can be leveraged to boost performance in TL tasks.
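A brief, hedged sketch of the lifetime computation, assuming the ripser.py package; the feature batch below is synthetic and the lifetime-based regularizer itself is not shown:

    # Compute persistence diagrams on a feature batch and summarise feature lifetimes.
    import numpy as np
    from ripser import ripser

    rng = np.random.default_rng(0)
    features = rng.normal(size=(128, 16))           # e.g. a batch of CNN feature vectors

    diagrams = ripser(features, maxdim=1)["dgms"]   # H0 and H1 persistence diagrams
    for dim, dgm in enumerate(diagrams):
        finite = dgm[np.isfinite(dgm[:, 1])]
        if len(finite) == 0:
            continue
        lifetimes = finite[:, 1] - finite[:, 0]     # death - birth; longer = more robust structure
        print(f"H{dim}: mean lifetime {lifetimes.mean():.3f}, max {lifetimes.max():.3f}")
    # A lifetime-based penalty could then be added to the domain-adversarial training loss.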
Folk wisdom on the subject of human knowledge holds that it is “better to know nothing than to know what ain’t so” [1]. In some circumstances that precept may be particularly important. If the negative consequences of false knowledge are sufficiently severe, we may be willing to forgo the benefits of knowing some facts to avoid the dangers of believing “facts” that are incorrect. This is the foundation of the “innocent until proven guilty” system of justice. According to English jurist William Blackstone, “it is better that ten guilty persons escape than that one innocent suffer” [2]. Similar principles apply wherever it is especially harmful to act upon false beliefs. If we wish to employ machine learning as an aid to human judgment, it may in some cases be advisable to insist upon near-certainty from the machine’s reported results. One way of working towards this goal is to use an ensemble of different agents: if their results are consistent with each other, we can have greater confidence in their overall reliability. We can also set a threshold value for the average confidence of the agents themselves. This paper explores the decision-making process for developing an ensemble of classifiers and evaluates the results in the context of an example set. This set is appropriately categorized by a hierarchical structure, which permits less-specific judgments to be made if confidence falls below our predetermined threshold. We examine the tradeoffs to be made when setting parameters, and discuss aligning them with overarching requirements.
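An illustrative sketch (not the paper's exact procedure) of an ensemble that reports a specific class only under unanimous, high-confidence agreement and otherwise falls back to a parent category in a hypothetical hierarchy:

    from collections import Counter

    PARENT = {"sedan": "vehicle", "truck": "vehicle", "vehicle": "object"}  # hypothetical hierarchy

    def ensemble_decision(votes, threshold=0.9):
        """votes: list of (label, confidence) pairs from the individual agents."""
        labels = [label for label, _ in votes]
        top, count = Counter(labels).most_common(1)[0]
        mean_conf = sum(conf for label, conf in votes if label == top) / count
        if count == len(votes) and mean_conf >= threshold:
            return top                               # unanimous and confident: specific class
        return PARENT.get(top, "unknown")            # otherwise report the less-specific parent

    print(ensemble_decision([("sedan", 0.95), ("sedan", 0.93), ("sedan", 0.97)]))  # sedan
    print(ensemble_decision([("sedan", 0.95), ("truck", 0.90), ("sedan", 0.60)]))  # vehicle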
Semantic segmentation is an important task in computer vision that aims to infer pixel-level semantic label information in images. Recently, significant progress has been made in deep neural network-driven segmentation techniques, but these often require a large number of labels to supervise neural network training. Obtaining a sufficient number of labeled training samples is challenging and sometimes impractical in real-world applications. This paper studies semi-supervised semantic segmentation via an image-to-image (I2I) translation technique. I2I is an emerging technique that maps an image from one domain to another. We uniquely treat image semantic segmentation as an I2I translation task that infers semantic labels of objects (target domain) from the input image (source domain) in a weakly supervised way. In particular, we develop a two-pass I2I strategy that combines images with real and pseudo labels for semi-supervised model learning. The first pass uses unsupervised models to generate pseudo labels that are combined with the inputs to form pseudo-labeled samples. Since the pseudo-labeled images may undermine the quality of the model, they have to be specifically constrained during training by a noise-correction framework to ensure good performance. We then boost performance by incorporating both the real and the pseudo-labeled samples into the second pass to train a model based on a supervised architecture. Our goal is to bridge the gap between supervised and unsupervised learning for semantic object segmentation in practice. Extensive evaluations are conducted to demonstrate the efficiency and effectiveness of the proposed technique.
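A condensed PyTorch sketch of the two-pass idea, with a trivial stand-in model and a fixed down-weighting of pseudo labels in place of the paper's noise-correction framework (all sizes and weights are placeholders):

    import torch
    import torch.nn as nn

    seg_model = nn.Conv2d(3, 5, kernel_size=1)             # stand-in for a real segmentation network
    optimizer = torch.optim.Adam(seg_model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    labeled = [(torch.randn(2, 3, 32, 32), torch.randint(0, 5, (2, 32, 32)))]
    unlabeled = [torch.randn(2, 3, 32, 32)]

    # Pass 1: pseudo-label the unlabeled images with the current (weak) model.
    with torch.no_grad():
        pseudo = [(x, seg_model(x).argmax(dim=1)) for x in unlabeled]

    # Pass 2: supervised training on real labels plus down-weighted pseudo labels.
    for (x, y), weight in [(b, 1.0) for b in labeled] + [(b, 0.3) for b in pseudo]:
        optimizer.zero_grad()
        loss = weight * loss_fn(seg_model(x), y)
        loss.backward()
        optimizer.step()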
Deep metric learning is an approach to establishing a distance metric between data points to measure their similarity. Metric learning methods map an input image to a representative feature space where semantically similar samples are close together and dissimilar examples are far apart. The use of metric learning in fine-grained recognition has been widely studied in recent years. Fine-grained recognition (FGR) focuses on categorizing hard-to-distinguish classes such as bird species and car models. In FGR datasets, the intra-class variance is high while the inter-class variance is low. This makes them challenging to annotate, leading to erroneous labels. Especially in defense applications, labeling the data is quite costly because this work must be done by experts. The performance of metric learning methods is directly related to the loss function used during model training. Loss functions are divided into two categories: pair-based and proxy-based approaches. A proxy is a representation of a distribution in feature space. While pair-based loss functions utilize data-to-data relations, proxy-based loss functions exploit data-to-proxy relations. In this paper, we analyze the effect of label noise on open-set fine-grained recognition performance. The pair-based and proxy-based methods are evaluated on three widely adopted benchmark datasets: CUB 200-2011, Stanford Cars 196, and FGVC Aircraft.
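As one representative of the proxy-based family discussed above, a minimal Proxy-NCA-style loss can be sketched in PyTorch; dimensions and class count are placeholders, not the benchmark settings:

    # Each class owns a learned proxy vector; samples are pulled toward their own
    # class proxy and pushed away from all other proxies.
    import torch
    import torch.nn.functional as F

    def proxy_nca_loss(embeddings, labels, proxies):
        """embeddings: (B, D); labels: (B,); proxies: (C, D), learnable."""
        emb = F.normalize(embeddings, dim=1)
        pxy = F.normalize(proxies, dim=1)
        dist = torch.cdist(emb, pxy) ** 2                        # squared distance to every class proxy
        pos = dist.gather(1, labels.unsqueeze(1)).squeeze(1)     # distance to the own-class proxy
        others = dist.masked_fill(F.one_hot(labels, proxies.size(0)).bool(), float("inf"))
        return (pos + torch.logsumexp(-others, dim=1)).mean()    # pull to own proxy, push from the rest

    proxies = torch.nn.Parameter(torch.randn(100, 128))          # one proxy per fine-grained class
    embeddings = torch.randn(32, 128, requires_grad=True)
    labels = torch.randint(0, 100, (32,))
    proxy_nca_loss(embeddings, labels, proxies).backward()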
The increased availability of UAVs creates possibilities for novel applications but also raises safety issues regarding mass events or safety-sensitive infrastructure. Thus, the demand for automated UAV detection systems that allow for early alert generation is increasing. Such systems often rely on electro-optical imagery and deep learning to detect UAVs. However, the absence of a large and diverse dataset for training may result in an error-prone learning process. In this work, we investigate how far these issues can be mitigated without relying on extra data, which is often costly to obtain and annotate. We thus evaluate and demonstrate the impact of different data augmentation strategies used to enhance our available training data, and we evaluate how the different methods increase the robustness of several state-of-the-art deep learning based detectors. In particular, we focus our evaluation on false alarms caused by distractor objects or by complex backgrounds.
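One possible augmentation pipeline, sketched with torchvision transforms on image tensors; the specific transforms and strengths are assumptions rather than the paper's chosen strategies, and box-aware augmentation of the detector labels is omitted for brevity:

    import torch
    from torchvision import transforms

    augment = transforms.Compose([
        transforms.RandomHorizontalFlip(p=0.5),
        transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.3),
        transforms.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),
        transforms.RandomErasing(p=0.3),          # occlusion / distractor-like patches
    ])

    frame = torch.rand(3, 512, 512)               # a video frame containing a small UAV
    augmented_frames = [augment(frame) for _ in range(4)]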
This paper describes an action recognition method based on the 3D local binary dense micro-block difference. The proposed algorithm is a three-stage procedure: (a) image preprocessing using a 3D Gabor filter, (b) descriptor calculation using the 3D local binary dense micro-block difference with skeleton points, and (c) SVM classification. The proposed algorithm is based on capturing 3D sub-volumes located inside a video sequence patch and calculating the difference in intensities between these sub-volumes. To emphasize motion, we use convolution with a bank of arbitrarily oriented 3D Gabor filters. We then calculate local features for the pre-processed frames, such as the 3D local binary dense micro-block difference (3D LBDMD). We evaluate the proposed approach on the UCF101 database. Experimental results demonstrate the effectiveness of the proposed approach on video with stochastic texture backgrounds, with comparisons to state-of-the-art methods.
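A toy sketch of the sub-volume difference idea: small 3-D blocks are sampled inside a (Gabor-filtered) video patch and the signs and values of their mean-intensity differences form a local descriptor; block sizes and pair count are placeholders, not the paper's settings:

    import numpy as np

    def micro_block_descriptor(patch, block=(2, 4, 4), n_pairs=32, rng=None):
        """patch: (T, H, W) video patch; returns a binary code and raw sub-volume differences."""
        rng = rng or np.random.default_rng(0)
        t, h, w = patch.shape
        bt, bh, bw = block

        def block_mean():
            z = rng.integers(0, t - bt)
            y = rng.integers(0, h - bh)
            x = rng.integers(0, w - bw)
            return patch[z:z + bt, y:y + bh, x:x + bw].mean()

        diffs = np.array([block_mean() - block_mean() for _ in range(n_pairs)])
        return (diffs > 0).astype(np.uint8), diffs

    video_patch = np.random.rand(8, 32, 32)        # e.g. a Gabor-filtered spatio-temporal patch
    code, diffs = micro_block_descriptor(video_patch)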
The multidisciplinary area of geospatial intelligence (GEOINT) is continually changing and becoming more complex. From efforts to automate portions of GEOINT using machine learning, which augment the analyst and improve exploitation, to optimizing the growing number of sources and variables, there is no denying that the strategies involved in this collection method are rapidly progressing. The unique and inherent complexities involved in imagery analysis from an overhead perspective (e.g., target resolution, imaging band(s), and imaging angle) test the ability of even the most developed and novel machine learning techniques. To support advancement in the application of object detection to overhead imagery, we have developed a spin-set augmentation method that leverages synthetic data generation capabilities to augment the training data sets. We then test this method with the popular object detection deep network YOLOv4. This paper analyzes the synthetic augmentation method in terms of algorithm detection performance, computational complexity, and generalizability.
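A simplified sketch of spin-set-style augmentation: a synthetic object chip is rotated to evenly spaced angles, pasted into background imagery, and a normalized YOLO-style box label is emitted per instance (the chip, background, and angle count are hypothetical, not the generated data set):

    import numpy as np
    from scipy.ndimage import rotate

    def spin_set(background, chip, n_angles=8, rng=None):
        rng = rng or np.random.default_rng(0)
        samples = []
        bh, bw = background.shape
        for angle in np.linspace(0.0, 360.0, n_angles, endpoint=False):
            rotated = rotate(chip, angle, reshape=True, order=1)
            ch, cw = rotated.shape
            r0 = rng.integers(0, bh - ch)
            c0 = rng.integers(0, bw - cw)
            image = background.copy()
            image[r0:r0 + ch, c0:c0 + cw] = rotated          # paste the rotated synthetic object
            label = ((c0 + cw / 2) / bw, (r0 + ch / 2) / bh, cw / bw, ch / bh)  # (xc, yc, w, h)
            samples.append((image, label))
        return samples

    background = np.random.rand(512, 512)
    chip = np.ones((40, 24))                                  # hypothetical rendered object chip
    augmented = spin_set(background, chip)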
Many problems in defense and automatic target recognition (ATR) require concurrent detection and classification of objects of interest in wide field-of-view overhead imagery. Traditional machine learning approaches are optimized to perform either detection or classification individually; only recently have algorithms expanded to tackle both problems simultaneously. Even high-performing parallel approaches struggle to disambiguate tightly clustered objects, often relying on external techniques such as non-maximum suppression. We have developed a hybrid detection-classification approach that optimizes the segmentation of closely spaced objects, regardless of size, shape, and object diversity. This improves overall performance on both the detection and classification problems.
This paper describes a bare-earth algorithm based on Markov Random Field image segmentation. Many bare-earth algorithms exist that were developed for LiDAR. However, a new algorithm was needed to extract bare earth from point clouds produced by stereo-matching multi-view satellite imagery (called electro-optical (EO) point clouds). EO point clouds have characteristics that pose challenges distinct from LiDAR, such as substantially greater noise levels and missing data due to object occlusion. Despite these challenges, the algorithm accurately extracts bare earth from EO point clouds. Additionally, the algorithm is robust to sensor type, which was demonstrated by applying the algorithm to LiDAR surveys collected with different sensors. The algorithm is shown to be robust to different levels of urban development and terrain variability and achieves a 94% accuracy on average when compared to manually classified point clouds.
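A toy sketch of MRF-style ground labelling on a gridded surface, using iterated conditional modes (ICM) with a Potts smoothness prior; this is a didactic stand-in, not the paper's algorithm, and all thresholds and weights are assumptions:

    import numpy as np
    from scipy.ndimage import minimum_filter, convolve

    def bare_earth_labels(height_grid, thresh=2.0, beta=1.5, iters=5):
        hag = height_grid - minimum_filter(height_grid, size=15)   # height above local minimum
        unary = np.stack([hag, np.maximum(thresh - hag, 0.0)])     # cost of ground / non-ground
        labels = (hag > thresh).astype(int)
        kernel = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)
        for _ in range(iters):                                     # ICM with a Potts prior
            n_nonground = convolve(labels.astype(float), kernel, mode="nearest")
            cost_ground = unary[0] + beta * n_nonground            # disagreeing neighbours are penalised
            cost_nonground = unary[1] + beta * (4 - n_nonground)
            labels = (cost_nonground < cost_ground).astype(int)
        return labels                                              # 0 = bare earth, 1 = above-ground

    dsm = np.random.rand(100, 100) * 0.5
    dsm[40:60, 40:60] += 8.0                                       # a noisy "building" block
    ground_mask = bare_earth_labels(dsm) == 0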
Estimating building height from satellite imagery is important for digital surface modeling while also providing rich information for change detection and building footprint detection. The acquisition of building height usually requires a LiDAR system, which is not often available in many satellite systems. In this paper, we describe a building height estimation method that does not require building height annotation. Our method estimates building height using building shadows and satellite image metadata given a single RGB satellite image. To reduce the data annotation needed, we design a multi-stage instance detection method for building and shadow detection with both supervised and semi-supervised training. Given the detected building and shadow instances, we can then estimate the building height with satellite image metadata. Building height estimation is done by maximizing the overlap between the projected shadow region given a query height and the detected shadow region. We evaluate our method on the xView2 and Urban Semantic 3D datasets and show that the proposed method achieves accurate building detection, shadow detection, and height estimation.
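A simplified sketch of the height-by-shadow search: for each candidate height, the footprint is projected along the sun direction from the image metadata, and the height whose projected shadow best overlaps the detected shadow mask is kept (the geometry conventions, GSD, and height range below are assumptions):

    import numpy as np
    from scipy.ndimage import shift as nd_shift

    def estimate_height(footprint, shadow_mask, sun_azimuth_deg, sun_elev_deg,
                        gsd_m=0.5, heights=np.arange(3.0, 60.0, 0.5)):
        az = np.deg2rad(sun_azimuth_deg)
        best_h, best_iou = None, -1.0
        for h in heights:
            length_px = h / np.tan(np.deg2rad(sun_elev_deg)) / gsd_m      # shadow length in pixels
            drow, dcol = length_px * np.cos(az), length_px * np.sin(az)   # sign conventions are scene-dependent
            projected = nd_shift(footprint.astype(float), (drow, dcol), order=0) > 0.5
            inter = np.logical_and(projected, shadow_mask).sum()
            union = np.logical_or(projected, shadow_mask).sum()
            iou = inter / union if union else 0.0
            if iou > best_iou:
                best_h, best_iou = h, iou
        return best_h, best_iou

    footprint = np.zeros((200, 200), bool)
    footprint[90:110, 90:110] = True                 # detected building footprint
    shadow = np.zeros((200, 200), bool)
    shadow[110:150, 90:110] = True                   # detected shadow region
    print(estimate_height(footprint, shadow, sun_azimuth_deg=0.0, sun_elev_deg=45.0))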
Remote sensing is considered an increasingly important technology for the maritime ecosystem. The aim of this paper is to show how to detect and map oil spills in the Cyprus region using freely available Sentinel-1 SAR imagery. Oil spills are automatically detected using the Oil Spill Detection tool of the Sentinel Application Platform (SNAP). The methodology is applied to several satellite images to identify the effects of an oil spill compared to clean sea conditions. The comparison was made using two different polarizations: co-polarization (VV) and cross-polarization (VH). The preliminary results show that Sentinel-1 SAR data can provide effective results and spatial information on oil spill detection to decision-makers.
Change detection between two temporal scenes of overhead imagery is a common problem in many applications of computer vision and image processing. Traditional change detection techniques only provide pixel-level detail of change and are sensitive to noise and variations in images such as lighting, season, and perspective. We propose a deep learning approach that exploits a segmentation detector and classifier to perform object-level change detection. This allows us to create class-level segmentation masks for a pair of images collected from the same location at different times. This pair of segmentation masks can be compared to detect altered objects, providing a detailed report to a user on which objects in a scene have changed.
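A minimal sketch of object-level change reporting from a pair of per-pixel class masks predicted at two times; connected components with little support in the other epoch are reported as appeared or removed (the class names and the support threshold are placeholders):

    import numpy as np
    from scipy.ndimage import label

    def report_changes(mask_t0, mask_t1, class_names):
        changes = []
        for cls, name in list(enumerate(class_names))[1:]:            # skip background
            for src, dst, kind in [(mask_t0, mask_t1, "removed"), (mask_t1, mask_t0, "appeared")]:
                components, n = label(src == cls)
                for i in range(1, n + 1):
                    region = components == i
                    # report an object if it has (almost) no support in the other epoch
                    if (dst[region] == cls).mean() < 0.1:
                        changes.append((name, kind, int(region.sum())))
        return changes

    t0 = np.zeros((64, 64), int)
    t0[10:20, 10:20] = 1            # a "building" present at time 0
    t1 = np.zeros((64, 64), int)
    t1[40:50, 40:50] = 1            # a different building at time 1
    print(report_changes(t0, t1, ["background", "building"]))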
The increasing availability of drones and their flexible employment for surveillance tasks will lead to large amounts of aerial video data in the near future. Similar to camera network data, such large data volumes pose a challenge when fast analysis of the data is required, for example after a security incident. Key automated tasks that can help make the data more easily navigable are detection and re-identification of persons. While both tasks pose a challenge in themselves, the combination of both, often called person search, can be of the greatest benefit to analysts. In this work, we address the task of person search in aerial images on the newly available P-DESTRE dataset. In particular, our work aims at investigating the suitability of existing methods for person detection and re-identification in the aerial domain and at examining how top-performing methods can be combined to realize an aerial person search system. Besides evaluating the individual components, we focus on analyzing the interplay between the detection and re-identification methods. In particular, we look at whether errors from the detection stage, such as misaligned detections or false positive detections, strongly affect re-identification accuracy.
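A schematic sketch of the person-search glue logic: a detector proposes person boxes, a re-identification network embeds each crop, and the gallery crop closest to the query embedding is returned; the detector and embedder below are placeholders, not the evaluated models:

    import torch
    import torch.nn.functional as F

    def detect_persons(frame):                       # placeholder detector: (x0, y0, x1, y1) boxes
        return [(10, 10, 60, 130), (200, 40, 250, 160)]

    embedder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.LazyLinear(128))  # placeholder re-id net

    def search(query_crop, frames):
        query = F.normalize(embedder(query_crop.unsqueeze(0)), dim=1)
        best = None
        for f_idx, frame in enumerate(frames):
            for x0, y0, x1, y1 in detect_persons(frame):
                crop = F.interpolate(frame[:, y0:y1, x0:x1].unsqueeze(0), size=(128, 64))
                sim = F.cosine_similarity(query, F.normalize(embedder(crop), dim=1)).item()
                if best is None or sim > best[0]:
                    best = (sim, f_idx, (x0, y0, x1, y1))
        return best                                   # (similarity, frame index, box)

    frames = [torch.rand(3, 480, 640) for _ in range(2)]
    query = torch.rand(3, 128, 64)
    print(search(query, frames))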
Deep-learning-based automatic image decision systems are increasingly relied on to analyze imagery that was previously only viewed and interpreted by humans. Here, we present a real-time method of validating the quality of images input to image decision systems. We focus on the detection of concealed contraband in millimeter-wave (MMW) images of screened people, but the method is general enough to be useful for other applications, such as medical image analysis systems. In applications of such critical importance, it is imperative that automatic target recognition (ATR) algorithms behave predictably and robustly. For example, an MMW system deployed in an airport could suffer from changes in image quality due to a variety of factors, such as partial hardware malfunctions, excessive vibration, or lack of maintenance or calibration. In such scenarios, it is desirable to detect changes in image quality immediately when they happen. We investigate the performance of a deep-learning-based ATR when fed with variable-quality input images and describe a first-of-its-kind method to validate the quality of images input to an ATR. The real-time method uses statistical measurements intrinsic to natural images to assess the similarity between an input image and a set of training images. We show, through multiple experiments, that ATR performance is poorer for images whose quality differs from that of the training set, whether the quality is better or worse. The method is successfully demonstrated as a training-free validation tool for ATR algorithms using two state-of-the-art deep-learning architectures.
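A simplified sketch of statistics-based input validation: a few summary statistics are collected from the training images, and a new image is flagged when its statistics fall far from that reference distribution. The real method uses richer natural-image statistics; the measures and threshold here are assumptions:

    import numpy as np
    from scipy.ndimage import laplace, uniform_filter

    def image_stats(img):
        local_mean = uniform_filter(img, size=7)
        local_contrast = np.sqrt(uniform_filter((img - local_mean) ** 2, size=7))
        return np.array([local_contrast.mean(), laplace(img).var()])

    def fit_reference(train_images):
        stats = np.array([image_stats(im) for im in train_images])
        return stats.mean(axis=0), stats.std(axis=0) + 1e-8

    def is_valid(img, ref_mean, ref_std, k=3.0):
        z = np.abs((image_stats(img) - ref_mean) / ref_std)
        return bool(np.all(z < k))                        # reject images far from the training statistics

    rng = np.random.default_rng(0)
    train = [rng.random((128, 128)) for _ in range(20)]
    mu, sigma = fit_reference(train)
    blurred = uniform_filter(train[0], size=15)           # simulated quality degradation
    print(is_valid(train[0], mu, sigma), is_valid(blurred, mu, sigma))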
Object detection AIs enable robust solutions for fast, automated detection of anomalies in operating environments such as airfields. Implementation of AI solutions requires training models on a large and diverse corpus of representative training data. To reliably detect craters and other damage on airfields, the AI must be trained on a large, varied, and realistic set of images of craters and other damage. The current method for obtaining this training data is to set explosives in the concrete surface of a test airfield to create actual damage and to record images of this real damage. This approach is extremely expensive and time-consuming, results in relatively little data representing just a few damage cases, and does not adequately represent damage to UXO and other artifacts that are detected. To address this paucity of training data, we have begun development of a training data generation and labeling pipeline that leverages Unreal Engine 4 to create realistic synthetic environments populated with realistic damage and artifacts. We have also developed a system for automatic labeling of the detection segments in synthetic training images, in order to provide relief from the tedious and time-consuming process of manually labeling segments in training data and to eliminate the human errors incurred by manual labeling. We present comparisons of the performance of two object detection AIs trained on real and synthetic data and discuss the cost and schedule savings enabled by the automated labeling system used for labeling of detection segments.
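A small sketch of the automatic-labeling idea, assuming the renderer also emits a per-pixel instance mask: bounding-box labels for each rendered damage instance can be derived directly from the mask, with no manual annotation (the class ids below are hypothetical):

    import numpy as np
    from scipy.ndimage import label, find_objects

    def boxes_from_instance_mask(mask):
        """mask: integer array, 0 = background, >0 = damage/artifact class id."""
        boxes = []
        for class_id in np.unique(mask[mask > 0]):
            components, n = label(mask == class_id)
            for sl in find_objects(components):
                r0, r1 = sl[0].start, sl[0].stop
                c0, c1 = sl[1].start, sl[1].stop
                boxes.append({"class_id": int(class_id), "box": (c0, r0, c1, r1)})
        return boxes

    synthetic_mask = np.zeros((256, 256), int)
    synthetic_mask[40:80, 60:120] = 1        # a rendered crater
    synthetic_mask[150:170, 30:55] = 2       # a rendered artifact
    print(boxes_from_instance_mask(synthetic_mask))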
Experimental results are presented from an investigation that evaluated the effects of introducing degraded imagery into the training and test sets of an algorithm. Degradation consisted of various applied MTFs (blur) and noise profiles. The hypothesis was that the introduction of degraded imagery into the training set would increase the algorithm's accuracy when degraded imagery was present in the test set. Preliminary experimentation confirmed this hypothesis, with some additional observations regarding robustness and feature selection for degraded imagery. Further investigations are suggested to advance this work, including increased variety of objects for classification, additional wave bands, and randomized degradations.
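A short sketch of the kind of degradation used in such experiments: a Gaussian blur standing in for an applied MTF, plus additive Gaussian noise, at randomized severities (the severity ranges are placeholders, not the study's profiles):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def degrade(img, rng, max_blur_sigma=2.0, max_noise_sigma=0.05):
        blurred = gaussian_filter(img, sigma=rng.uniform(0.0, max_blur_sigma))
        noisy = blurred + rng.normal(0.0, rng.uniform(0.0, max_noise_sigma), img.shape)
        return np.clip(noisy, 0.0, 1.0)

    rng = np.random.default_rng(42)
    clean = rng.random((64, 64))
    augmented_training_set = [degrade(clean, rng) for _ in range(8)]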
The focus of this paper is on fusing compressively sensed radar data to accomplish non-cooperative radar target identification. The primary motivation is to assess the benefit of fusing compressively sensed radar signals for target identification compared to systems that do not use compressive sensing and to systems that fuse individual sensor decisions instead of sensor data. The paper uses fusion techniques developed over the past decade to combine compressively sensed radar returns and then render a target classification decision. The paper shows the difference between fusing the radar data before making a decision and fusing target identification decisions made at the individual compressively sensed radar systems. Reconstructing the radar target down-range profile from a fusion of compressively sensed data is also examined. Alternatively, scattering centers are extracted at each separate radar system and fused as features for a radar target recognition system. The radar used in this study is a stepped-frequency radar. The radar returns examined represent the backscatter from four commercial aircraft models at various azimuth positions. The data may be corrupted by additive noise. The compressive sensing techniques rely on a random Gaussian measurement matrix, and the signal recovery uses the well-known orthogonal matching pursuit (OMP) method.
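A brief sketch of the single-sensor compressive-sensing chain: a sparse down-range profile is measured through a random Gaussian matrix and recovered with orthogonal matching pursuit, here via scikit-learn; sizes and noise level are placeholders, and the fusion step itself is not shown:

    import numpy as np
    from sklearn.linear_model import OrthogonalMatchingPursuit

    rng = np.random.default_rng(1)
    n, m, k = 256, 64, 5                          # profile length, measurements, scattering centers

    profile = np.zeros(n)                         # sparse down-range profile (scattering centers)
    profile[rng.choice(n, size=k, replace=False)] = rng.normal(1.0, 0.3, k)

    Phi = rng.normal(size=(m, n)) / np.sqrt(m)    # random Gaussian measurement matrix
    y = Phi @ profile + rng.normal(0.0, 0.01, m)  # noisy compressed measurements

    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False).fit(Phi, y)
    recovered = omp.coef_
    print("support recovered:", set(np.flatnonzero(recovered)) == set(np.flatnonzero(profile)))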
Advanced sensor platforms often contain a wide array of sensors in order to collect and process a diverse range of environmental data. Proper calibration of these sensors is important so that the collected data can be interpreted and fused into an accurate depiction of the environment. Traditionally, LiDAR-stereo camera calibration requires human assistance to manually extract point pairs between the LiDAR and the camera system. Here, we present a fully automated technique for calibrating a visible camera system with a 360° field-of-view LiDAR. This calibration is achieved by using the standard planar checkerboard calibration pattern to calculate the calibration parameters (intrinsic and extrinsic) for the stereo camera system. We then present a novel pipeline to determine an accurate rigid-body transformation between the LiDAR and the stereo camera coordinate systems with no additional experimental setup or human assistance. Our innovation lies in using the planarity of the checkerboard, whose surface coefficients can be estimated relative to the camera coordinates as well as the LiDAR sensor coordinates. We determine the rigid-body transformation between two sets of coefficients of the same calibration surface through least-squares minimization. We then refine the estimate through iterative closest point minimization between the 3D points on the checkerboard pattern viewed from the LiDAR and from the camera system. Using measurements from multiple views, we increase the confidence in the transformation estimate. The proposed method is less cumbersome and time-consuming, unifying the stereo camera and LiDAR-camera calibration in a single step using only one calibration pattern.
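A compact sketch of the rigid-body estimation step, illustrated with 3-D point correspondences on the calibration target and the closed-form SVD (Kabsch) least-squares solution; the paper's plane-coefficient formulation and ICP refinement are not reproduced here, and the synthetic data are placeholders:

    import numpy as np

    def rigid_transform(src, dst):
        """Find R, t minimizing ||R @ src_i + t - dst_i||^2 (src, dst: (N, 3) point sets)."""
        src_c, dst_c = src - src.mean(0), dst - dst.mean(0)
        U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # guard against reflections
        R = Vt.T @ D @ U.T
        t = dst.mean(0) - R @ src.mean(0)
        return R, t

    # Synthetic check: checkerboard points seen in the camera frame and in the LiDAR frame.
    rng = np.random.default_rng(0)
    cam_pts = rng.normal(size=(30, 3))
    true_R, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(true_R) < 0:
        true_R[:, 0] *= -1                                           # ensure a proper rotation
    lidar_pts = cam_pts @ true_R.T + np.array([0.2, -0.1, 1.5])
    R, t = rigid_transform(cam_pts, lidar_pts)
    print(np.allclose(cam_pts @ R.T + t, lidar_pts, atol=1e-6))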
Adaptive image filtering, which removes noise without blurring image discontinuities, is important for many image processing, pattern recognition, and computer vision applications. Much research, including anisotropic diffusion equation techniques, has been conducted to address adaptive image filtering. Traditional techniques usually use differential characteristics of images to determine the coefficients for adaptively filtering images. As is well known, differential characteristics are difficult to estimate, and the techniques used to compute them are usually sensitive to noise due to the intrinsic properties of derivatives. In this paper, we propose discrete Legendre polynomial based adaptive image filtering that effectively removes noise while preserving edge discontinuities. We use polynomial fitting errors to choose masks to achieve adaptivity. The fitting errors are computed by integrals (summations). This overcomes the derivative noise-sensitivity problem and allows us to achieve high performance.
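A 1-D toy sketch of the adaptivity mechanism: at each sample, a low-order Legendre polynomial is fitted over several candidate masks, and the mask with the smallest fitting error supplies the filtered value, so smoothing does not cross an edge (the window sizes and polynomial degree are placeholders, and the 2-D case is not shown):

    import numpy as np
    from numpy.polynomial import legendre as L

    def adaptive_legendre_filter(signal, half=5, deg=2):
        out = np.copy(signal)
        n = len(signal)
        for i in range(n):
            best_err, best_val = np.inf, signal[i]
            for lo, hi in [(i - 2 * half, i), (i - half, i + half), (i, i + 2 * half)]:
                lo, hi = max(lo, 0), min(hi, n - 1)
                if hi - lo < deg + 1:
                    continue
                x = np.linspace(-1.0, 1.0, hi - lo + 1)
                coeffs = L.legfit(x, signal[lo:hi + 1], deg)
                fit = L.legval(x, coeffs)
                err = np.mean((fit - signal[lo:hi + 1]) ** 2)      # fitting error of this mask
                if err < best_err:
                    best_err, best_val = err, fit[i - lo]
            out[i] = best_val
        return out

    rng = np.random.default_rng(0)
    step = np.where(np.arange(200) < 100, 0.0, 1.0) + rng.normal(0.0, 0.1, 200)
    smoothed = adaptive_legendre_filter(step)          # noise suppressed, edge at sample 100 preserved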
Artificial intelligence (AI) / machine learning (ML) applications are widely available across domains such as commercial, industrial, and intelligence applications. In particular, the use of AI applications in the security environment requires standards to manage expectations and help users understand how results were derived. A reliance on "black boxes" to generate predictions and inform decisions could lead to errors of analysis. This paper explores the development of potential standards designed for each stage of the development of an AI/ML system to help enable trust, transparency, and explainability. Specifically, the paper utilizes the standards outlined in Intelligence Community Directive 203 (Analytic Standards) to hold machine outputs to the same rigorous accountability standards as are applied to humans. Building on ICD 203, the Multi-Source AI Scorecard Table (MAST) was developed to support the community in the test and evaluation of AI/ML techniques. The paper discusses using MAST to rate a semantic processing tool for processing noisy, unstructured, and complex microtext in the form of streaming chat for video call-outs. The scoring is notional, but it provides a discussion of how MAST could be used as a standard, complementary to datasheets and model cards, for comparing AI/ML methods.