Depth super-resolution is becoming popular in computer vision, and most test data come from indoor data sets with ground-truth measurements, such as Middlebury. However, indoor data sets are mainly acquired with structured-light techniques under ideal conditions, which cannot represent the real world under natural light. Unlike indoor scenes, the uncontrolled outdoor environment is far more complicated and is rich in both visual and depth texture. For that reason, we develop a more challenging and meaningful outdoor benchmark for depth super-resolution using a state-of-the-art active laser scanning system.
Automatic vehicle detection from aerial images is an emerging topic, driven by the strong demand for large-area traffic monitoring. In this paper, we present a novel framework for automatic vehicle detection from aerial images. Through superpixel segmentation, we first segment the aerial images into homogeneous patches, which serve as the basic processing units of the detection to improve efficiency. By introducing sparse representation into our method, powerful classification ability is achieved after dictionary training. To describe a patch effectively, the Histogram of Oriented Gradients (HOG) is used. We further propose to integrate color information, using the color name feature, to enrich the representation. The final feature concatenates the HOG and color name histograms, yielding a strong patch descriptor. Experimental results demonstrate the effectiveness and robust performance of the proposed algorithm for vehicle detection from aerial images.
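As a rough sketch of how such a patch descriptor can be assembled, the following pure-Python snippet builds an unnormalized gradient-orientation histogram (a simplified stand-in for HOG) and a nearest-prototype color histogram (a crude analogue of the color name feature), then concatenates them. The function names, bin counts, and palette are illustrative assumptions, not the paper's implementation.

```python
import math

def orientation_histogram(gray, bins=9):
    """Histogram of gradient orientations for a grayscale patch
    (a simplified, unnormalized stand-in for a HOG cell)."""
    h, w = len(gray), len(gray[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]   # central differences
            gy = gray[y + 1][x] - gray[y - 1][x]
            mag = math.hypot(gx, gy)
            ang = math.atan2(gy, gx) % math.pi     # unsigned orientation
            hist[min(int(ang / math.pi * bins), bins - 1)] += mag
    return hist

def color_histogram(pixels, palette):
    """Assign each RGB pixel to its nearest color prototype and count
    occurrences (a crude color-name-style histogram)."""
    hist = [0] * len(palette)
    for p in pixels:
        d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in palette]
        hist[d.index(min(d))] += 1
    return hist

def patch_descriptor(gray, pixels, palette, bins=9):
    """Concatenate gradient and color histograms into one patch feature."""
    return orientation_histogram(gray, bins) + color_histogram(pixels, palette)
```

A real pipeline would normalize the HOG blocks and use the learned color name mapping; this sketch only shows the concatenation idea.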
Hyperspectral images have high-dimensional spectral–spatial features that carry noisy and redundant information, and redundant features can have a significant adverse effect on learning performance. Efficient and robust feature selection methods should therefore make the best of both labeled and unlabeled points to extract meaningful features and eliminate noisy ones. On the other hand, obtaining sufficient accurately labeled data is either impossible or expensive. To take advantage of both precious labeled data points and unlabeled data points, we propose a new semisupervised feature selection method in this paper. First, we use labeled points to enlarge the margin between data points from different classes; second, we use unlabeled points to find the local structure of the data space; finally, we compare our proposed algorithm with the Fisher score, PCA, and the Laplacian score on HSI classification. Experimental results on benchmark hyperspectral data sets demonstrate the efficiency and effectiveness of our proposed algorithm.
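Since the method is compared against the Fisher score, a minimal pure-Python version of that baseline may help fix ideas. This is the standard textbook definition (between-class over within-class variance per feature, using labeled samples only), not the proposed semisupervised algorithm.

```python
def fisher_score(X, y):
    """Fisher score per feature: between-class variance divided by
    within-class variance. Higher scores mean more discriminative
    features. X is a list of feature rows, y the class labels."""
    classes = sorted(set(y))
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]  # global mean
    scores = []
    for j in range(d):
        between = within = 0.0
        for c in classes:
            vals = [X[i][j] for i in range(n) if y[i] == c]
            mu_c = sum(vals) / len(vals)
            between += len(vals) * (mu_c - mean[j]) ** 2
            within += sum((v - mu_c) ** 2 for v in vals)
        scores.append(between / within if within > 0 else float('inf'))
    return scores
```

Features are then ranked by score and the top ones kept; the semisupervised method in the abstract additionally uses unlabeled points, which the Fisher score ignores.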
The information of individual trees plays an important role in urban surveying and mapping. With the development of Light Detection and Ranging (LiDAR) technology, 3-Dimensional (3D) structures of trees can be captured as point clouds with high spatial resolution and accuracy. Individual tree segmentation is used to derive tree structural attributes such as tree height, crown diameter, and stem position. In this study, a framework is proposed to take advantage of the detailed crown structures represented in mobile laser scanning (MLS) data. The framework consists of five steps: (1) automatically detect and remove ground points using RANSAC; (2) compress all above-ground points to an image grid while preserving the 3D information; (3) simplify and remove unqualified grid cells; (4) find tree peaks using a heuristic searching method; (5) delineate individual tree crowns by applying a modified watershed method. In an experiment on point clouds from Xiamen Island, China, individual tree crowns are successfully extracted from the MLS data.
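Step (1), ground removal with RANSAC, can be sketched as follows. This is a generic RANSAC plane fit in pure Python; the iteration count and inlier tolerance are illustrative parameters, not values from the paper.

```python
import random

def plane_from_points(p1, p2, p3):
    """Unit-normal plane (a, b, c, d) with ax + by + cz + d = 0
    through three points; returns None for degenerate triples."""
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = sum(c * c for c in n) ** 0.5
    if norm == 0:
        return None                      # collinear sample, skip
    a, b, c = (comp / norm for comp in n)
    return a, b, c, -(a * p1[0] + b * p1[1] + c * p1[2])

def ransac_ground(points, iters=200, tol=0.05, seed=0):
    """RANSAC: repeatedly fit a plane to 3 random points, keep the one
    with the most inliers; returns (ground, non_ground) index lists."""
    rng = random.Random(seed)
    best = []
    for _ in range(iters):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = [i for i, p in enumerate(points)
                   if abs(a * p[0] + b * p[1] + c * p[2] + d) < tol]
        if len(inliers) > len(best):
            best = inliers
    ground = set(best)
    return best, [i for i in range(len(points)) if i not in ground]
```

Production pipelines often constrain the plane normal to be near-vertical so a building facade cannot win the vote; that refinement is omitted here.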
Traffic signs are important roadway assets that provide valuable information to help drivers drive more safely and easily. With the development of mobile mapping systems that can efficiently acquire dense point clouds along the road, automated detection and recognition of road assets has become an important research issue. This paper deals with the detection and classification of traffic signs in outdoor environments using mobile light detection and ranging (LiDAR) and inertial navigation technologies. The proposed method contains two main steps. It starts with an initial detection of traffic signs based on the intensity attributes of the point clouds, since traffic signs are always painted with highly reflective materials. The classification of traffic signs is then achieved based on geometric shape and pairwise 3D shape context. Results and performance analyses are provided to show the effectiveness and limits of the proposed method. The experimental results demonstrate the feasibility and effectiveness of the proposed method in detecting and classifying traffic signs from mobile LiDAR point clouds.
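The first step, intensity-based candidate detection, might look roughly like the sketch below: keep highly reflective points and group them on a coarse 3D grid. The intensity threshold, cell size, and grid-based grouping are illustrative simplifications assumed here; the paper's actual clustering is not specified in the abstract.

```python
from collections import defaultdict

def detect_sign_candidates(points, intensity_thresh=200, cell=1.0, min_pts=3):
    """Keep highly reflective points (retro-reflective sign paint) and
    group them into candidate clusters by snapping to a coarse 3D grid,
    a simple stand-in for a proper clustering step.
    points: iterable of (x, y, z, intensity)."""
    cells = defaultdict(list)
    for x, y, z, intensity in points:
        if intensity >= intensity_thresh:
            key = (int(x // cell), int(y // cell), int(z // cell))
            cells[key].append((x, y, z))
    # Discard tiny groups, which are likely isolated bright returns.
    return [pts for pts in cells.values() if len(pts) >= min_pts]
```

Each surviving candidate cluster would then be passed to the shape-based classification stage.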
To achieve highly precise camera calibration against complex backgrounds, a novel planar composite target design and the corresponding automatic extraction algorithm are presented. Unlike other commonly used target designs, the proposed target simultaneously encodes the feature point coordinates and the feature point serial numbers. Based on the original target, templates are then prepared by three geometric transformations and used as the input of template matching based on shape context. Finally, parity check and region growing are used to extract the target as the final result. The experimental results show that the proposed method for automatic extraction and recognition of the target is effective, accurate, and reliable.
In this paper, a new unsupervised change detection method is proposed by modeling a multi-scale change detector based on local mixed information, together with an automated threshold selection method. A theoretical analysis demonstrates that integrating multi-scale information takes more comprehensive information into account. The ROC curves show that the change detector based on multi-scale mixed information (MSM) is more effective than the one based on mixed information (MIX). Experiments on artificial and real-world datasets indicate that multi-scale change detection with mixed information can eliminate pseudo-change areas. Therefore, the proposed MSM algorithm is an effective method for change detection applications.
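One common way to automate the threshold, shown here purely for illustration, is Otsu's method applied to the histogram of change magnitudes: pick the cut that maximizes between-class variance. The paper's own threshold scheme may differ.

```python
def otsu_threshold(values, bins=64):
    """Otsu's automated threshold: histogram the values, then choose
    the cut that maximizes the between-class variance of the two
    resulting groups (e.g. unchanged vs. changed pixels)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values)
    # Weighted sum of all bin centers, used to get the upper-class mean.
    sum_all = sum((lo + (i + 0.5) * width) * h for i, h in enumerate(hist))
    w0 = sum0 = 0.0
    best_t, best_var = lo, -1.0
    for i in range(bins - 1):
        w0 += hist[i]
        if w0 == 0 or w0 == total:
            continue
        sum0 += (lo + (i + 0.5) * width) * hist[i]
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / (total - w0)
        var = w0 * (total - w0) * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, lo + (i + 1) * width
    return best_t
```

Pixels whose change magnitude exceeds the returned threshold are labeled as changed.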
KEYWORDS: Clouds, Feature selection, Feature extraction, Cameras, Principal component analysis, Associative arrays, Data modeling, 3D modeling, Data processing, Data acquisition
Owing to the complexity of indoor environments, with close range, multiple viewing angles, occlusion, uneven lighting, and a lack of absolute positioning information, quality assessment of indoor mobile mapping point clouds is a tough and challenging task. It is meaningful to evaluate the features extracted from indoor point clouds prior to further quality assessment. In this paper, we focus on feature extraction from indoor RGB-D camera data for point cloud quality assessment: local features are selected and screened using the random forest algorithm to find the optimal features for the subsequent quality assessment step. First, we collect indoor point cloud data and classify the clouds as complete or incomplete. Then, we extract high-dimensional features from the input point cloud data. Afterwards, we select discriminative features through random forest. Experimental results on different classes demonstrate the effective and promising performance of the presented method for point cloud quality assessment.
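To illustrate the idea of ranking features by a forest-style importance score, here is a deliberately simplified stand-in that averages the best single-feature Gini gain over bootstrap resamples. A real random forest, as used in the paper, grows full trees with random feature subsets and accumulates impurity decreases across all nodes; this sketch only conveys the ranking principle.

```python
import random

def gini(labels):
    """Gini impurity of a label list."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split_gain(X, y, j):
    """Best Gini impurity decrease achievable by thresholding feature j."""
    parent = gini(y)
    best = 0.0
    for t in sorted({row[j] for row in X}):
        left = [y[i] for i, row in enumerate(X) if row[j] <= t]
        right = [y[i] for i, row in enumerate(X) if row[j] > t]
        if not left or not right:
            continue
        child = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        best = max(best, parent - child)
    return best

def forest_importance(X, y, n_trees=25, seed=0):
    """Stump-forest importance: average each feature's best-split Gini
    gain over bootstrap resamples. Features with consistently high
    gain are the discriminative ones to keep."""
    rng = random.Random(seed)
    d = len(X[0])
    imp = [0.0] * d
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        for j in range(d):
            imp[j] += best_split_gain(Xb, yb, j) / n_trees
    return imp
```

Features would then be sorted by importance and the top-ranked subset fed to the quality assessment stage.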
This paper proposes a 3D line segment based registration method for terrestrial laser scanning (TLS) data. The 3D line segments are adopted to describe the point cloud data and reduce its geometric complexity. We then introduce a framework for registration and demonstrate the accuracy of our method for rigid transformations on terrestrial laser scanning point clouds.
Detecting oil spills in the open sea from Synthetic Aperture Radar (SAR) images is very important work. One of the key issues is to distinguish oil spill from “look-alike”. Many existing methods tackle this issue, including supervised and semi-supervised learning. Recent years have witnessed a surge of interest in hypergraph-based transductive classification. This paper proposes combinative hypergraph learning (CHL) to distinguish oil spill from “look-alike”. CHL captures the similarity between two samples in the same category by adding sparse hypergraph learning to conventional hypergraph learning. Experimental results demonstrate the effectiveness of CHL in comparison with state-of-the-art methods and show that the proposed method is promising.
Here, we present a novel object of interest (OOI) extraction framework designed for low-frame-rate (LFR) image sequences, typically from mobile mapping systems (MMS). The proposed method integrates tracking and segmentation in a unified framework. We propose a novel object-shaped kernel-based scale-invariant mean shift algorithm to track the OOI through the LFR sequences and keep temporal consistency. The well-known GrabCut approach for static image segmentation is then generalized to LFR sequences. We analyze the imaging geometry of the OOI in LFR sequences collected by the MMS and design a Kalman filter module to assist the proposed tracker. Extensive experimental results on real LFR sequences collected by the VISAT™ MMS demonstrate that the proposed approach is robust to challenges such as low frame rate, fast scaling, and large inter-frame displacement of the OOI.
We present a hybrid generative-discriminative learning method for human action recognition from video sequences. Our model combines a bag-of-words component with supervised latent topic models. A video sequence is represented as a collection of spatiotemporal words by extracting space-time interest points and describing these points using both shape and motion cues. The supervised latent Dirichlet allocation (sLDA) topic model, which employs discriminative learning on labeled data under a generative framework, is introduced to discover the latent topic structure that is most relevant to action categorization. The proposed algorithm retains most of the desirable properties of generative learning while increasing the classification performance through a discriminative setting. It has also been extended to exploit both labeled and unlabeled data to learn human actions under a unified framework. We test our algorithm on three challenging data sets: the KTH human motion data set, the Weizmann human action data set, and a ballet data set. Our results are either comparable to or significantly better than previously published results on these data sets and reflect the promise of hybrid generative-discriminative learning approaches.
An unsupervised learning algorithm based on topic models is presented for lane detection in video sequences observed by uncalibrated moving cameras. Our contributions are twofold. First, we introduce the maximally stable extremal region (MSER) detector for lane-marking feature extraction and derive a novel affine-invariant shape descriptor to describe region shapes, together with a modified scale-invariant feature transform descriptor to capture feature appearance characteristics. MSER features are more stable than edge points or line pairs and hence provide robustness to lane-marking variations in scale, lighting, viewpoint, and shadows. Second, we propose a novel location-enhanced probabilistic latent semantic analysis (pLSA) topic model for simultaneous lane recognition and localization. The proposed model overcomes the limitation of the pLSA model for effective topic localization. Experimental results on traffic sequences in various scenarios demonstrate the effectiveness and robustness of the proposed method.
We present a novel unsupervised learning algorithm for discovering objects and their location in videos from moving cameras. The videos can switch between different shots, and contain cluttered background, occlusion, camera motion, and multiple independently moving objects. We exploit both appearance consistency and spatial configuration consistency of local patches across frames for object recognition and localization. The contributions of this paper are twofold. First, we propose a combined approach for simultaneous spatial context and temporal context generation. Local video patches are extracted and described using the generated spatial-temporal context words. Second, a dynamic topic model, based on the representation of a bag of spatial-temporal context words, is introduced to learn object category models in video sequences. The proposed model can categorize and localize multiple objects in a single video. Objects leaving or entering the scene at multiple times can also be handled efficiently in the dynamic framework. Experimental results on the CamVid data set and the VISAT™ data set demonstrate the effectiveness and robustness of the proposed method.
A scale-invariant feature transform (SIFT)-based particle filter algorithm is presented for joint detection and tracking of independently moving objects in stereo sequences observed by uncalibrated moving cameras. The major steps include feature detection and matching, moving object detection based on multiview geometric constraints, and tracking based on a particle filter. Our first contribution is a novel closed-loop mapping (CLM) multiview matching scheme for stereo matching and motion tracking. CLM outperforms several state-of-the-art SIFT matching methods in terms of density and reliability of feature correspondences. Our second contribution is a multiview epipolar constraint, derived from the relative camera positions in pairs of consecutive stereo views, for independent motion detection. The multiview epipolar constraint is able to detect moving objects followed by moving cameras in the same direction, a configuration where the standard epipolar constraint fails. Our third contribution is a variable-dimensional particle filter for joint detection and tracking of independently moving objects. Multiple moving objects entering or leaving the field of view are handled effectively within the proposed framework. Experimental results on real-world stereo sequences demonstrate the effectiveness and robustness of our method.
This paper presents a line extraction algorithm for SAR (Synthetic Aperture Radar) images. The algorithm is designed based on the statistical characteristics of speckle in SAR images and involves three steps. First, a new edge detector, which combines the Canny operator and the Ratio operator, is used to detect edge points and calculate their directions. The edge points are then grouped according to their edge direction to form initial lines. Finally, a high-level grouping step connects the fragmental lines. The proposed edge operator is CFAR (Constant False Alarm Rate) and prevents lines from splitting. The algorithm has been applied to X-band airborne SAR images, and the results are presented at the end of this paper.
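The Ratio operator mentioned above compares mean amplitudes on the two sides of a candidate edge; because the statistic is a ratio, its false-alarm rate is independent of the local mean under multiplicative speckle, which is what gives the detector its CFAR property. A one-dimensional sketch, with an illustrative window size (the paper's 2D operator also estimates edge direction):

```python
def ratio_edge_strength(signal, half=3):
    """1D ratio-of-means edge detector: at each position, take the mean
    amplitude of the `half` samples on each side and return
    min(m1/m2, m2/m1). Values near 1 mean homogeneous speckle;
    values near 0 mean a strong edge."""
    out = [1.0] * len(signal)
    for i in range(half, len(signal) - half):
        m1 = sum(signal[i - half:i]) / half
        m2 = sum(signal[i + 1:i + 1 + half]) / half
        if m1 > 0 and m2 > 0:
            out[i] = min(m1 / m2, m2 / m1)
    return out
```

Thresholding this ratio at a fixed value yields a constant false-alarm probability regardless of how bright the local region is, unlike a plain difference-of-means detector.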
Due to the complex signal-dependent nature of speckle in SAR images, it is more reasonable to use different speckle descriptions and despeckling filters for different kinds of regions. A multi-description despeckling approach is presented in this paper. Based on local statistics, the χ² test is introduced to segment the SAR image into Gamma-distributed homogeneous regions and more fluctuant heterogeneous regions. A MAP filter is then used in the homogeneous regions, and a modified median filter in the heterogeneous regions. X-band airborne SAR images and synthetic images are used for illustration and comparison.