This PDF file contains the front matter associated with SPIE Proceedings Volume 10615, including the Title Page, Copyright information, Table of Contents, Introduction (if any), and Conference Committee listing.
This paper presents a model for airport detection using region-based fully convolutional neural networks. To achieve fast detection with high accuracy, we share the convolutional layers between the region proposal procedure and the airport detection procedure and use graphics processing units (GPUs) to speed up training and testing. Because labeled data are scarce, we initialize the shared convolutional layers with those of a ZF net pretrained on ImageNet, then retrain the model using the alternating optimization training strategy. The proposed model has been tested on an airport dataset consisting of 600 images. Experiments show that the proposed method can distinguish airports in our dataset from similar background scenes in near real time and with high accuracy, performing much better than traditional methods.
Effectively and accurately locating pedestrian candidates in an image is a key task for an infrared pedestrian detection system. In this work, a novel similarity metric is designed. Within the selective search scheme, the developed metric is used to generate possible locations for pedestrian candidates. In addition, corresponding diversification strategies are provided according to the characteristics of the infrared thermal imaging system. Experimental results indicate that the presented scheme produces more efficient outputs than the traditional selective search methodology for the infrared pedestrian detection task.
Object tracking in video sequences has broad applications in both military and civilian domains. However, as the length of the input video sequence increases, a number of problems arise, such as severe object occlusion, object appearance variation, and the object going out of view (some portion or the entire object leaves the image space). To deal with these problems and identify the tracked object against a cluttered background, we present a robust appearance model using Speeded Up Robust Features (SURF) and advanced integrated features consisting of Felzenszwalb's Histogram of Oriented Gradients (FHOG) and color attributes. Since re-detection is essential in long-term tracking, we develop an effective object re-detection strategy based on moving area detection. We employ the popular kernelized correlation filters in our algorithm design, which facilitates high-speed object tracking. Our evaluation using the CVPR2013 Object Tracking Benchmark (OTB2013) dataset illustrates that the proposed algorithm outperforms reference state-of-the-art trackers in various challenging scenarios.
In this paper, a fuzzy-based vehicle tracking system is proposed. The proposed system consists of two main processes: vehicle detection and vehicle tracking. In the first process, the Gradient-based Adaptive Threshold Estimation (GATE) algorithm is adopted to provide a suitable threshold value for Sobel edge detection. The estimated threshold adapts to the changing illumination conditions throughout the day, which leads to better vehicle detection performance than a fixed user-defined threshold. In the second process, this paper proposes a novel vehicle tracking algorithm, namely Fuzzy-based Vehicle Analysis (FBA), to reduce false estimates in vehicle tracking caused by the uneven edges of large vehicles and by vehicles changing lanes. The proposed FBA algorithm employs the average edge density and the Horizontal Moving Edge Detection (HMED) algorithm to alleviate those problems, using fuzzy rule-based algorithms to rectify the vehicle tracking. The experimental results demonstrate that the proposed system provides a high vehicle detection accuracy of about 98.22%. In addition, it offers a low false detection rate of about 3.92%.
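To make the idea of thresholding a Sobel edge map concrete, here is a minimal sketch. The abstract does not describe how GATE computes its threshold, so the mean gradient magnitude is used as a clearly labeled stand-in; the function name `sobel_edges` and the list-of-rows image format are assumptions made for illustration only.

```python
def sobel_edges(image):
    """Return (edge_mask, threshold) for a grayscale image given as a list of rows.

    The threshold is the mean gradient magnitude of the image. This is only a
    stand-in for an adaptive rule; it is NOT the GATE algorithm from the paper.
    """
    h, w = len(image), len(image[0])
    magnitude = [[0.0] * w for _ in range(h)]
    # Border pixels are left at zero because the 3x3 kernel does not fit there.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            magnitude[y][x] = (gx * gx + gy * gy) ** 0.5
    # Image-dependent threshold: brighter or higher-contrast scenes raise it.
    threshold = sum(sum(row) for row in magnitude) / (h * w)
    mask = [[m > threshold for m in row] for row in magnitude]
    return mask, threshold


# A dark region next to a bright region: the edge lies between columns 2 and 3.
step_image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
edge_mask, used_threshold = sobel_edges(step_image)
```

Because the threshold is derived from the image itself, the same code responds to a change in overall contrast without a hand-tuned constant, which is the property the abstract attributes to an adaptive threshold.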
To keep pace with the rapid development of real applications for infrared small targets, researchers have pursued more robust detection methods. At present, contrast measure-based methods have become a promising research branch. Following this framework, this paper proposes a speeded-up contrast measure scheme based on saliency detection and density clustering. First, the salient region is segmented by a saliency detection method, and the multi-scale contrast calculation is then carried out on that region instead of traversing the whole image. Second, the spatial “integrity” property of the target is exploited to distinguish it from isolated noise by density clustering. Finally, the targets are detected by a self-adaptive threshold. Compared with the time-consuming MPCM (Multiscale Patch Contrast Map), the time cost of the speeded-up version is within a few seconds. Additionally, owing to the use of “clustering segmentation”, false alarms caused by heavy noise can be suppressed to a lower level. The experiments show that our method achieves a satisfactory FASR (false alarm suppression ratio) and real-time performance compared with state-of-the-art algorithms, against both cloudy-sky and sea-sky backgrounds.
Fabric defect detection plays an important role in improving the quality of fabric products. In this paper, a novel fabric defect detection method based on visual saliency using deep features and low-rank recovery is proposed. First, the initial network parameters are obtained by unsupervised training on the large MNIST dataset. Supervised fine-tuning on a fabric image library is then implemented with Convolutional Neural Networks (CNNs), yielding a more accurate deep neural network model. Second, the fabric images are uniformly divided into image blocks of the same size, and their multi-layer deep features are extracted using the trained deep network. Thereafter, all the extracted features are concatenated into a feature matrix. Third, low-rank matrix recovery is adopted to decompose the feature matrix into a low-rank matrix, which represents the background, and a sparse matrix, which represents the salient defects. Finally, the iterative optimal threshold segmentation algorithm is used to segment the saliency maps generated from the sparse matrix and locate the fabric defect area. Experimental results demonstrate that the features extracted by the CNN are more suitable for characterizing fabric texture than traditional hand-crafted features such as LBP and HOG, and that the proposed method can accurately detect the defect regions of various fabric defects, even in images with complex texture.
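The final segmentation step can be illustrated with a short sketch. It assumes that "iterative optimal threshold segmentation" refers to the classic iterative intermeans scheme (start at the global mean, then repeatedly move the threshold to the midpoint of the two class means); the abstract does not confirm this, and the function name `iterative_threshold` is hypothetical.

```python
def iterative_threshold(values, tol=1e-6, max_iter=100):
    """Iterative intermeans threshold for a flat list of saliency values.

    Assumed variant: the threshold starts at the global mean and is moved to the
    midpoint of the below-threshold and above-threshold means until it settles.
    """
    t = sum(values) / len(values)
    for _ in range(max_iter):
        low = [v for v in values if v <= t]
        high = [v for v in values if v > t]
        if not low or not high:
            break  # all values fall on one side; keep the current threshold
        new_t = 0.5 * (sum(low) / len(low) + sum(high) / len(high))
        converged = abs(new_t - t) < tol
        t = new_t
        if converged:
            break
    return t


# Mostly low background saliency with two strongly salient (defect) responses.
saliency_values = [0.10, 0.12, 0.09, 0.11, 0.90, 0.95]
threshold = iterative_threshold(saliency_values)
defect_flags = [v > threshold for v in saliency_values]
```

The threshold settles between the background cluster and the defect cluster without any manually chosen cut-off, so only the two high responses are flagged.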
In vehicle driver assistance systems, the accuracy and speed of lane line detection are of the utmost importance. The method in this paper is based on a color probability model and the Fuzzy Local Information C-Means (FLICM) clustering algorithm. The Hough transform and the constraints of structured roads are used to detect the lane lines accurately. The global map of the lane line is drawn from the fitted lane curve equation. The experimental results show that the algorithm has good robustness.
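As a rough illustration of the Hough transform step, the sketch below votes edge points into a (rho, theta) accumulator and returns the strongest line. It is a generic textbook version, not the authors' pipeline: the function name `hough_peak`, the 1-degree angular resolution and the integer rounding of rho are all assumptions, and the structured-road constraints mentioned in the abstract are not modeled.

```python
import math
from collections import Counter


def hough_peak(points, theta_step_deg=1):
    """Return (rho, theta_deg, votes) of the strongest line through the points.

    Lines are parameterized as x*cos(theta) + y*sin(theta) = rho, with theta
    sampled in [0, 180) degrees and rho rounded to the nearest integer.
    """
    votes = Counter()
    for x, y in points:
        for theta_deg in range(0, 180, theta_step_deg):
            theta = math.radians(theta_deg)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, theta_deg)] += 1
    (rho, theta_deg), count = votes.most_common(1)[0]
    return rho, theta_deg, count


# Fifty collinear edge points on the horizontal line y = 5, plus two outliers.
edge_points = [(x, 5) for x in range(50)] + [(3, 20), (40, 1)]
rho, theta_deg, count = hough_peak(edge_points)
```

The outliers vote for other accumulator cells, so they do not disturb the peak; in a lane detector, a constraint such as an allowed range of theta would be applied before picking the peak.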
Object classification and object detection are widely used in many areas of computer vision. Traditional object detection has two main problems: first, the sliding-window region selection strategy has high time complexity and produces redundant windows; second, the features are not robust. To solve these problems, a Region Proposal Network (RPN) is used to select candidate regions instead of the selective search algorithm. Compared with traditional algorithms and selective search algorithms, the RPN has higher efficiency and accuracy. We combine HOG features and a convolutional neural network (CNN) to extract features, and we use an SVM for classification. For TorontoNet, our algorithm's mAP is 1.6 percentage points higher. For OxfordNet, our algorithm's mAP is 1.3 percentage points higher.
To effectively detect defects in fabric images with complex texture, this paper proposes a novel detection algorithm based on an end-to-end convolutional neural network. First, proposal regions are generated by an RPN (Region Proposal Network). Then, the Fast Region-based Convolutional Network method (Fast R-CNN) is adopted to determine whether each proposal region extracted by the RPN is a defect. Finally, Soft-NMS (non-maximum suppression) and data augmentation strategies are used to improve the detection precision. Experimental results demonstrate that the proposed method can locate fabric defect regions with higher accuracy than the state of the art and adapts better to all kinds of fabric images.
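For readers unfamiliar with Soft-NMS, the following sketch shows the Gaussian-decay variant: instead of discarding boxes that overlap a higher-scoring box, it lowers their scores in proportion to the overlap. The abstract does not say which variant or parameters the authors used, so `sigma`, `score_threshold` and the `(x1, y1, x2, y2)` box format are assumptions.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS. Returns [(box_index, final_score), ...] in pick order."""
    remaining = [(i, s) for i, s in enumerate(scores) if s >= score_threshold]
    kept = []
    while remaining:
        best = max(remaining, key=lambda pair: pair[1])
        remaining.remove(best)
        kept.append(best)
        decayed = []
        for index, score in remaining:
            overlap = iou(boxes[best[0]], boxes[index])
            # Heavier overlap with the selected box means a stronger penalty.
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_threshold:
                decayed.append((index, score))
        remaining = decayed
    return kept


candidate_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
candidate_scores = [0.9, 0.8, 0.7]
selected = soft_nms(candidate_boxes, candidate_scores)
```

In this example the second box overlaps the first heavily, so its score is reduced and it is ranked after the distant third box, but it is not deleted outright as hard NMS would do.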
We investigate a new approach for improving the localization accuracy of detected vehicles for object detection in advanced driver assistance systems (ADAS). Specifically, we implement bounding box refinement as a post-processing step for state-of-the-art object detectors (Faster R-CNN, YOLOv2, etc.). The bounding box refinement is achieved by individually adjusting each border of the detected bounding box to its target location using a regression method. We train the regressor on HOG features, which perform well on the edge detection of vehicles; the regressor is independent of the CNN-based object detectors. Experimental results on the KITTI 2012 benchmark show that we can achieve up to 6% improvement over the YOLOv2 and Faster R-CNN object detectors at an IoU threshold of 0.8. Also, the proposed refinement framework is computationally light, processing one bounding box within a few milliseconds on a CPU. Further, this refinement method can be added to any object detector, especially those with high speed but lower accuracy.
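The effect of per-border refinement on a strict IoU threshold can be sketched as follows. The offsets below are made-up numbers standing in for the output of a border regressor; the HOG-based regressor itself is not reproduced, and the names `iou` and `refine_box` are illustrative.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def refine_box(box, offsets):
    """Shift each border (x1, y1, x2, y2) independently by its own offset."""
    return tuple(border + delta for border, delta in zip(box, offsets))


ground_truth = (10, 10, 50, 40)
detection = (14, 12, 50, 46)        # loose box from a fast detector
border_offsets = (-3, -1, 0, -5)    # hypothetical regressor output, one per border

refined = refine_box(detection, border_offsets)
before = iou(detection, ground_truth)
after = iou(refined, ground_truth)
```

A detection that would be counted as a miss at an IoU threshold of 0.8 can become a hit after only a few pixels of border adjustment, which is why a lightweight post-processing step can matter at strict thresholds.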
In this study, we propose a pixel correspondence algorithm for positioning in crowds based on constraints on the distance between lines of sight, grayscale differences, and height in a world coordinate system. First, a Gaussian mixture model is used to obtain the background and foreground from multi-camera videos. Second, the hair and skin regions are extracted as regions of interest. Finally, the correspondences between pixels in the regions of interest are found under multiple constraints and the targets are positioned by pixel clustering. The algorithm can provide appropriate redundancy information for each target, which decreases the risk of losing targets due to a large viewing angle and wide baseline. To address the correspondence problem for multiple pixels, we construct a pixel-based correspondence model based on a similar permutation matrix, which converts the correspondence problem into a linear programming problem where a similar permutation matrix is found by minimizing an objective function. The correct pixel correspondences can be obtained by determining the optimal solution of this linear programming problem, and the three-dimensional position of the targets can also be obtained by pixel clustering. We verified the algorithm with multiple cameras in experiments, which showed that the algorithm has high accuracy and robustness.
Person detection, tracking and following are key enabling technologies for mobile robots in many human–robot interaction applications. In this article, we present a system composed of visual human detection, video tracking and following. The detection is based on YOLO (You Only Look Once), which applies a single convolutional neural network (CNN) to the full image and can thus predict bounding boxes and class probabilities directly in one evaluation. The bounding box then provides the initial position of the person in the image to initialize and train the KCF (Kernelized Correlation Filter), a video tracker based on a discriminative classifier. Finally, a stereo 3D sparse reconstruction algorithm not only determines the position of the person in the scene but also elegantly solves the problem of scale ambiguity in the video tracker. Extensive experiments are conducted to demonstrate the effectiveness and robustness of our human detection and tracking system.
Robust object tracking is a challenging task in computer vision due to disturbances such as deformation, fast motion and, especially, occlusion of the tracked object. When occlusions occur, the image data become unreliable and insufficient for the tracker to depict the object of interest. Therefore, most trackers are prone to fail under occlusion. In this paper, an occlusion judgement and handling method based on segmentation of the target is proposed. If the target is occluded, its speed and direction must differ from those of the objects occluding it. Hence, motion features are emphasized. Considering its efficiency and robustness, Kernelized Correlation Filter (KCF) tracking is adopted as a pre-tracker to obtain a predicted position of the target. By analyzing long-term motion cues of objects around this position, the tracked object is labelled, so occlusion can be detected easily. Experimental results suggest that our tracker achieves favorable performance and effectively handles occlusion and drifting problems.
Existing salient object detection models can only detect the approximate location of a salient object, or they highlight the background. To resolve these problems, a salient object detection method based on image semantic features is proposed. First, three novel salient features are presented: an object edge density feature (EF), an object semantic feature based on the convex hull (CF) and an object lightness contrast feature (LF). Second, the multiple salient features are trained with random detection windows. Third, a naive Bayesian model is used to combine these features for salient object detection. The results on public datasets show that our method performs well: the salient object can be located, accurately detected and marked by a specific window.
Character detection in natural scenes is susceptible to factors such as complex backgrounds, variable viewing angles and diverse languages, which leads to poor detection results. To address these problems, a new text detection method is proposed, consisting of two main stages: candidate region extraction and text region detection. In the first stage, the method applies multiple scale transformations of the original image and multiple thresholds of maximally stable extremal regions (MSER) so that character regions are detected comprehensively. In the second stage, stroke width transform (SWT) maps are computed for the candidate regions with the SWT algorithm, and cascaded classifiers are then used to filter out non-text regions. The proposed method was evaluated on the standard ICDAR2011 benchmark dataset and on a dataset we built ourselves. The experimental results show that the proposed method is greatly improved compared with other text detection methods.
This paper aims to resolve tracking failures caused by target occlusion and by interference from background objects similar to the target, and to reduce the influence of light intensity. The HSV and YCbCr color channels are used to correct the updated center of the target, and the image threshold is continuously updated so that target detection is self-adaptive. Clustering gives a rough initial range, which narrows the threshold range so that as much of the target as possible is detected. To improve the accuracy of the detector, a Kalman filter is added to estimate the target state region. A direction predictor based on the Markov model is added to estimate the target state under background color interference and to enhance the detector's ability to distinguish similar objects. The experimental results show that the improved algorithm is more accurate and processes faster.
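The role of the Kalman filter in such a tracker can be sketched with a one-dimensional constant-velocity model: the filter predicts where the target will be in the next frame and corrects that prediction with each new measurement. The abstract does not give the state model or noise settings, so the class name `ConstantVelocityKalman`, the two-element state (position, velocity) and the values of `q`, `r` and the initial variance are all assumptions.

```python
class ConstantVelocityKalman:
    """1-D constant-velocity Kalman filter with state (position p, velocity v)."""

    def __init__(self, q=0.01, r=1.0, initial_variance=100.0):
        self.p, self.v = 0.0, 0.0
        # State covariance entries [[p00, p01], [p10, p11]].
        self.p00, self.p01 = initial_variance, 0.0
        self.p10, self.p11 = 0.0, initial_variance
        self.q, self.r = q, r  # process and measurement noise variances

    def predict(self, dt=1.0):
        """Propagate state and covariance one step: P = F P F^T + q*I."""
        self.p += self.v * dt
        p00 = self.p00 + dt * (self.p10 + self.p01) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return self.p

    def update(self, z):
        """Correct the state with a position measurement z (H = [1, 0])."""
        s = self.p00 + self.r                    # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s      # Kalman gain
        residual = z - self.p
        self.p += k0 * residual
        self.v += k1 * residual
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


# Target centre moving 2 pixels per frame along one axis.
tracker = ConstantVelocityKalman()
for frame in range(1, 31):
    tracker.predict()
    tracker.update(2.0 * frame)
next_position = tracker.predict()  # search-window centre for the next frame
```

During an occlusion the `update` call can simply be skipped, so the tracker keeps coasting on the predicted position until the target is detected again. A two-dimensional tracker would run the same model on the x and y coordinates.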
Infrared (IR) small target detection plays a critical role in Infrared Search And Track (IRST) systems. Although it has been studied for years, difficulties remain in cluttered environments. Humans discriminate small targets from a natural scene through the discontinuity between the object and its neighboring regions; following this principle, we develop an efficient method for infrared small target detection called the multiscale center-surround contrast measure (MCSCM). First, an entropy-based window selection technique is used to determine the maximum neighboring window size. Then, we construct a novel multiscale center-surround contrast measure to calculate the saliency map. Compared with the original image, the MCSCM map has less background clutter and residual noise. Subsequently, a simple threshold is used to segment the target. Experimental results show that our method achieves better performance.
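The center-surround principle can be shown with a deliberately simplified sketch: each pixel is compared with the mean of its surrounding window at several window sizes, and the largest positive difference is kept. This is not the MCSCM formula, which the abstract does not specify; the function name, the fixed `radii` (the paper selects the maximum window by entropy) and the difference-of-means contrast are all assumptions.

```python
def center_surround_contrast(image, radii=(1, 2)):
    """Per-pixel maximum over scales of (pixel value - mean of its surround).

    `image` is a list of rows of gray levels. Negative contrasts are clipped to
    zero so that only pixels brighter than their neighbourhood stand out.
    """
    h, w = len(image), len(image[0])
    saliency = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best = 0.0
            for r in radii:
                surround = [image[j][i]
                            for j in range(max(0, y - r), min(h, y + r + 1))
                            for i in range(max(0, x - r), min(w, x + r + 1))
                            if (i, j) != (x, y)]
                best = max(best, image[y][x] - sum(surround) / len(surround))
            saliency[y][x] = best
    return saliency


# Flat background of gray level 10 with a single bright pixel at row 3, column 3.
frame = [[10] * 7 for _ in range(7)]
frame[3][3] = 100
saliency_map = center_surround_contrast(frame)
peak = max(((saliency_map[y][x], (y, x)) for y in range(7) for x in range(7)))[1]
```

Background pixels next to the bright spot get a negative contrast, which is clipped, so the saliency map is zero everywhere except at the target and a simple threshold is enough to segment it.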
Pedestrian detection is a canonical sub-problem of object detection that has been in high demand in recent years. Although recent deep learning object detectors such as Fast/Faster R-CNN have shown excellent performance for general object detection, they have had limited success in detecting small pedestrians in large-view scenes. We find that the insufficient resolution of the feature maps leads to unsatisfactory accuracy when handling small instances. In this paper, we investigate issues involving Fast R-CNN for pedestrian detection. Driven by these observations, we propose a very simple but effective baseline for pedestrian detection based on Fast R-CNN: we employ the DPM detector to generate accurate proposals, and we train a Fast R-CNN style network to jointly optimize small pedestrian detection, with skip connections concatenating features from different layers to address the coarseness of the feature maps. With this approach, accuracy improves for small pedestrian detection in real large scenes.
In this paper, a new convolutional neural network method is proposed for the inspection and classification of galvanized stamping parts. First, all workpieces are divided into normal and defective ones by image processing, and the region of interest (ROI) extracted from each defective workpiece is input to a trained fully convolutional network (FCN). The network is trained end-to-end, pixel-to-pixel, an approach that is currently the most advanced technology in semantic segmentation, and predicts a result for each pixel. Second, we label the workpiece, defect and background in the training images with different pixel values, and use the pixel values and the number of pixels to recognize the defects in the output image. Finally, a defect area threshold is set according to the needs of the project to achieve the specific classification of the workpiece. The experimental results show that the proposed method can successfully detect and classify defects of galvanized stamping parts under ordinary camera and illumination conditions, and its accuracy can reach 99.6%. Moreover, it avoids complex image preprocessing and difficult feature extraction and shows better adaptability.
With the development of public surveillance, person re-identification is becoming more and more important. Large-scale databases call for efficient computation and storage, and hashing is one of the most important techniques for this purpose. In this paper, we propose a new deep classification hashing network by introducing a new binary appropriation layer into traditional ImageNet pre-trained CNN models. It outputs binary appropriate features, which can be easily quantized into binary hash codes for Hamming similarity comparison. Experiments show that our deep hashing method outperforms state-of-the-art methods on the public CUHK03 and Market1501 datasets.
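The retrieval side of hashing can be sketched in a few lines: near-binary features are quantized into bit codes, and gallery images are ranked by Hamming distance to the query. The network that produces the features is not reproduced here. The sketch assumes features in [0, 1] quantized at 0.5, and the function names are hypothetical.

```python
def to_hash_code(features, threshold=0.5):
    """Quantize a feature vector into an integer whose bits form the hash code."""
    code = 0
    for value in features:
        code = (code << 1) | (1 if value >= threshold else 0)
    return code


def hamming_distance(code_a, code_b):
    """Number of bit positions in which the two codes differ."""
    return bin(code_a ^ code_b).count("1")


def rank_gallery(query_code, gallery_codes):
    """Gallery indices sorted from most to least similar to the query."""
    return sorted(range(len(gallery_codes)),
                  key=lambda i: hamming_distance(query_code, gallery_codes[i]))


query = to_hash_code([0.9, 0.1, 0.8, 0.2])
gallery = [
    to_hash_code([0.1, 0.9, 0.2, 0.8]),    # a different-looking person
    to_hash_code([0.95, 0.05, 0.7, 0.6]),  # near-duplicate of the query
]
ranking = rank_gallery(query, gallery)
```

Each comparison is a single XOR followed by a bit count, and each image is stored as a handful of bits, which is where the computational and storage savings claimed for hashing come from.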
Facial expression recognition under partial occlusion is a challenging research topic. This paper proposes a novel framework for facial expression recognition under occlusion that fuses global and local features. For the global aspect, information entropy is first employed to locate the occluded region. Second, Principal Component Analysis (PCA) is adopted to reconstruct the occluded region of the image. After that, a replacement strategy is applied to reconstruct the image by replacing the occluded region with the corresponding region of the best-matched image in the training set, and the Pyramid Weber Local Descriptor (PWLD) feature is then extracted. Finally, the outputs of the SVM are fitted to the probabilities of the target class using a sigmoid function. For the local aspect, an overlapping block-based method is adopted to extract WLD features, and each block is weighted adaptively by information entropy; Chi-square distance and similar-block summation methods are then applied to obtain the probabilities of each emotion class. Finally, decision-level fusion of the global and local features is performed based on the Dempster-Shafer theory of evidence. Experimental results on the Cohn-Kanade and JAFFE databases demonstrate the effectiveness and fault tolerance of this method.
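Decision-level fusion with Dempster's rule of combination can be sketched as follows. The two mass functions stand in for the outputs of the global and local branches; their values, the two-expression frame and the function name `combine` are made up for illustration and are not taken from the paper.

```python
def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass; the masses of
    one function sum to 1. Mass assigned to the whole frame expresses ignorance.
    """
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            common = set_a & set_b
            if common:
                combined[common] = combined.get(common, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b  # contradictory evidence
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize so that the remaining masses sum to 1.
    return {s: m / (1.0 - conflict) for s, m in combined.items()}


HAPPY, SAD = frozenset({"happy"}), frozenset({"sad"})
EITHER = HAPPY | SAD  # the whole frame of discernment

global_branch = {HAPPY: 0.6, SAD: 0.1, EITHER: 0.3}
local_branch = {HAPPY: 0.5, SAD: 0.2, EITHER: 0.3}
fused = combine(global_branch, local_branch)
decision = max(fused, key=fused.get)
```

When both branches lean toward the same expression, the fused belief in it is higher than either branch's own belief, while the mass left on the whole frame shrinks; this is the behaviour that makes the rule attractive for combining a global and a local classifier.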
This paper investigates a face recognition approach based on Scale Invariant Feature Transform (SIFT) features and sparse representation. The approach takes advantage of SIFT, which is a local feature rather than the holistic feature used in the classical Sparse Representation based Classification (SRC) algorithm and which is strongly robust to expression, pose and illumination variations. Since a hexagonal image has more inherent merits than a square image for making the recognition process efficient, we extract SIFT keypoints from the hexagonally sampled image. Instead of matching SIFT features, we first compute the sparse representation of each SIFT keypoint according to the constructed dictionary; second, these sparse vectors are quantized according to the dictionary; finally, each face image is represented by a histogram, and these so-called Bag-of-Words vectors are classified by an SVM. Owing to the use of local features, the proposed method achieves better results even when the number of training samples is small. In the experiments, the proposed method achieved a higher face recognition rate than other methods on the ORL and Yale B face databases; the effectiveness of hexagonal sampling in the proposed method is also verified.
In this paper, we present a Sub-pattern based Multi-manifold Discriminant Analysis (SpMMDA) algorithm for face recognition. Unlike the existing Multi-manifold Discriminant Analysis (MMDA) approach, which is based on holistic information of the face image, SpMMDA operates on sub-images partitioned from the original face image and extracts discriminative local features from each sub-image separately. Moreover, the structure information of different sub-images from the same face image is considered in the proposed method, with the aim of further improving recognition performance. Extensive experiments on three standard face databases (Extended YaleB, CMU PIE and AR) demonstrate that the proposed method is effective and outperforms some other sub-pattern based face recognition methods.
Facial expression recognition is an active research problem in computer science. In recent years, the research has moved from the lab environment to in-the-wild conditions, which is challenging, especially under extreme poses. Current expression detection systems try to avoid pose effects in order to remain generally applicable. In this work, we take the opposite approach: we consider the head poses and detect expressions within specific head poses. Our work includes two parts: detecting the head pose and assigning it to one of the pre-defined head pose classes, and recognizing the facial expression within each pose class. Our experiments show that the recognition results with pose class grouping are much better than those of direct recognition without considering poses. We combine hand-crafted features (SIFT, LBP and geometric features) with deep learning features to represent the expressions. The hand-crafted features are added into the deep learning framework along with the high-level deep learning features. For comparison, we implement SVM and random forest as the prediction models. To train and test our methodology, we labeled the face dataset with 6 basic expressions.
Person re-identification is a fundamental and inevitable task in public security. In this paper, we propose a novel framework to improve the performance of this task. First, two different types of descriptors are extracted to represent a pedestrian: (1) appearance-based superpixel features, which consist mainly of conventional color features and are extracted from superpixels rather than from the whole picture, and (2) deep features extracted by a feature fusion network, which compensate for the limited discriminative power of appearance features. Second, a view-invariant subspace is learned by dictionary learning constrained by the minimum negative sample (termed DL-cMN) to reduce the noise in the appearance-based superpixel feature domain. Then, rather than simply joining the different features, we use the deep features and the sparse codes transformed from the appearance-based features to establish hyperedges, each by k-nearest neighbor. Finally, the ranking is produced by a probabilistic hypergraph ranking algorithm. Extensive experiments on three challenging datasets (VIPeR, PRID450S and CUHK01) demonstrate the advantages and effectiveness of our proposed algorithm.
Advancing embedded systems that detect and prevent drowsiness in a vehicle is a major challenge for road traffic accident systems. To prevent drowsiness while driving, an alert system is needed that can detect a decline in driver concentration and send a signal to the driver. Studies have shown that traffic accidents usually occur when the driver is distracted while driving. In this paper, we review a number of detection systems for monitoring the concentration of a car driver and propose a portable Driver Alertness Detection System (DADS) that determines the driver's level of concentration using a pixelated coloration detection technique based on facial recognition. A portable camera is placed at the front visor to capture facial expressions and eye activity. We evaluated DADS with 26 participants and achieved a 100% detection rate under good lighting conditions and a low detection rate at night.
In this paper, we propose a joint and collaborative representation with Volterra kernel convolution feature (JCRVK) for face recognition. Firstly, the candidate face images are divided into sub-blocks of equal size. Features are extracted from the blocks using two-dimensional Volterra kernel discriminant analysis, which better captures the discriminative information of different faces. Next, the proposed joint and collaborative representation is employed to optimize and classify the local Volterra kernel features (JCR-VK) individually. JCR-VK is very efficient because its implementation depends only on matrix multiplication. Finally, recognition is completed using the majority voting principle. Extensive experiments on the Extended Yale B and AR face databases show that the proposed approach can outperform other recently presented similar dictionary algorithms in recognition accuracy.
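The final majority-voting step over per-block decisions can be sketched as below. This is a minimal illustration under assumptions: the abstract does not say how ties are broken, so ties here go to the label that appears first, and `majority_vote` and the sample labels are hypothetical.

```python
from collections import Counter


def majority_vote(block_labels):
    """Return the label predicted by the most blocks.
    Ties go to the label that appears first (an assumed tie-break rule)."""
    if not block_labels:
        raise ValueError("at least one block prediction is required")
    counts = Counter(block_labels)
    best = max(counts.values())
    for label in block_labels:
        if counts[label] == best:
            return label


# Each sub-block of a probe face has been classified individually.
block_predictions = ["subject_3", "subject_7", "subject_3", "subject_3", "subject_1"]
identity = majority_vote(block_predictions)
print(identity)
```

Voting over blocks means that a few corrupted or occluded blocks can be outvoted by the remaining ones.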
Recently, dynamic facial expression recognition in videos has attracted growing attention. In this paper, we propose a novel dynamic facial expression recognition method using geometric and texture features. In our system, the facial landmark movements and texture variations between pairwise images are used to perform dynamic facial expression recognition. For one facial expression sequence, pairwise images are created between the first frame and each of its subsequent frames. Integrating both geometric and texture features further enhances the representation of the facial expressions. Finally, a Support Vector Machine is used for facial expression recognition. Experiments conducted on the extended Cohn-Kanade database show that our proposed method achieves performance competitive with other methods.
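The pairing of the first frame with each subsequent frame can be sketched for the geometric part as follows. This is a minimal illustration that assumes the landmark movement is the per-landmark coordinate difference; the abstract does not give the exact feature definition, and `landmark_displacements` and the toy coordinates are hypothetical.

```python
def landmark_displacements(sequence):
    """For a sequence of frames, each a list of (x, y) landmarks, pair the
    first frame with every later frame and return the per-landmark
    displacement (dx, dy) for each pair."""
    first = sequence[0]
    features = []
    for frame in sequence[1:]:
        pair_feature = [(x - x0, y - y0) for (x0, y0), (x, y) in zip(first, frame)]
        features.append(pair_feature)
    return features


# Toy sequence: three frames, two landmarks per frame.
sequence = [
    [(10, 10), (20, 10)],  # first frame of the sequence
    [(10, 12), (21, 10)],
    [(10, 15), (23, 9)],
]
displacements = landmark_displacements(sequence)
print(displacements)
```

A sequence of n frames produces n - 1 pairs, so every later frame is described relative to the same reference frame.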
Depth estimation is vitally important in 3D face reconstruction. In this paper, we propose a t-SNE method based on manifold learning constraints and introduce the K-means method to divide the original database into several subsets; reconstructing the 3D face depth information from the selected optimal subset greatly reduces the computational complexity. Firstly, we apply t-SNE to reduce the key feature points of each 3D face model from 1×249 to 1×2. Secondly, the K-means method is applied to divide the training 3D database into several subsets. Thirdly, we calculate the Euclidean distance between the 83 feature points of the image to be estimated and the feature point information, before dimension reduction, of each cluster center. The category of the image to be estimated is determined by the minimum Euclidean distance. Finally, the method of Kong D is applied only within the optimal subset to estimate the depth values of the 83 feature points of 2D face images, which yields the final depth estimation results with greatly reduced computational complexity. Compared with the traditional traversal search estimation method, although the error rate of the proposed method is reduced by 0.49, the number of searches decreases with the change of the category. To validate our approach, we use a public database to mimic the task of estimating the depth of face images from 2D images. The average number of searches decreased by 83.19%.
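The subset-selection step (assigning the query to the cluster whose center is nearest in Euclidean distance, then searching only that subset) can be sketched as below. This is a minimal illustration with toy 2-D vectors in place of the flattened feature points; `nearest_cluster`, `centers` and `subsets` are hypothetical names, and the subsequent depth estimation itself is not shown.

```python
import math


def nearest_cluster(query, centers):
    """Return the index of the cluster center with the minimum
    Euclidean distance to the query feature vector."""
    distances = [math.dist(query, center) for center in centers]
    return min(range(len(centers)), key=distances.__getitem__)


# Toy cluster centers and the training models assigned to each cluster.
centers = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]]
subsets = [["model_a", "model_b"], ["model_c"], ["model_d", "model_e", "model_f"]]

query = [4.0, 6.0]
category = nearest_cluster(query, centers)
candidates = subsets[category]  # only this subset is searched afterwards
print(category, candidates)
```

In this toy case one model is searched instead of six, which is the source of the reduction in the number of searches.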
In recent years, the powerful feature learning and classification ability of convolutional neural networks has attracted wide attention. Compared with deep learning, traditional machine learning algorithms offer good interpretability, which deep learning lacks. Thus, in this paper, we propose a method that uses features extracted by a traditional algorithm as the input of a convolutional neural network. To reduce the complexity of the network, the Gabor wavelet kernel function is used to extract features at different positions, frequencies and directions of the target image. The Gabor kernel is sensitive to image edges and provides good direction and scale selectivity. The features extracted from the image in eight directions at one scale serve as the input of the proposed network. The network has the advantages of weight sharing and local connection, and the texture features of the input image reduce the influence of facial expression, gesture and illumination. In addition, we introduce a layer that combines the results of pooling and convolution to extract deeper features. The network is trained with the open-source Caffe framework, which is beneficial to feature extraction. The experimental results show that the network structure effectively overcomes the barrier of illumination, is robust, and is more accurate and faster than the traditional algorithm.
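A bank of Gabor kernels in eight directions at a single scale can be sketched as follows. This is a minimal illustration using the standard real-valued Gabor function; the kernel size, `wavelength`, `sigma` and `gamma` values are assumptions, since the abstract does not state the parameters used.

```python
import math


def gabor_kernel(size, theta, wavelength=4.0, sigma=2.0, gamma=0.5):
    """Real part of a Gabor kernel of odd `size`, oriented at angle `theta`
    (radians). All default parameter values are illustrative assumptions."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates into the kernel's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


# Eight orientations at one scale: 0, 22.5, 45, ... 157.5 degrees.
bank = [gabor_kernel(7, k * math.pi / 8) for k in range(8)]
print(len(bank), len(bank[0]), len(bank[0][0]))
```

Convolving an image with each of the eight kernels would give eight response maps, which can then be stacked as input channels for a network.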
Human action recognition is an important and challenging task in computer vision research, due to variations in human motion performance, interpersonal differences and recording settings. In this paper, we propose a novel multi-task learning framework with group information (MTL-GI) for accurate and efficient human action recognition. Specifically, we first obtain group information by calculating the mutual information according to the latent relationship between Gaussian components and action categories, and by clustering similar action categories into the same group using affinity propagation clustering. Additionally, to explore the relationships among related tasks, we incorporate the group information into multi-task learning. Experimental results on two popular benchmarks (the UCF50 and HMDB51 datasets) demonstrate the superiority of our proposed MTL-GI framework.
To improve the recognition rate across various postures, this paper proposes a facial correction method based on a Gaussian Process, which builds a nonlinear regression model between the frontal and side faces using a combined kernel function. Face images with horizontal angles from -45° to +45° can be properly corrected to frontal faces. Finally, a Support Vector Machine is employed for face recognition. Experiments on the CAS-PEAL-R1 face database show that the Gaussian process can weaken the influence of pose changes and improve the accuracy of face recognition to a certain extent.
Quantitative and statistical analysis of ocean creatures is critical to ecological and environmental studies, and living fish recognition is one of the most essential requirements of the fishery industry. However, light attenuation and scattering in the underwater environment make underwater images low-contrast and blurry. This paper aims to design a robust framework for accurate fish recognition. The framework introduces a two-stage PCA Network to extract abstract features from fish images. On a real-world fish recognition dataset, we use a linear SVM classifier and set penalty coefficients to address the class imbalance in the data. Feature visualization results show that our method avoids feature distortion in the boundary regions of underwater images. Experimental results show that the PCA Network can extract discriminative features and achieve promising recognition accuracy. The framework improves the recognition accuracy for living underwater fish and can be easily applied in the marine fishery industry.
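One common way to set per-class penalty coefficients for an imbalanced dataset is to weight each class inversely to its frequency. The sketch below shows that heuristic only; the abstract does not say how its coefficients were chosen, so the formula, `class_penalties` and the toy labels are assumptions.

```python
from collections import Counter


def class_penalties(labels):
    """Return a penalty coefficient per class, inversely proportional to
    class frequency: n_samples / (n_classes * class_count).
    This balancing heuristic is an assumption, not the paper's rule."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Toy dataset: one species is four times more frequent than the other.
labels = ["common_species"] * 8 + ["rare_species"] * 2
penalties = class_penalties(labels)
print(penalties)
```

With such weights, misclassifying a sample of the rare class costs more than misclassifying a sample of the frequent class, so the classifier is not dominated by the majority class.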
Partial fingerprint identification technology, which is mainly used in devices with a small sensor area such as cellphones, USB drives and computers, has attracted more attention in recent years because of its unique advantages. However, owing to the lack of sufficient minutiae points, conventional methods do not perform well in this situation. We propose a new fingerprint matching technique that uses ridges as features to handle partial fingerprint images and combines a modified generalized Hough transform with a scoring strategy based on machine learning. The algorithm can effectively meet the real-time and space-saving requirements of resource-constrained devices. Experiments on an in-house database indicate that the proposed algorithm performs excellently.
An effective method based on an improved atmospheric scattering model is proposed in this paper to handle vehicle license plate location and recognition in dense fog. First, dense fog detection is performed by the top-hat transformation and vertical edge detection, and the moving vehicle image is separated from the traffic video image. After the vehicle image is decomposed into two layers, a structure layer and a texture layer, the glow layer is separated from the structure layer to obtain the background layer. The atmospheric light map of the background layer is then predicted by mean pooling followed by bicubic interpolation, while the transmission of the background layer is estimated from the grayed glow layer, whose gray values are altered by linear mapping. Then, according to the improved atmospheric scattering model, the final restored image is obtained by fusing the restored background layer and the optimized texture layer. Second, license plate location is performed by a series of morphological operations, connected domain analysis and various validations. Character extraction is achieved by projection. Finally, an offline-trained pattern classifier based on hybrid discriminative restricted Boltzmann machines (HDRBM) is applied to recognize the characters. Experimental results on extensive data sets demonstrate that the proposed method achieves high recognition accuracy and works robustly in dense-fog traffic environments throughout the 24-hour day.
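For orientation, the standard atmospheric scattering model on which such methods build relates an observed pixel I to the scene radiance J, the transmission t and the atmospheric light A as I = J·t + A·(1 − t). The sketch below inverts that standard model for a single pixel value; it is not the improved model of the paper, whose details the abstract does not give, and `hazy_pixel`, `restore_pixel` and the lower bound `t_min` are hypothetical.

```python
def hazy_pixel(scene, transmission, airlight):
    """Forward model: observed intensity under the standard
    atmospheric scattering model I = J*t + A*(1 - t)."""
    return scene * transmission + airlight * (1.0 - transmission)


def restore_pixel(observed, transmission, airlight, t_min=0.1):
    """Invert the standard model: J = (I - A) / max(t, t_min) + A.
    The lower bound t_min (an assumed value) avoids amplifying noise
    where the transmission is close to zero."""
    t = max(transmission, t_min)
    return (observed - airlight) / t + airlight


# Round trip for one pixel with intensities in [0, 1].
observed = hazy_pixel(scene=0.3, transmission=0.4, airlight=0.9)
restored = restore_pixel(observed, transmission=0.4, airlight=0.9)
print(observed, restored)
```

In dense fog the transmission is small, which is why accurate estimates of both the atmospheric light map and the transmission matter for the restored result.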
In the digital protection of cultural relics, identifying the pigment mixtures on the surface of a painting has been a research hotspot for many years. In this paper, sub-space distance unmixing, a hyperspectral unmixing algorithm, is introduced to solve the problem of recognizing pigment mixtures in paintings. Firstly, mixtures of different pigments are prepared and their reflectance spectra are measured with a spectrometer. Moreover, the factors affecting the unmixing accuracy of pigment mixtures are discussed. The unmixing results of two cases, with and without rice paper and its underlay as endmembers, are compared. The experimental results show that the algorithm is able to unmix the pigments effectively and that the unmixing accuracy can be improved by considering the influence of the spectra of the rice paper and the underlay material.
In the last few years, a great number of associative memory models inspired by the human brain have been proposed to realize information storage and retrieval. However, there is still much room for improvement in those models. In this paper, we extend a binary pattern associative memory model to accomplish real-world image recognition. The learning process is based on the fundamental Hebb rules, and retrieval is implemented by a normalized dot product operation. Our proposed model not only achieves rapid memory storage and retrieval for visual information but also supports incremental learning without destroying previously learned information. Experimental results demonstrate that our model outperforms the existing Self-Organizing Incremental Neural Network (SOINN) and Back Propagation Neural Network (BPNN) in recognition accuracy and time efficiency.
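The combination of Hebbian learning and normalized dot product retrieval can be sketched with a very small memory that keeps one weight vector per label. This is an illustrative simplification under assumptions, not the model of the paper: the class `HebbianMemory`, its storage layout and the toy patterns are hypothetical.

```python
import math


class HebbianMemory:
    """Toy associative memory: one weight vector per label, updated with
    a Hebbian rule and queried with a normalized dot product."""

    def __init__(self):
        self.weights = {}  # label -> accumulated weight vector

    def learn(self, pattern, label):
        """Hebbian update: add the input activity to the weights of the
        active label unit. Other labels are left untouched, so new
        classes can be added without overwriting old ones."""
        row = self.weights.setdefault(label, [0.0] * len(pattern))
        for i, value in enumerate(pattern):
            row[i] += value

    def recall(self, pattern):
        """Return the label whose weights have the highest normalized
        dot product (cosine similarity) with the query pattern."""
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0
        return max(self.weights, key=lambda label: cosine(self.weights[label], pattern))


memory = HebbianMemory()
memory.learn([1, 0, 1, 0], "A")
memory.learn([0, 1, 0, 1], "B")
first = memory.recall([1, 0, 0.9, 0.1])   # noisy version of pattern A
memory.learn([1, 1, 0, 0], "C")           # incremental learning of a new class
second = memory.recall([1, 0, 0.9, 0.1])  # the earlier memory is still retrievable
third = memory.recall([0.9, 1, 0, 0.1])   # noisy version of pattern C
print(first, second, third)
```

Learning is a single additive pass with no iterative optimization, which is consistent with the rapid storage and incremental learning properties described above.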
Vehicle logo recognition plays an important role in manufacturer identification and vehicle recognition. This paper proposes a new vehicle logo recognition algorithm with a hierarchical framework consisting of two fusion levels. At the first level, a feature fusion model maps the original features to a higher-dimensional feature space in which the vehicle logos become more recognizable. At the second level, a weighted voting strategy is proposed to improve the accuracy and robustness of the recognition results. Extensive experiments evaluating the proposed algorithm demonstrate that it achieves high recognition accuracy and works robustly.
Scene recognition is a significant topic in the field of computer vision. Most existing scene recognition models require a large number of labeled training samples to achieve good performance. However, labeling images manually is time-consuming and often unrealistic in practice. To obtain satisfactory recognition results when labeled samples are insufficient, this paper proposes a scene recognition algorithm named Integrating Active Learning and Dictionary Learning (IALDL). IALDL adopts projective dictionary pair learning (DPL) as the classifier and introduces an active learning mechanism into DPL to improve its performance. In active learning, IALDL uses both uncertainty and representativeness as sampling criteria to effectively select useful unlabeled samples from a given sample set for expanding the training dataset. Experimental results on three standard databases demonstrate the feasibility and validity of the proposed IALDL.
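A sampling criterion that combines uncertainty and representativeness can be sketched as follows. The concrete measures are assumptions chosen for illustration (prediction entropy for uncertainty, mean similarity to the other unlabeled samples for representativeness, combined by a weighted sum); the abstract does not give the definitions used in IALDL, and `select_sample` and `trade_off` are hypothetical. At least two candidate samples are required.

```python
import math


def entropy(probs):
    """Shannon entropy of a class-probability vector (uncertainty)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_sample(probabilities, similarities, trade_off=0.5):
    """Return the index of the unlabeled sample with the highest score:
    trade_off * uncertainty + (1 - trade_off) * representativeness.
    `similarities[i][j]` is the similarity between samples i and j."""
    scores = []
    for i, probs in enumerate(probabilities):
        others = [s for j, s in enumerate(similarities[i]) if j != i]
        representativeness = sum(others) / len(others)
        scores.append(trade_off * entropy(probs) + (1 - trade_off) * representativeness)
    return max(range(len(scores)), key=scores.__getitem__)


# Classifier outputs for three unlabeled samples and their pairwise similarities.
probabilities = [[0.9, 0.1], [0.5, 0.5], [0.6, 0.4]]
similarities = [
    [1.0, 0.2, 0.3],
    [0.2, 1.0, 0.8],
    [0.3, 0.8, 1.0],
]
chosen = select_sample(probabilities, similarities)
chosen_uncertainty_only = select_sample(probabilities, similarities, trade_off=1.0)
print(chosen, chosen_uncertainty_only)
```

In this toy case the purely uncertainty-driven choice is sample 1, while the combined score prefers sample 2, which is almost as uncertain but more similar to the rest of the pool.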
Hierarchical matching pursuit (HMP) is a popular feature learning method for RGB-D object recognition. However, the feature representation in HMP, which uses only one dictionary for the RGB channels, does not capture sufficient visual information. In this paper, we propose a feature learning method based on multi-channel feature dictionaries for RGB-D object recognition. Feature extraction in the proposed method consists of two layers, and the K-SVD algorithm is used to learn the dictionaries for sparse coding in both layers. In the first layer, we obtain features by max pooling over the sparse codes of the pixels in a cell, and the resulting features of the cells in a patch are concatenated to generate joint patch features. The joint patch features of the first layer are then used to learn the dictionary and sparse codes in the second layer. Finally, spatial pyramid pooling can be applied to the joint patch features of either layer to generate the final object features. Experimental results show that our method, with first-layer or second-layer features, achieves performance comparable to or better than some published state-of-the-art methods.
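The first-layer pooling and concatenation can be sketched as below. This is a minimal illustration that assumes pooling takes the component-wise maximum of absolute sparse-code values; the abstract only says "max pooling", and `max_pool_cell`, `patch_feature` and the toy codes are hypothetical.

```python
def max_pool_cell(sparse_codes):
    """Component-wise maximum of absolute values over the sparse codes
    of all pixels in one cell (use of absolute values is an assumption)."""
    code_length = len(sparse_codes[0])
    return [max(abs(code[k]) for code in sparse_codes) for k in range(code_length)]


def patch_feature(cells):
    """Concatenate the pooled features of all cells in a patch."""
    feature = []
    for cell in cells:
        feature.extend(max_pool_cell(cell))
    return feature


# Toy patch: two cells, two pixels per cell, sparse codes of length three.
cells = [
    [[0.0, -0.5, 0.0], [0.2, 0.0, 0.0]],
    [[0.0, 0.0, 0.9], [0.0, 0.1, -0.3]],
]
joint_feature = patch_feature(cells)
print(joint_feature)
```

The length of the joint feature is the number of cells times the dictionary size, so it grows with both the patch layout and the dictionary.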
Identifying the serial number on a bank card has many applications. Because of differing number printing modes, complex backgrounds, shape distortion and other factors, achieving high identification accuracy is quite challenging. In this paper, we propose a method for serial number identification that uses the Normalization-Cooperated Gradient Feature (NCGF) and a Recurrent Neural Network (RNN) based on Long Short-Term Memory (LSTM). The NCGF maps the gradient direction elements of the original image to direction planes, so that the RNN, taking the direction planes as input, can recognize numbers more accurately. By combining the advantages of NCGF and RNN, we obtain 90% digit-string recognition accuracy.
Object recognition in images suffers from a huge search space and uncertain object profiles. Recently, Bag-of-Words methods have been used to address these problems, especially the two-dimensional CRF (Conditional Random Field) model. In this paper, we propose a method based on a general and flexible factor graph model, which can capture long-range correlations in Bag-of-Words by constructing a network learning framework, in contrast to the lattice used in the CRF. Furthermore, we explore a parameter learning algorithm for the factor graph model based on gradient descent and the Loopy Sum-Product algorithm. Experimental results on the Graz 02 dataset show that the precision and recall of our method are better than those of a state-of-the-art method and the original CRF model, demonstrating the effectiveness of the proposed method.
Liver disease is one of the main causes of human health problems. Cirrhosis is the critical phase in the development of liver lesions, especially hepatoma. Many clinical diagnoses are still influenced to some degree by the subjectivity of physicians, and objective factors such as illumination, scale and edge blurring also affect clinicians' judgment. This in turn affects the accuracy of diagnosis and the treatment of patients. To address these difficulties and improve the recognition rate of liver cirrhosis, we propose a multi-feature fusion method to obtain more robust representations of texture in ultrasound liver images. The texture features we extract include local binary pattern (LBP), gray level co-occurrence matrix (GLCM) and histogram of oriented gradient (HOG). In this paper, we first fuse the multiple features based on the parallel combination concept to distinguish cirrhotic from normal liver. The experimental results show that the classifier is effective for cirrhosis recognition, as evaluated by the classification rate, the sensitivity and specificity of the receiver operating characteristic (ROC), and the time cost. The proposed method can help improve the accuracy of cirrhosis diagnosis and prevent the development of liver lesions towards hepatoma.
In this paper, the authors propose a mutual information based method for lung CT image retrieval. The method is designed to adapt to different datasets and different retrieval tasks. For practical applicability, it avoids using a large amount of training data. Instead, with a well-designed training process and robust fundamental features and measurements, the method achieves promising performance while keeping the training computation economical. Experimental results show that the method has potential practical value for routine clinical application.
A novel local descriptor is proposed in this paper. The differences between the values of each bin across two scales are calculated and sorted in descending order. The index number of a bin in the sorted list is a measure of its stability across the two scales. All the index numbers of a bin are accumulated to produce the accumulated ranking of the bin, which is a measure of its stability across all scales. The accumulated ranking forms the first half of the descriptor. The bin value averaged across multiple scales forms the second half of the descriptor. Experiments on the Fischer dataset and the Oxford dataset demonstrate the effectiveness of the proposed descriptor and its superiority to state-of-the-art descriptors.
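The construction of the two descriptor halves can be sketched as follows. This is a minimal illustration under assumptions the abstract leaves open: differences are taken as absolute values between adjacent scales, ties keep their original bin order, and `accumulated_ranking_descriptor` and the toy histograms are hypothetical.

```python
def accumulated_ranking_descriptor(histograms):
    """Build a descriptor from per-scale histograms (one list of bin
    values per scale). First half: accumulated ranking of each bin.
    Second half: bin value averaged over all scales."""
    n_bins = len(histograms[0])
    ranking = [0] * n_bins
    # Assumption: compare each scale with the next one.
    for lower, upper in zip(histograms, histograms[1:]):
        diffs = [abs(a - b) for a, b in zip(lower, upper)]
        # Sort bins by difference in descending order; the position in
        # this list is the bin's index number for this pair of scales.
        order = sorted(range(n_bins), key=lambda b: diffs[b], reverse=True)
        for position, b in enumerate(order):
            ranking[b] += position
    n_scales = len(histograms)
    mean_values = [sum(h[b] for h in histograms) / n_scales for b in range(n_bins)]
    return ranking + mean_values


# Toy input: three scales, three bins per histogram.
histograms = [
    [6, 2, 1],
    [1, 2, 3],
    [2, 2, 5],
]
descriptor = accumulated_ranking_descriptor(histograms)
print(descriptor)
```

In this example the middle bin never changes between scales, so it is placed last in every sorted list and receives the largest accumulated ranking, marking it as the most stable bin.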
This paper proposes a local digital image watermarking method based on robust feature extraction. Segmentation is achieved by Simple Linear Iterative Clustering (SLIC), on which an Image Segmentation-based Robust Feature Extraction (ISRFE) method is built. Our method adaptively extracts feature regions from the blocks segmented by SLIC and can extract the most robust feature region in every segmented image. Each feature region is decomposed into a low-frequency domain and a high-frequency domain by the Discrete Cosine Transform (DCT). Watermark images are then embedded into the coefficients in the low-frequency domain. The Distortion-Compensated Dither Modulation (DC-DM) algorithm is chosen as the quantization method for embedding. The experimental results indicate that the method performs well under various attacks. Furthermore, the proposed method achieves a trade-off between high robustness and good image quality.
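The quantization step can be illustrated with scalar distortion-compensated dither modulation on a single coefficient. This is a minimal sketch: the quantization `step`, the compensation factor `alpha` and the function names are assumptions, and the way the paper selects coefficients and parameters is not shown.

```python
def quantize(value, bit, step):
    """Nearest point of the lattice assigned to `bit`: multiples of
    `step`, shifted by step/2 for bit 1."""
    dither = bit * step / 2.0
    return round((value - dither) / step) * step + dither


def embed_bit(coefficient, bit, step=8.0, alpha=0.7):
    """Move the coefficient a fraction `alpha` of the way toward the
    lattice point for `bit` (alpha = 1 is plain dither modulation)."""
    target = quantize(coefficient, bit, step)
    return coefficient + alpha * (target - coefficient)


def extract_bit(coefficient, step=8.0):
    """Minimum-distance decoding: pick the bit whose lattice is closer."""
    d0 = abs(coefficient - quantize(coefficient, 0, step))
    d1 = abs(coefficient - quantize(coefficient, 1, step))
    return 0 if d0 <= d1 else 1


# Embed one bit in a low-frequency coefficient, then disturb it slightly.
marked = embed_bit(13.3, 1)
recovered = extract_bit(marked + 0.5)
print(marked, recovered)
```

A larger `step` tolerates stronger attacks but changes the coefficient more, which is the robustness versus image quality trade-off mentioned above. With `alpha` above 0.5 the embedded bit is always recovered when no distortion is applied.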
Typical human actions last several seconds and exhibit characteristic spatio-temporal structure. The challenge for action recognition is to capture and fuse the multi-dimensional information in video data. To take these characteristics into account simultaneously, we present a novel method that fuses features of multiple dimensions, such as chromatic images, depth and optical flow fields. We build our model on multi-stream deep convolutional networks with the help of temporal segment networks, and extract discriminative spatial and temporal features by fusing multi-dimensional ConvNet towers, in which different feature weights are assigned to take full advantage of the multi-dimensional information. Our architecture is trained and evaluated on NTU RGB-D, currently the largest and most challenging benchmark dataset. The experiments demonstrate that our method outperforms state-of-the-art methods.
Generating a description for an image can be regarded as visual understanding. It spans artificial intelligence, machine learning, natural language processing, and many other areas. In this paper, we present a model that generates descriptions for images based on a recurrent neural network (RNN) with object attention and multiple image features. Deep recurrent neural networks perform excellently in machine translation, so we use them to generate natural sentence descriptions for images. The proposed method uses a single convolutional neural network (CNN) trained on ImageNet to extract image features. However, we believe this feature cannot adequately capture the content of an image, since it may focus only on the object area. We therefore add scene information to the image feature using a CNN trained on Places205. Experiments show that the model with multiple features extracted by two CNNs performs better than the one with a single feature. In addition, we apply saliency weights to images to emphasize the salient objects. We evaluate our model on MSCOCO with public metrics, and the results show that it performs better than several state-of-the-art methods.
Small object detection is a challenging task in computer vision because of the limited resolution and information of small objects. To address this problem, the majority of existing methods sacrifice speed for improved accuracy. In this paper, we aim to detect small objects at high speed, using the Single Shot Multibox Detector (SSD), the best object detector with respect to the accuracy-vs-speed trade-off, as the base architecture. We propose a multi-level feature fusion method that introduces contextual information into SSD in order to improve accuracy for small objects. For the fusion operation, we design two feature fusion modules, a concatenation module and an element-sum module, which differ in how they add contextual information. Experimental results show that these two fusion modules obtain higher mAP on PASCAL VOC2007 than the baseline SSD by 1.6 and 1.7 points respectively, with improvements of 2-3 points on some small-object categories. Their testing speeds are 43 and 40 FPS respectively, faster than the state-of-the-art Deconvolutional Single Shot Detector (DSSD) by 29.4 and 26.4 FPS.
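The difference between the two fusion styles can be sketched on toy feature maps. This is only an illustration under stated assumptions: feature maps are nested lists shaped `[channels][height][width]`, the deeper map is brought to the shallow map's size by nearest-neighbour 2x upsampling (the abstract does not say how sizes are matched), and the function names are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a [C][H][W] feature map (assumed resizing step)."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column twice
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row twice
        out.append(rows)
    return out


def fuse_concat(shallow, deep):
    """Concatenation-style fusion: stack the channels of both maps."""
    return [[list(r) for r in ch] for ch in shallow] + upsample2x(deep)


def fuse_sum(shallow, deep):
    """Element-sum-style fusion: add maps value by value (channel counts must match)."""
    up = upsample2x(deep)
    return [
        [[a + b for a, b in zip(row_s, row_d)] for row_s, row_d in zip(ch_s, ch_d)]
        for ch_s, ch_d in zip(shallow, up)
    ]
```

Concatenation keeps both sources separate and grows the channel count, while element-wise summation keeps the channel count fixed but requires the two maps to have the same number of channels.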
Most traditional biometric recognition systems perform recognition with a single biometric indicator. These systems suffer from noisy data, interclass variations, unacceptable error rates, forged identities, and so on. Because of these inherent problems, attempts to enhance the performance of unimodal biometric systems with single features have limited validity. Multimodal biometrics is therefore investigated to reduce some of these defects. This paper proposes a new multimodal biometric recognition approach that fuses faces and fingerprints. To obtain more recognizable features, the proposed method extracts block local binary pattern features for all modalities and then combines them into a single framework. For better classification, it employs the robust probabilistic collaborative representation based classifier to recognize individuals. Experimental results indicate that the proposed method improves recognition accuracy compared to unimodal biometrics.
Point cloud registration is a fundamental task in high-level three-dimensional applications. Noise, uneven point density, and varying point cloud resolutions are the three main challenges for point cloud registration. In this paper, we design a robust and compact local surface descriptor called Local Surface Angles Histogram (LSAH) and propose an effective coarse-to-fine algorithm for point cloud registration. The LSAH descriptor is formed by concatenating five normalized sub-histograms into one histogram. Each sub-histogram is created by accumulating a different type of angle from a local surface patch. The experimental results show that LSAH is more robust to uneven point density and varying point cloud resolutions than four state-of-the-art local descriptors in terms of feature matching. Moreover, we tested our LSAH-based coarse-to-fine algorithm for point cloud registration, and the experimental results demonstrate that it is both robust and efficient.
Text localization and extraction is an important issue in modern applications of computer vision. Reading and translating text in the wild or from videos are among the many applications that can benefit from results in this field. In this work, we adopt the well-known Viola-Jones algorithm to enable text extraction and localization from images in the wild. Viola-Jones is an efficient and fast image-processing algorithm originally used for face detection. Based on some resemblance between text detection and face detection in the wild, we have modified the Viola-Jones algorithm to detect regions of interest where text may be located. In the proposed approach, modifications to the Haar-like features and a semi-automatic process of data set generation and manipulation are presented to train the algorithm. A sliding-window process with different window sizes is used to scan the image for individual letters and letter clusters. A post-processing step combines the detected letters into words and removes false positives. The novelty of the presented approach lies in using the strengths of a modified Viola-Jones algorithm to identify many different objects, representing different letters and clusters of similar letters, and later combining them into words of varying lengths. Impressive results were obtained on the ICDAR contest data sets.
This paper presents an offline signature verification approach using a convolutional Siamese neural network. Unlike existing methods, which treat feature extraction and metric learning as two independent stages, we adopt a deep learning based framework that combines the two stages and can be trained end-to-end. The experimental results on two public offline databases (GPDSsynthetic and CEDAR) demonstrate the superiority of our method on the offline signature verification problem.
Compared with the embedding process, image content has a more significant impact on the differences in image statistical characteristics. This makes image steganalysis a classification problem with larger within-class scatter distances and smaller between-class scatter distances. As a result, the steganalysis features become inseparable because of the differences in image statistical characteristics. In this paper, we propose a new steganalysis framework that reduces the differences in image statistical characteristics caused by varying content and processing methods. The given images are segmented into several sub-images according to texture complexity. Steganalysis features are extracted separately from each subset with the same or similar texture complexity to build a classifier. The final steganalysis result is obtained through a weighted fusion process. Theoretical analysis and experimental results demonstrate the validity of the framework.
Block truncation coding (BTC) is a fast image compression technique applied in the spatial domain. Traditional BTC and its variants mainly focus on reducing computational complexity for low-bit-rate compression, at the cost of lower quality of decoded images, especially for images with rich texture. To solve this problem, this paper proposes a quadtree-based block truncation coding algorithm combined with adaptive bit plane transmission. First, the edge direction in each block is detected using the Sobel operator. For blocks of minimal size, an adaptive bit plane is used to optimize the BTC, depending on the MSE loss of the block when encoded by absolute moment block truncation coding (AMBTC). Extensive experimental results show that our method gains 0.85 dB PSNR on average compared to some other state-of-the-art BTC variants, which makes it desirable for real-time image compression applications.
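For readers unfamiliar with AMBTC, the per-block encoding and the MSE loss mentioned above can be sketched as follows. This is a minimal illustration of plain AMBTC on one block given as a flat list of pixel values; it does not include the quadtree partitioning or the adaptive bit plane of the proposed method, and the function names are hypothetical.

```python
def ambtc_encode(block):
    """Encode a block as (low mean, high mean, bit plane)."""
    mean = sum(block) / len(block)
    bits = [1 if p >= mean else 0 for p in block]
    highs = [p for p, b in zip(block, bits) if b]
    lows = [p for p, b in zip(block, bits) if not b]
    high = sum(highs) / len(highs)                 # never empty: some pixel is >= mean
    low = sum(lows) / len(lows) if lows else high  # flat block: both levels coincide
    return low, high, bits


def ambtc_decode(low, high, bits):
    """Rebuild the block from the two levels and the bit plane."""
    return [high if b else low for b in bits]


def mse(original, decoded):
    """Mean squared error between a block and its reconstruction."""
    return sum((x - y) ** 2 for x, y in zip(original, decoded)) / len(original)
```

For example, the block `[0, 10, 20, 30]` is reconstructed as `[5, 5, 25, 25]` with an MSE of 25, which is the kind of per-block loss a coder could inspect when deciding how to treat a block.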
Reading text or searching for key words within a historical document is a very challenging task. One of the first steps of the complete task is binarization, where we separate foreground such as text, figures, and drawings from the background. The outcome of this step can in many cases determine whether the following steps succeed or fail, so it is vital to the complete task of reading and analyzing the content of a document image. Historical document images are generally of poor quality because of their storage conditions and degradation over time, which mostly cause varying contrast, stains, dirt, and ink seeping from the reverse side. In this paper, we use banks of predefined anisotropic filters at different scales and orientations to develop a binarization method for degraded documents and manuscripts. Because handwritten strokes may follow different scales and orientations, we use predefined sets of filter banks with various scales, weights, and orientations to seek a compact set of filters and weights that generates different layers of foreground and background. The results of convolving these filters locally with the gray-level image are weighted and accumulated to enhance the original image. Based on the different layers, seeds of components in the gray-level image, and a learning process, we present an improved binarization algorithm to separate the background from the layers of foreground. Different layers of foreground, which may be caused by seeping ink, degradation, or other factors, are also separated from the real foreground in a second phase. Promising experimental results were obtained on the DIBCO2011, DIBCO2013, and H-DIBCO2016 data sets and on a collection of images taken from real historical documents.
Target tracking is an important field of computer vision. Template matching tracking algorithms based on sum of squared differences (SSD) matching and normalized correlation coefficient (NCC) matching are very sensitive to changes in image gray level. When the brightness or gray level changes, the tracking algorithm is affected by high-frequency information and tracking accuracy drops, resulting in loss of the target. In this paper, a differential tracking algorithm based on the discrete sine transform is proposed to reduce the influence of changes in image gray level or brightness. The algorithm combines the discrete sine transform with a difference algorithm to map the target image into a digital sequence. A Kalman filter predicts the target position, and the Hamming distance measures the similarity between the target and the template. The window closest to the template is taken as the target to be tracked, and the tracked target then updates the template. Together, these steps achieve target tracking. In tests, compared with the SSD and NCC template matching algorithms, the proposed algorithm tracks the target stably when the image gray level or brightness changes, and its tracking speed meets the real-time requirement.
This paper presents an approach to estimating the point spread function (PSF) from low-resolution (LR) images. Existing techniques usually rely on accurate detection of the end points of the profile normal to edges. In practice, however, it is often a great challenge to accurately localize edge profiles in an LR image, which leads to a poor PSF estimate for the lens that took the image. To estimate the PSF precisely, this paper proposes first estimating a 1-D PSF kernel from straight lines and then robustly obtaining the 2-D PSF from the 1-D kernel by least squares techniques and random sample consensus. The Canny operator is applied to the LR image to obtain edges, and the Hough transform is then used to extract straight lines of all orientations. Estimating the 1-D PSF kernel from straight lines effectively alleviates the influence of inaccurate edge detection on PSF estimation. The proposed method is investigated on both natural and synthetic images. Experimental results show that it outperforms the state of the art and does not rely on accurate edge detection.
Binarization of TV video caption images has an important influence on semantic video retrieval. This paper proposes an improved binarization method for caption images. To overcome the ghost and broken-stroke problems of the traditional Niblack method, the method considers both the global and the local information of the images. First, the traditional Otsu and Niblack thresholds are used for initial binarization. Second, we introduce the difference between the maximum and minimum values in the local window as a third threshold to generate two images. Finally, the two images are combined with a logical AND operation to produce the result. The experimental results show that the proposed method is reliable and effective.
The discrete wavelet transform can be found at the heart of many image-processing algorithms. Until now, the transform on general-purpose processors (CPUs) has mostly been computed using a separable lifting scheme. Because the lifting scheme consists of a small number of operations, it is preferred for processing on single-core CPUs. For parallel processing on multi-core processors, however, this scheme is inappropriate because of its large number of steps. On such architectures, the number of steps corresponds to the number of points at which data must be exchanged, and these points often form a performance bottleneck. Our approach rearranges the calculations inside the transform and thereby reduces the number of steps. In other words, we propose a new scheme that is friendly to parallel environments. When evaluated on multi-core CPUs, it consistently outperforms the original lifting scheme. The evaluation was performed on 61-core Intel Xeon Phi and 8-core Intel Xeon processors.
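To make the notion of lifting steps concrete, here is a minimal sketch of one level of a conventional lifting transform, not the rearranged scheme the abstract proposes. The choice of the reversible integer 5/3 wavelet, the symmetric boundary handling, and the function names are assumptions for illustration; the abstract does not name a wavelet. Each step reads values produced by the previous step, which is why every step boundary acts as a data-exchange point when the loops are split across cores.

```python
def lift_forward_53(signal):
    """One level of the integer 5/3 transform on an even-length list, via two lifting steps."""
    s = list(signal[0::2])  # even samples, become the low-pass band
    d = list(signal[1::2])  # odd samples, become the high-pass band
    n = len(d)
    # Step 1 (predict): each odd sample minus the average of its even neighbours.
    for i in range(n):
        right = s[i + 1] if i + 1 < n else s[i]  # symmetric extension at the end
        d[i] -= (s[i] + right) // 2
    # Step 2 (update): each even sample corrected using the new detail values.
    for i in range(n):
        left = d[i - 1] if i > 0 else d[0]       # symmetric extension at the start
        s[i] += (left + d[i] + 2) // 4
    return s, d


def lift_inverse_53(s, d):
    """Undo the two lifting steps in reverse order and interleave the samples."""
    s, d = list(s), list(d)
    n = len(d)
    for i in range(n):
        left = d[i - 1] if i > 0 else d[0]
        s[i] -= (left + d[i] + 2) // 4
    for i in range(n):
        right = s[i + 1] if i + 1 < n else s[i]
        d[i] += (s[i] + right) // 2
    out = []
    for even, odd in zip(s, d):
        out += [even, odd]
    return out
```

Within a single step, every loop iteration is independent and could run in parallel; the dependency exists only between steps, so fewer steps means fewer synchronization points.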
To address the low precision and slow convergence of traditional point set non-rigid registration algorithms on data with complex local deformation, this paper proposes a robust non-rigid registration algorithm based on local affine registration. The algorithm uses a hierarchical iterative method to complete the point set non-rigid registration from coarse to fine. In each iteration, the data and model point sets are divided into sub point sets and the shape control points of each sub point set are updated. We then use the control point guided affine ICP algorithm to solve the local affine transformation between corresponding sub point sets. Next, the local affine transformation obtained in the previous step is used to update the sub data point sets and their shape control point sets. When the algorithm reaches the maximum iteration layer K, the loop ends and the updated sub data point sets are output. Experimental results demonstrate that the accuracy and convergence of our algorithm are greatly improved compared with traditional point set non-rigid registration algorithms.
Image retargeting must preserve important information and limit edge distortion while increasing or decreasing image size. Most existing content-aware methods perform well. However, two problems remain: slight distortion at object edges and structure distortion in non-salient areas. According to psychological theories, people evaluate image quality through multi-level judgments and comparisons between different areas, considering both image content and image structure. This paper proposes a new criterion: structure preservation in non-salient areas. Observation and image analysis show that slight blur generally exists at the edges of objects. This blur feature is used to estimate a depth cue, named the blur depth descriptor, which can be used in saliency computation to obtain a balanced image retargeting result. To keep the structure information in non-salient areas, a salient edge map is used in the Seam Carving process instead of field-based saliency computation. The derivative saliency in the x- and y-directions avoids the redundant energy seams around salient objects that cause structure distortion. Comparison experiments between classical approaches and ours demonstrate the feasibility of our algorithm.
Traditional image matching algorithms struggle to balance real-time performance and accuracy. To solve this problem, this paper proposes an adaptive clustering algorithm for image matching based on corner features. The method performs adaptive clustering on the matching point pairs according to the similarity of the vectors formed by those pairs. Harris corner detection is carried out first to extract the feature points of the reference image and the sensed image, and the feature points of the two images are initially matched with the Normalized Cross Correlation (NCC) function. Then, using the improved algorithm proposed in this paper, the matching results are clustered to reduce ineffective operations and improve matching speed and robustness. Finally, the Random Sample Consensus (RANSAC) algorithm is applied to the clustered matching points for final matching.
The experimental results show that the proposed algorithm effectively eliminates most wrong matching points while retaining the correct ones, improves the accuracy of RANSAC matching, and at the same time reduces the computational load of the whole matching process.
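The initial NCC matching stage described above can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes the zero-mean form of NCC, patches already extracted around corners and flattened into equal-length lists, a non-empty candidate list, and a hypothetical acceptance threshold of 0.8. The clustering and RANSAC stages are not shown.

```python
import math


def ncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equal-size flat patches."""
    mean_a = sum(patch_a) / len(patch_a)
    mean_b = sum(patch_b) / len(patch_b)
    da = [a - mean_a for a in patch_a]
    db = [b - mean_b for b in patch_b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a flat patch has no defined correlation; treat it as no match
    return sum(x * y for x, y in zip(da, db)) / denom


def match_corners(patches_ref, patches_sensed, threshold=0.8):
    """For each reference patch, keep the best-scoring sensed patch if it passes the threshold."""
    matches = []
    for i, pa in enumerate(patches_ref):
        scores = [ncc(pa, pb) for pb in patches_sensed]
        j = max(range(len(scores)), key=scores.__getitem__)
        if scores[j] >= threshold:
            matches.append((i, j, scores[j]))
    return matches
```

Because the mean is subtracted and the result is normalized, the score is unchanged by a linear brightness change of either patch, which is why NCC is a common choice for a first matching pass.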
To study image matching algorithms, this paper describes their four elements: similarity measurement, feature space, search space, and search strategy. It also describes four common indexes for evaluating an image matching algorithm: matching accuracy, matching efficiency, robustness, and universality. The paper then describes the principles of image matching algorithms based on gray value, features, frequency domain analysis, neural networks, and semantic recognition, and analyzes their characteristics and latest research achievements. Finally, the development trend of image matching algorithms is discussed. This study is significant for algorithm improvement, new algorithm design, and algorithm selection in practice.
We present Augmented Lazy Snapping, an interactive image segmentation algorithm. We investigate the pre-segmentation step in the Lazy Snapping method and find that the method can be improved by changing the middle layer of the segmentation pipeline. More specifically, we try different segmentation algorithms in place of the watershed algorithm to find the optimal one for the pre-segmentation step. The Augmented Lazy Snapping method provides better segmentation results with fewer over-segmented regions. Moreover, it improves the efficiency of the graph cut solution.
In this paper, a robust and fast active contour model is proposed for image segmentation in the presence of intensity inhomogeneity. By introducing local image intensity fitting functions before the evolution of the curve, the proposed model can effectively segment images with intensity inhomogeneity. The computational cost is low because the fitting functions do not need to be updated in each iteration. Experiments show that the proposed model has a higher segmentation efficiency than some well-known active contour models based on local region fitting energy. In addition, the proposed model is robust to initialization, which allows the initial level set function to be a small constant function.
This paper presents a new method for wood defect detection that solves the over-segmentation problem of local threshold segmentation methods. The method combines the advantages of visual saliency and local threshold segmentation. First, defect areas are coarsely located by using the spectral residual method to calculate their global visual saliency. Then, threshold segmentation with the maximum inter-class variance method is applied around the coarsely located areas to position and segment the wood surface defects precisely. Finally, we use mathematical morphology to process the binary images after segmentation, which reduces noise and small false objects. Experiments on test images of insect holes, dead knots, and sound knots show that the proposed method obtains ideal segmentation results and is superior to existing segmentation methods based on edge detection, Otsu, and threshold segmentation.
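The maximum inter-class variance criterion used in the second stage can be sketched on its own. This is a minimal illustration of the classical threshold search over a gray-level histogram, assuming integer pixel values in the range 0 to 255 supplied as a flat list; the saliency-based localization and the morphological cleanup are not shown, and the function name is hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the variance between classes <= t and > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of gray levels of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the variance is undefined
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        score = w0 * w1 * (mu0 - mu1) ** 2  # between-class variance up to a constant factor
        if score > best_score:
            best_score, best_t = score, t
    return best_t
```

In a pipeline like the one described, such a search would be run only on the pixels around each coarsely located area rather than on the whole image, so that the threshold adapts to local contrast.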
Multi-atlas segmentation is an effective and increasingly popular approach for automatically labeling objects of interest in medical images. Recently, segmentation methods based on generative models and patch-based techniques have become the two principal branches of label fusion. However, these generative models and patch-based techniques are only loosely related, and the demand for higher accuracy, faster segmentation, and robustness remains a great challenge. In this paper, we propose a novel algorithm that combines the two branches using a global weighted fusion strategy based on a patch latent selective model to segment specific anatomical structures in human brain magnetic resonance (MR) images. In establishing this probabilistic model of label fusion between the target patch and the patch dictionary, we explored the Kronecker delta function in the label prior, which is more suitable than other models, and designed a latent selective model as a membership prior to determine from which training patch the intensity and label of the target patch are generated at each spatial location. Because the image background is an equally important factor for segmentation, it is analyzed in the label fusion procedure and treated as a separate label, so that the background and the regions of interest are given equal weight. During label fusion with the global weighted fusion scheme, we use Bayesian inference and the expectation maximization algorithm to estimate the labels of the target scan and produce the segmentation map. Experimental results indicate that the proposed algorithm is more accurate and robust than the other segmentation methods.
In this paper, we propose an image semantic segmentation model that is trained from image-level labeled images. The proposed model starts with superpixel segmentation, and the features of the superpixels are extracted by a trained CNN. We introduce a superpixel-based graph and then apply a graph partition method to group correlated superpixels into clusters. To acquire the inter-label correlations between the image-level labels in the dataset, we use label co-occurrence statistics and exploit visual contextual cues at the same time. Finally, we formulate the task of mapping appropriate image-level labels to the detected clusters as a convex minimization problem. Experimental results on the MSRC-21 and LabelMe datasets show that the proposed method performs better than most weakly supervised methods and is even comparable to fully supervised methods.
Fuzzy c-means clustering (FCM), especially with spatial constraints (FCM_S), is an effective algorithm for image segmentation. Its reliability comes not only from representing the fuzziness of the membership of each pixel but also from exploiting spatial contextual information. However, these algorithms still have problems when processing noisy images: they are sensitive to parameters that have to be tuned according to prior knowledge of the noise. In this paper, we propose a new FCM algorithm that combines gray-level constraints and spatial constraints, called the spatial and gray-level denoised fuzzy c-means (SGDFCM) algorithm. The new algorithm overcomes the parameter disadvantages mentioned above by considering the possibility that each pixel is noise, which aims to improve robustness and preserve more detail. Furthermore, the possibility of noise can be calculated in advance, which makes the algorithm both effective and efficient.
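For orientation, the baseline that this abstract builds on can be sketched as follows. This is a minimal sketch of plain fuzzy c-means on scalar gray levels, assuming the standard membership and center updates with fuzzifier m; it contains no spatial term and no noise-possibility term, so it is not the SGDFCM algorithm itself. The function name and parameters are illustrative.

```python
def fuzzy_c_means(values, centers, m=2.0, iterations=50):
    """Plain FCM on scalar gray levels (no spatial or noise term)."""
    centers = list(centers)
    memberships = []
    power = 2.0 / (m - 1.0)
    for _ in range(iterations):
        memberships = []
        for v in values:
            dists = [abs(v - c) for c in centers]
            zero_hits = [d == 0.0 for d in dists]
            if any(zero_hits):
                # The value sits exactly on one or more centers:
                # share full membership among those centers.
                share = 1.0 / sum(zero_hits)
                row = [share if hit else 0.0 for hit in zero_hits]
            else:
                # Standard update: u_k = 1 / sum_j (d_k / d_j)^(2/(m-1))
                row = [1.0 / sum((d_k / d_j) ** power for d_j in dists)
                       for d_k in dists]
            memberships.append(row)
        for k in range(len(centers)):
            # Center update: mean of the values weighted by u^m.
            weights = [row[k] ** m for row in memberships]
            total = sum(weights)
            if total > 0.0:
                centers[k] = sum(w * v for w, v in zip(weights, values)) / total
    return centers, memberships


if __name__ == "__main__":
    gray_levels = [10, 12, 11, 200, 205, 198]
    final_centers, u = fuzzy_c_means(gray_levels, centers=[0.0, 255.0])
    print(final_centers)
```

Each pixel keeps a graded membership in every cluster, which is the "fuzziness" the abstract refers to. The sensitivity to noise arises because every value, noisy or not, pulls on the centers through the same update.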
A good edge plot should use continuous thin lines to describe the complete contour of the captured object. However, the detection of weak edges is a challenging task because of the associated low pixel intensities. Ant Colony Optimization (ACO) has been employed by many researchers to address this problem. The algorithm is a meta-heuristic method developed by mimicking the natural behaviour of ants. It uses iterative searches to find optimal solutions that cannot be found via traditional optimization approaches. In this work, ACO is employed to track and repair broken edges obtained via a conventional Sobel edge detector to produce a result with more connected edges.
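To show why a conventional Sobel detector leaves weak edges broken, here is a minimal pure-Python sketch of the Sobel gradient magnitude followed by a fixed threshold. The threshold value and the function names are assumptions for illustration, and the ACO repair step described in the abstract is not implemented.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Gradient magnitude of a 2-D list of gray levels; border pixels stay 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            out[y][x] = math.hypot(gx, gy)
    return out


def edge_map(image, threshold):
    """Binary edge map: 1 where the gradient magnitude reaches the threshold."""
    return [[1 if g >= threshold else 0 for g in row]
            for row in sobel_magnitude(image)]


if __name__ == "__main__":
    strong_step = [[0, 0, 10, 10] for _ in range(4)]
    weak_step = [[0, 0, 1, 1] for _ in range(4)]
    print(edge_map(strong_step, threshold=20))
    print(edge_map(weak_step, threshold=20))
```

A low-contrast step produces a small gradient magnitude and falls below the threshold, which is how gaps appear in the edge map. Those gaps are what an edge-linking stage such as ACO is meant to close.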
Before detecting cracks and repairs on road lanes, it is necessary to eliminate the influence of lane lines on the recognition result in road lane images. To address the problems caused by lane lines, an image segmentation algorithm based on adaptive thresholding and connected domains is proposed. First, by analyzing features such as the grey-level distribution and the illumination of the images, the algorithm uses the Hough transform to divide the images into different sections and converts them into binary images separately. It then uses connected domain theory to correct the segmentation result, remove noise and fill the interior of lane lines. Experiments show that this method can eliminate the influence of illumination and lane line abrasion, removing noise thoroughly while maintaining high segmentation precision.
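The connected-domain clean-up step can be illustrated with a small sketch. It assumes 4-connectivity and a hypothetical minimum component size, and it only removes small components from a binary mask; the Hough-based sectioning, adaptive thresholding and interior filling from the abstract are not shown.

```python
from collections import deque


def remove_small_components(binary, min_size):
    """Keep only 4-connected foreground components with at least min_size pixels."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    cleaned = [[0] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if not binary[sy][sx] or seen[sy][sx]:
                continue
            # Breadth-first search collects one connected component.
            queue = deque([(sy, sx)])
            seen[sy][sx] = True
            component = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Components below the size limit are treated as noise and dropped.
            if len(component) >= min_size:
                for y, x in component:
                    cleaned[y][x] = 1
    return cleaned


if __name__ == "__main__":
    mask = [[1, 1, 0, 0],
            [1, 1, 0, 1],
            [0, 0, 0, 0]]
    print(remove_small_components(mask, min_size=3))
```

Isolated foreground pixels left by thresholding are dropped, while larger blobs such as lane-line regions survive.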
We propose a novel image segmentation method for immunofluorescence microscopy images of skin tissue for the diagnosis of various skin diseases. The segmentation is based on machine learning algorithms. The feature vector is composed of three groups of features: statistical features, Laws’ texture energy measures and local binary patterns. The images are preprocessed for better learning. Different machine learning algorithms were evaluated, and the best results were obtained with the random forest algorithm. We use the proposed method to detect the epidermis region as part of a pemphigus diagnosis system.
Semantic scene parsing is important in many intelligent fields, including perceptual robotics. Over the past few years, pixel-wise prediction tasks such as semantic segmentation of RGB images have been extensively studied and have reached remarkable parsing levels, thanks to convolutional neural networks (CNNs) and large scene datasets. With the development of stereo cameras and RGBD sensors, additional depth information is expected to help improve accuracy. In this paper, we propose a semantic segmentation framework incorporating RGB and complementary depth information. Motivated by the success of fully convolutional networks (FCN) in semantic segmentation, we design a fully convolutional network consisting of two branches, which extract features from RGB and depth data simultaneously and fuse them as the network goes deeper. Instead of aggregating multiple models, our goal is to utilize RGB data and depth data more effectively in a single model. We evaluate our approach on the NYU-Depth V2 dataset, which consists of 1449 cluttered indoor scenes, and achieve results competitive with state-of-the-art methods.
This paper proposes a novel fast and robust image segmentation method based on superpixels (FRISS). To make the algorithm adaptive as well as efficient, we first compute superpixels of the image with a modified SLIC. A modified SimHash code is then computed for each superpixel, and similar superpixels are grouped according to a similarity measure derived from the Hamming distance between SimHash codes. FRISS segments an image according to a given similarity threshold, which demonstrates its adaptability. Moreover, because the similarity is computed from the Hamming distance between SimHash codes, it is much faster to evaluate than other similarity measures. The experimental results show that FRISS is fast and efficient.
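The speed argument rests on how cheap a Hamming distance between two binary codes is. The sketch below assumes each superpixel has already been reduced to an integer hash code. The code width, the similarity definition and the threshold test are illustrative assumptions, and the modified SimHash encoding itself is not reproduced.

```python
def hamming_distance(code_a, code_b):
    """Number of bit positions in which two integer hash codes differ."""
    return bin(code_a ^ code_b).count("1")


def similarity(code_a, code_b, bits=64):
    """Fraction of matching bits, assuming codes that are `bits` wide."""
    return 1.0 - hamming_distance(code_a, code_b) / bits


def should_merge(code_a, code_b, threshold, bits=64):
    """Hypothetical merge rule: group two superpixels if similarity >= threshold."""
    return similarity(code_a, code_b, bits) >= threshold


if __name__ == "__main__":
    a, b = 0b1011, 0b0010
    print(hamming_distance(a, b), similarity(a, b, bits=4))
```

One XOR and a bit count replace a floating-point distance over a full feature vector, and the single threshold controls how aggressively superpixels are grouped.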
Image segmentation is a significant step in image analysis and machine vision. Many approaches have been presented on this topic; among them, fuzzy C-means (FCM) clustering is one of the most widely used methods because of its high efficiency and its ability to handle the ambiguity of images. However, the success of FCM cannot be guaranteed because it is easily trapped in local optima. Cuckoo search (CS) is a novel evolutionary algorithm that has been tested on several optimization problems and proved to be highly efficient. Therefore, a new segmentation technique that blends FCM with the CS algorithm is put forward in this paper. The proposed method has been evaluated on several images and compared, in terms of fitness value, with existing FCM techniques such as genetic algorithm (GA) based FCM and particle swarm optimization (PSO) based FCM. Experimental results indicate that the proposed method is robust and adaptive and performs better than the other methods considered in the paper.
Video target tracking has made great progress through sustained research, but many problems remain unsolved. This paper proposes a new target tracking algorithm based on image segmentation. First, we divide the selected region using the simple linear iterative clustering (SLIC) algorithm; we then partition the area into sub-blocks with an improved density-based spatial clustering of applications with noise (DBSCAN) algorithm. A classifier is trained and tracked independently for each sub-block; the algorithm then ignores sub-blocks whose tracking has failed and reintegrates the remaining sub-blocks into the tracking box to complete the target tracking. The experimental results show that, compared with current mainstream algorithms, our algorithm works effectively under occlusion, rotation change, scale change and many other problems in target tracking.
In image segmentation, spectral clustering algorithms have to adopt an appropriate scaling parameter to calculate the similarity matrix between pixels, and this choice can have a great impact on the clustering result. Moreover, when the number of data instances is large, the computational complexity and memory use of the algorithm increase greatly. To solve these two problems, we propose a new spectral clustering image segmentation algorithm based on multiple scales and a sparse matrix. We first devise a new feature extraction method, then extract the features of the image at different scales, and finally use the feature information to construct a sparse similarity matrix, which improves computational efficiency. Image segmentation experiments show that our algorithm achieves better accuracy and robustness than the traditional spectral clustering algorithm.
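The two issues named above, the scaling parameter and the memory cost of a dense similarity matrix, can be made concrete with a small sketch. It assumes the common Gaussian similarity exp(-||f_i - f_j||^2 / (2 sigma^2)) and a hypothetical sparsification rule that stores only pairs within a spatial radius. The multi-scale feature extraction from the abstract is not shown, and the pair loop is kept naive for clarity.

```python
import math


def sparse_similarity(positions, features, sigma, radius):
    """Gaussian similarities stored only for spatially nearby pairs.

    positions: list of (row, col) pixel coordinates
    features:  list of feature tuples, one per pixel
    Returns a dict {(i, j): weight}; missing pairs are implicitly zero.
    """
    weights = {}
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            (yi, xi), (yj, xj) = positions[i], positions[j]
            if abs(yi - yj) > radius or abs(xi - xj) > radius:
                continue  # far-apart pairs are never stored
            sq_dist = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            w = math.exp(-sq_dist / (2.0 * sigma ** 2))
            weights[(i, j)] = w
            weights[(j, i)] = w
    return weights


if __name__ == "__main__":
    pos = [(0, 0), (0, 1), (5, 5)]
    feat = [(1.0,), (1.0,), (1.0,)]
    print(sparse_similarity(pos, feat, sigma=1.0, radius=1))
```

The scaling parameter sigma sets how quickly similarity decays with feature distance, which is why its choice affects the clustering. Storing only local pairs keeps the number of entries roughly proportional to the number of pixels instead of its square.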
With the proliferation of user-generated videos, temporal segmentation is becoming a challenging problem. Traditional video temporal segmentation methods such as shot detection do not work on unedited user-generated videos, since these often contain only a single long shot. We propose a novel temporal segmentation framework for user-generated video. It finds similar frames with a tree-partitioning min-Hash technique, constructs sparse temporally constrained affinity sub-graphs, and finally divides the video into sub-shot-level segments with a dense-neighbor-based clustering method. Experimental results show that our approach outperforms the other related works. Furthermore, the results indicate that the proposed approach is able to segment user-generated videos at an average human level.
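As background for the frame-matching step, here is a minimal sketch of plain min-Hash, which estimates the Jaccard similarity of two token sets from short signatures. It assumes each frame has been described as a set of tokens (for example visual-word identifiers, a hypothetical choice) and uses salted SHA-1 as the hash family. The tree-partitioning variant named in the abstract is not implemented.

```python
import hashlib


def _hash(token, seed):
    """64-bit hash of a token under a given seed (salted SHA-1)."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def minhash_signature(tokens, num_hashes=64):
    """For each seed, keep the minimum hash over the (non-empty) token set."""
    return [min(_hash(t, seed) for t in tokens) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


if __name__ == "__main__":
    frame_a = {"w1", "w2", "w3", "w4"}
    frame_b = {"w1", "w2", "w3", "w9"}
    print(estimated_jaccard(minhash_signature(frame_a),
                            minhash_signature(frame_b)))
```

Two frames with heavily overlapping token sets agree in most signature positions, so candidate similar frames can be found by comparing short signatures instead of full descriptors.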
Video foreground segmentation is one of the key problems in video processing. In this paper, we propose a novel and fully unsupervised approach for foreground object co-localization and segmentation in unconstrained videos. We first compute both the actual edges and the motion boundaries of the video frames, and then align them using their HOG feature maps. Then, by filling the occlusions generated by the aligned edges, we obtain more precise masks of the foreground object. These motion-based masks can be used as the motion-based likelihood. In addition, a color-based likelihood is adopted for the segmentation process. Experimental results show that our approach outperforms most state-of-the-art algorithms.
Level set models have advantages in handling complex shapes and topological changes, and are widely used in image processing tasks. Level set models for image segmentation can be grouped into region-based models and edge-based models, both of which have merits and drawbacks. Region-based level set models rely on fitting the color intensity of separated regions, but are not sensitive to edge information. Edge-based level set models evolve by fitting local gradient information, but are easily affected by noise. We propose a region-edge based level set model, which incorporates saliency information into the energy function and fuses color intensity with local gradient information. The evolution of the proposed model is implemented by a hierarchical two-stage protocol, and the experimental results show flexible initialization, robust evolution and precise segmentation.
Low-rank representation (LRR) has been shown to be successful in seeking low-rank structures of data relationships in a union of subspaces. Generally, LRR and LRR-based variants need to solve nuclear norm-based minimization problems. Despite the success of such methods, it has been widely noted that the nuclear norm may not be a good rank approximation because it simply adds all singular values of a matrix together, so large singular values may dominate the sum. This results in a far from satisfactory rank approximation and may degrade the performance of low-rank models based on the nuclear norm. In this paper, we propose a novel nonconvex rank approximation based on the Gaussian distribution function, which has desirable properties that make it a better rank approximation than the nuclear norm. A low-rank model based on the new rank approximation is then proposed and applied to motion segmentation. Experimental results show significant improvements and verify the effectiveness of our method.
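The criticism of the nuclear norm can be checked numerically on a list of singular values. In the sketch below, `rank` and `nuclear_norm` follow their standard definitions. `gaussian_surrogate` is only an illustrative assumption of what a Gaussian-shaped nonconvex surrogate might look like, namely the sum of 1 - exp(-s^2 / (2 delta^2)) over the singular values; it is not taken from the paper, whose exact function is not given in the abstract.

```python
import math


def rank(singular_values, tol=1e-12):
    """Number of singular values that are numerically non-zero."""
    return sum(1 for s in singular_values if s > tol)


def nuclear_norm(singular_values):
    """Sum of singular values: large values dominate the total."""
    return sum(singular_values)


def gaussian_surrogate(singular_values, delta=0.1):
    """Hypothetical Gaussian-shaped surrogate (an assumption, not the paper's).

    Each term saturates at 1 once s is well above delta, so every clearly
    non-zero singular value contributes about 1 regardless of its size.
    """
    return sum(1.0 - math.exp(-(s * s) / (2.0 * delta * delta))
               for s in singular_values)


if __name__ == "__main__":
    sv = [10.0, 1.0, 0.0]
    print(rank(sv), nuclear_norm(sv), gaussian_surrogate(sv))
    scaled = [100.0, 1.0, 0.0]
    print(rank(scaled), nuclear_norm(scaled), gaussian_surrogate(scaled))
```

Scaling the largest singular value leaves the rank unchanged but moves the nuclear norm considerably, whereas a saturating surrogate stays close to the rank.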
A fully convolutional network (FCN), trained end-to-end and pixels-to-pixels, predicts a result for each pixel and has been widely used for semantic segmentation. To segment the blood vessels of hatching eggs, a method based on FCN is proposed in this paper. To augment the data, the training datasets are composed of patches extracted from very few images. The network combines lower layers with deconvolution to enable precise segmentation. The proposed method avoids the problem that training deep networks requires large-scale samples. Experimental results on hatching eggs demonstrate that this method yields more accurate segmentation outputs than previous research. It provides a convenient reference for subsequent fertility detection.
Image segmentation is a very important step in low-level visual computing. Although image segmentation has been studied for many years, many problems remain. The pulse coupled neural network (PCNN) has a biological background, and when applied to image segmentation it can be viewed as a region-based method. However, due to the dynamic properties of PCNN, many unconnected neurons pulse at the same time, so it is necessary to identify different regions for further processing. The existing PCNN image segmentation algorithm based on region growing is designed for grayscale image segmentation and cannot be directly used for color image segmentation. In addition, super-pixels better preserve the edges of images and at the same time reduce the influence of individual differences between pixels on image segmentation. Therefore, this paper improves the original PCNN algorithm based on region growing by building on super-pixels. First, the color super-pixel image is transformed into a grayscale super-pixel image, which is used to seek seeds among the neurons that have not yet fired. The algorithm then determines whether to stop growing by comparing the averages of each color channel over all the pixels in the corresponding regions of the color super-pixel image. Experimental results show that the proposed algorithm for color image segmentation is fast, effective and reasonably accurate.
Characteristics of a restored image, such as smoothness, edges and texture, can be better maintained using a nonlocal differential operator. In this paper, we present a nonlocal multichannel total variational (MTV) model for Retinex theory, which can be solved by a fast computational approach based on the alternating direction method of multipliers (ADMM). Experimental results show that our nonlocal MTV method performs well on contrast enhancement, non-uniform illumination elimination, noise suppression, and especially texture preservation. Furthermore, a comparison with several variational Retinex methods shows that our proposed method recovers the reflectance more accurately and in fewer iterations.
In recent years, nonlocal regularization methods for image restoration (IR) have drawn more and more attention due to the promising results obtained when compared to traditional local regularization methods. Despite the success of this technique, most existing methods exploit a convex regularizing functional for the sake of computational efficiency, which is equivalent to imposing a convex prior on the output of the nonlocal difference operator. However, our experiment shows that the empirical distribution of the output of the nonlocal difference operator, especially in the seminal work of Kheradmand et al., is better characterized by an extremely heavy-tailed distribution than by a convex distribution. Therefore, in this paper, we propose a nonlocal regularization-based method with a non-convex sparsity constraint for image deblurring. Finally, an effective algorithm is developed to solve the corresponding non-convex optimization problem. The experimental results demonstrate the effectiveness of the proposed method.
We propose a multi-frame background suppression method for remote infrared small target detection. Inter-frame information is necessary when heavy background clutter makes it difficult to distinguish real targets from false alarms. A registration procedure based on point matching in image patches is used to compensate for the local deformation of the background. The target can then be separated by background subtraction. Experiments show that our method serves as an effective preliminary step for target detection.
Images are valuable information sources for many scientific and engineering applications. However, images captured in poor illumination conditions have a large portion of dark regions that can heavily degrade the image quality. To improve the quality of such images, a restoration algorithm is developed here that transforms the low input brightness to a higher value using a modified Multi-Scale Retinex approach. The algorithm is further improved by an entropy-based weighting of the input and the processed results to refine the necessary amplification in regions of low brightness. Moreover, fine details in the image are preserved by applying the Retinex principles to extract and then re-insert object edges to obtain an enhanced image. Results from experiments using low and normal illumination images show satisfactory performance with regard to the improvement in information content and the mitigation of viewing artifacts.
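The unmodified Multi-Scale Retinex that this abstract starts from can be sketched on a 1-D signal. The sketch assumes the textbook form, an equal-weight average over scales of log(I) - log(G_sigma * I) with a Gaussian surround G_sigma. The scale values, the epsilon guard and the clamped border handling are illustrative choices, and the entropy-based weighting and edge re-insertion from the abstract are not included.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Gaussian smoothing with indices clamped at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def multi_scale_retinex(signal, sigmas=(1.0, 4.0, 16.0), eps=1e-6):
    """Equal-weight average of single-scale Retinex outputs."""
    result = [0.0] * len(signal)
    for sigma in sigmas:
        surround = blur(signal, sigma)
        for i, (value, local_mean) in enumerate(zip(signal, surround)):
            result[i] += (math.log(value + eps)
                          - math.log(local_mean + eps)) / len(sigmas)
    return result


if __name__ == "__main__":
    step = [0.1] * 10 + [0.9] * 10
    print([round(v, 3) for v in multi_scale_retinex(step)])
```

Because each pixel is compared with its local surround in the log domain, uniformly lit regions map to values near zero while local contrast is retained, which is the property exploited for low-light enhancement.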
In this paper, a method based on a non-local saturation algorithm is proposed to avoid block and halo effects in single image dehazing with the dark channel prior. First, we convert the original image from the RGB color space into the HSV color space, following the idea of the non-local method. Image saturation is weighted equally within a fixed window whose size is chosen according to the image resolution. Second, we use the saturation to estimate the atmospheric light value and the transmission rate. Then, using a function of saturation and transmission, the haze-free image is obtained based on the atmospheric scattering model. Compared with existing methods, our method can restore image color and enhance contrast. We validate the proposed method with both quantitative and qualitative evaluation. Experiments show a better visual effect with high efficiency.
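The final recovery step relies on the atmospheric scattering model, commonly written as I = J t + A (1 - t), where I is the hazy pixel, J the haze-free pixel, A the atmospheric light and t the transmission. The sketch below inverts that model for one RGB pixel. The lower bound `t_min` is a common safeguard assumed here, and the saturation-based estimation of A and t described in the abstract is not implemented; both are passed in as known values.

```python
def add_haze(pixel, atmospheric_light, transmission):
    """Forward model: I = J * t + A * (1 - t), applied per channel."""
    return tuple(c * transmission + a * (1.0 - transmission)
                 for c, a in zip(pixel, atmospheric_light))


def recover_radiance(hazy_pixel, atmospheric_light, transmission, t_min=0.1):
    """Inverse model: J = (I - A) / max(t, t_min) + A, applied per channel."""
    t = max(transmission, t_min)  # avoid amplifying noise when t is near 0
    return tuple((c - a) / t + a
                 for c, a in zip(hazy_pixel, atmospheric_light))


if __name__ == "__main__":
    scene = (0.2, 0.4, 0.6)
    light = (1.0, 1.0, 1.0)
    hazy = add_haze(scene, light, transmission=0.5)
    print(hazy)
    print(recover_radiance(hazy, light, transmission=0.5))
```

With accurate estimates of A and t the inversion returns the original scene colors, so the quality of a dehazing method depends largely on how those two quantities are estimated.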
Archived film is an indispensable part of the long history of human civilization, and repairing damaged film with digital methods is now a mainstream trend. In this paper, we propose a technique based on sparse color correspondences to remove fading flicker from old films. Our method, which combines multiple frames to establish a simple correction model, includes three key steps. First, we recover sparse color correspondences in the input frames to build a matrix with many missing entries. Second, we present a low-rank matrix factorization approach to estimate the unknown parameters of this model. Finally, we adopt a two-step strategy that divides the estimated parameters into reference frame parameters for color recovery correction and other frame parameters for color consistency correction to remove flicker. Because it combines multiple frames, our method takes the continuity of the input sequence into account, and the experimental results show that it can remove fading flicker efficiently.
In exposure fusion, it is challenging to remove artifacts caused by camera motion and moving objects in the scene. An improved artifact removal method is proposed in this paper, which performs local linear adjustment in the artifact removal process. After determining a reference image, we first perform high-dynamic-range (HDR) deghosting to generate an intermediate image stack from the input image stack. Then, a linear Intensity Mapping Function (IMF) in each window is extracted based on the intensities of the intermediate image and the reference image, and the intensity mean and variance of the reference image. Finally, with the extracted local linear constraints, we reconstruct a target image stack, which can be directly used for fusing a single HDR-like image. Experimental results demonstrate that the proposed method is robust and effective in removing artifacts, especially in the saturated regions of the reference image.
Image deblurring under impulse noise is a typical ill-posed problem that requires regularization methods to guarantee high-quality imaging. The L1-norm data-fidelity term and the total variation (TV) regularizer have been combined to form a popular regularization method. However, the TV-regularized variational image deblurring model often suffers from staircase-like artifacts that degrade image quality. To enhance image quality, the detail-preserving total generalized variation (TGV) was introduced to replace TV and eliminate the undesirable artifacts. The resulting nonconvex optimization problem was effectively solved using the alternating direction method of multipliers (ADMM). In addition, an automatic method for selecting spatially adapted regularization parameters was proposed to further improve deblurring performance. Our proposed image deblurring framework is able to remove blurring and impulse noise effects while maintaining image edge details. Comprehensive experiments have been conducted to demonstrate the superior performance of our proposed method over several state-of-the-art image deblurring methods.
Imaging quality is often significantly degraded under hazy weather conditions. The purpose of this paper is to recover the latent sharp image from its hazy version. It is well known that accurate estimation of depth information can help improve dehazing performance. In this paper, a detail-preserving variational model is proposed to simultaneously estimate the haze-free image and the depth map. In particular, the total variation (TV) and total generalized variation (TGV) regularizers are introduced to constrain the haze-free image and the depth map, respectively. The resulting nonsmooth optimization problem is efficiently solved using the alternating direction method of multipliers (ADMM). Comprehensive experiments have been conducted on realistic datasets to compare our proposed method with several state-of-the-art dehazing methods. The results illustrate the superior performance of the proposed method in terms of visual quality evaluation.
Electron microscope (EM) image stitching is highly desirable for acquiring microscopic-resolution images of large target scenes in neuroscience. However, a mosaic of multiple electron microscope images may exhibit severe grayscale inhomogeneity due to the instability of the electron microscope system and registration errors, which degrades the visual effect of the mosaicked EM images and increases the difficulty of follow-up processing such as automatic object recognition. Consequently, a grayscale correction method for multiple mosaicked electron microscope images is indispensable in these areas. Unlike most previous grayscale correction methods, this paper designs a grayscale correction process for multiple EM images that tackles the difficulty of monochrome correction across multiple images and achieves grayscale consistency in the overlap regions. We adjust the overall grayscale of the mosaicked images using the location and grayscale information of manually selected seed images, and then fuse local overlap regions between adjacent images using Poisson image editing. Experimental results demonstrate the effectiveness of our proposed method.
Rendering technology has been widely used in the home decoration industry in recent years to produce images of home decoration designs. However, because rendered images of home decoration designs rely heavily on the renderer parameters and the scene lighting, most rendered images in this industry require further optimization afterwards. To reduce the workload and enhance rendered images automatically, an algorithm utilizing neural networks is proposed in this manuscript. In addition, to handle a few extreme conditions such as strong sunlight and bright lights, SLIC superpixel-based segmentation is used to select the bright areas of an image and enhance them independently. Finally, the selected areas are merged with the entire image. Experimental results show that the proposed method enhances rendered images effectively compared with some existing algorithms. Moreover, the proposed strategy is shown to be especially well suited to images with obvious bright parts.
Infrared image enhancement is an important and necessary task in infrared imaging systems. In this paper, contrast is defined in terms of the area between adjacent non-zero histogram bins, and a novel analytical model is proposed that enlarges these areas so that the contrast is increased. In addition, the analytical model is regularized by a penalty term based on the saliency value so that the salient regions are enhanced as well. Thus, both the whole image and the salient regions can be enhanced while rank consistency is preserved. Comparisons on 8-bit images show that the proposed method enhances infrared images with more detail.
This paper proposes a novel enhancement method for capsule endoscopy images based exclusively on the bilinear interpolation algorithm. The proposed method does not convert the original RGB image components to HSV or any other color space or model; instead, it processes the RGB components directly. In each component, a group of four adjacent pixels and a half-unit weight in the bilinear weighting function are used to calculate the average pixel value, which is identical for every pixel in that group. After these calculations, the groups of identical pixels are overlapped successively in the horizontal and vertical directions to obtain a preliminary-enhanced image. The final-enhanced image is obtained by halving the sum of the original and preliminary-enhanced image pixels. Quantitative and qualitative experiments were conducted, focusing on pairwise comparisons between original and enhanced images. The final-enhanced images generally had the best diagnostic quality and revealed more detail about the vessels and structures in capsule endoscopy images.
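The per-channel procedure described in this abstract can be sketched as follows. This is only one possible reading: the abstract does not specify how the overlapping groups are combined, so the sketch assumes a 2×2 group anchored at every pixel (stride 1) with border clamping. The function name `enhance_channel` and the list-of-rows image layout are illustrative assumptions, not taken from the paper.

```python
def enhance_channel(channel):
    """Enhance one colour channel given as a list of rows of ints (0-255).

    Assumption: the preliminary image assigns to each pixel the mean of the
    2x2 group anchored at it; groups therefore overlap with stride 1.
    """
    h, w = len(channel), len(channel[0])
    prelim = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp at the border so that every pixel has a 2x2 group.
            y1, x1 = min(y + 1, h - 1), min(x + 1, w - 1)
            # Bilinear interpolation with half-unit weights (0.5, 0.5)
            # reduces to the plain mean of the four neighbours.
            prelim[y][x] = 0.25 * (channel[y][x] + channel[y][x1]
                                   + channel[y1][x] + channel[y1][x1])
    # Final image: half the sum of the original and preliminary images.
    return [[round((channel[y][x] + prelim[y][x]) / 2) for x in range(w)]
            for y in range(h)]
```

Under these assumptions the function would be applied to the R, G, and B components independently, which matches the abstract's statement that no color space conversion is used.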
Restoring images captured under low illumination is an essential front-end process for most image-based applications. The Center-Surround Retinex algorithm has been a popular approach for improving image brightness. However, in its basic form this algorithm is known to produce color degradation. To mitigate this problem, the Single-Scale Retinex algorithm is modified here to act as an edge extractor, while illumination is recovered through a non-linear intensity mapping stage. The derived edges are then integrated with the mapped image to produce the enhanced output. Furthermore, to reduce color distortion, the process is conducted in the magnitude-sorted domain instead of the conventional Red-Green-Blue (RGB) color channels. Experimental results show that improvements in mean brightness, colorfulness, saturation, and information content can be obtained.
Image enhancement is an imperative step for many vision-based applications. For image contrast enhancement, popular methods adopt the principle of spreading the captured intensities throughout the allowed dynamic range according to predefined distributions. However, these algorithms give little or no consideration to maintaining the mean brightness of the original scene, which is of paramount importance for conveying the true scene illumination characteristics to the viewer. Although a significant number of reviews on contrast enhancement methods have been published, up-to-date reviews of overall brightness-preserving image enhancement methods are still scarce. In this paper, a detailed survey is performed on methods that specifically aim to maintain the overall scene illumination characteristics while enhancing the digital image.
Enhancement of low-illumination images is of great importance in poor imaging conditions. A new image enhancement model is proposed in this paper. The model divides an image into blocks, uses the local standard deviation to design the center/surround filter, and uses an amplitude compensation factor to compensate for the weakness of the logarithmic function in compressing the amplitude of near-zero data. In addition, the amplitude compensation factor can suppress noise. At the same time, the normalized brightness maintains the normal-brightness regions of the image while the overall brightness is increased. To verify its performance, the proposed model and existing models are applied to image enhancement, and the results are compared from both subjective and objective aspects. The experimental results show that the proposed model preserves image details better and avoids excessive enhancement of the normal-brightness regions.
Contrast enhancement is a technique for improving image contrast to obtain better visual quality. Because many existing contrast enhancement algorithms produce over-enhanced results, naturalness preservation needs to be considered in the framework of image contrast enhancement. This paper proposes a naturalness-preserving contrast enhancement method, which adopts histogram matching to improve the contrast and uses image quality assessment to automatically select the optimal target histogram. Both contrast improvement and naturalness preservation are considered in the target histogram, so the method can avoid the over-enhancement problem. In the proposed method, the optimal target histogram is a weighted sum of the original histogram, the uniform histogram, and a Gaussian-shaped histogram. The structural metric and the statistical naturalness metric are then used to determine the weights of the corresponding histograms. Finally, the contrast-enhanced image is obtained by matching the optimal target histogram. The experiments demonstrate that the proposed method outperforms the compared histogram-based contrast enhancement algorithms.
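A minimal sketch of the two mechanical steps named in this abstract, building a target histogram as a weighted sum and then matching to it, might look like the following. The weights are passed in by hand here; the paper selects them with quality metrics, which this sketch does not attempt to reproduce. The function names and the Gaussian defaults (`mu`, `sigma`) are assumptions made for illustration.

```python
import math
from itertools import accumulate


def normalise(hist):
    """Scale a histogram so that its bins sum to 1."""
    total = sum(hist)
    return [v / total for v in hist]


def target_histogram(orig_hist, w_orig, w_uni, w_gauss, mu=None, sigma=None):
    """Weighted sum of the original, uniform, and Gaussian-shaped histograms.

    mu and sigma are hypothetical defaults (centre of the range, range / 5).
    """
    n = len(orig_hist)
    mu = (n - 1) / 2 if mu is None else mu
    sigma = n / 5 if sigma is None else sigma
    h_o = normalise(orig_hist)
    h_u = [1.0 / n] * n
    h_g = normalise([math.exp(-0.5 * ((i - mu) / sigma) ** 2) for i in range(n)])
    mixed = [w_orig * a + w_uni * b + w_gauss * c
             for a, b, c in zip(h_o, h_u, h_g)]
    return normalise(mixed)


def matching_lut(src_hist, tgt_hist):
    """Classic CDF-based histogram matching.

    Each source level maps to the first target level whose cumulative
    probability reaches the source level's cumulative probability.
    """
    src_cdf = list(accumulate(normalise(src_hist)))
    tgt_cdf = list(accumulate(normalise(tgt_hist)))
    lut, j = [], 0
    for s in src_cdf:
        while j < len(tgt_cdf) - 1 and tgt_cdf[j] < s - 1e-12:
            j += 1
        lut.append(j)
    return lut
```

Applying `lut[p]` to every pixel value `p` would then give the matched image; for 8-bit images both histograms would have 256 bins.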
Convolutional neural networks (CNNs) have achieved great success in image classification tasks. Even in the field of synthetic aperture radar automatic target recognition (SAR-ATR), state-of-the-art results have been obtained on the MSTAR benchmark by learning deep feature representations. However, the raw MSTAR data have shortcomings for training a SAR-ATR model because the backgrounds of the SAR images within each class are highly similar. This suggests that a CNN would learn the feature hierarchies of the backgrounds as well as those of the targets. To validate the influence of the background, additional SAR image datasets were created that contain simulated SAR images of 10 manufactured targets, such as tanks and fighter aircraft, with backgrounds sampled from the whole original MSTAR data. The simulated datasets comprise one dataset in which the backgrounds of each class of images correspond to one kind of background of the MSTAR targets or clutter, and one dataset in which each image has a random background drawn from all MSTAR targets or clutter. In addition, mixed datasets combining MSTAR and the simulated datasets were created for the experiments. The CNN architecture proposed in this paper is trained on all of the datasets mentioned above. The experimental results show that the architecture achieves high performance on all datasets even when the image backgrounds are miscellaneous, which indicates that the architecture can learn a good representation of the targets despite drastic changes in the background.
This paper presents a method for traffic sign classification using a convolutional neural network (CNN). As a preprocessing step, we first convert each color image to grayscale and then normalize it to the range (-1, 1). To increase the robustness of the classification model, we apply a dataset augmentation algorithm and create new images to train the model. To avoid overfitting, we use a dropout module before the last fully connected layer. To assess the performance of the proposed method, the German Traffic Sign Recognition Benchmark (GTSRB) dataset is used. Experimental results show that the method is effective in classifying traffic signs.
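The preprocessing step described here (grayscale conversion followed by normalization to roughly -1 to 1) can be sketched as below. The abstract does not say which grayscale weights or scaling the authors use; this sketch assumes the common ITU-R BT.601 luma weights and a linear map of 0..255 onto -1..1. The function name `preprocess` is illustrative.

```python
def preprocess(rgb_image):
    """Convert an 8-bit RGB image (rows of (r, g, b) tuples) to normalized grayscale.

    Assumptions: BT.601 luma weights and a linear rescaling of 0..255
    onto -1..1; pure black and pure white land on the end points.
    """
    out = []
    for row in rgb_image:
        out.append([(0.299 * r + 0.587 * g + 0.114 * b) / 127.5 - 1.0
                    for r, g, b in row])
    return out
```

Centering the input around zero in this way is a common choice for CNN training, since it keeps the first-layer activations balanced around the origin.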
Hyperspectral image classification is widely acknowledged as one of the challenging tasks in hyperspectral data processing. In this paper, we propose a novel hyperspectral image classification framework based on local binary pattern (LBP) features and PCANet. In the proposed method, linear prediction error (LPE) is first employed to select a subset of informative bands, and LBP is used to extract texture features. Then, the spectral and texture features are stacked into a high-dimensional vector. Next, the extracted features at a given position are transformed into a 2-D image. The images obtained for all pixels are fed into PCANet for classification. Experimental results on a real hyperspectral dataset demonstrate the effectiveness of the proposed method.
Nowadays, many datasets are represented by multiple views, which usually contain shared and complementary information. Multi-view clustering methods integrate the information from multiple views to obtain better clustering results. Nonnegative matrix factorization has become an essential and popular tool in clustering methods because of its interpretability. However, existing multi-view clustering algorithms based on nonnegative matrix factorization do not consider the disagreement between views and neglect the fact that different views contribute differently to the data distribution. In this paper, we propose a new multi-view clustering method, named adaptive multi-view clustering based on nonnegative matrix factorization and pairwise co-regularization. The proposed algorithm obtains a parts-based representation of multi-view data by nonnegative matrix factorization. Pairwise co-regularization is then used to measure the disagreement between views. Only one parameter is needed to automatically learn the weight values according to the contribution of each view to the data distribution. Experimental results show that the proposed algorithm outperforms several state-of-the-art algorithms for multi-view clustering.
In this paper, we propose a multi-scale convolutional neural network for the hyperspectral image classification task. First, instead of conventional convolution, we use multi-scale convolutions, which have larger receptive fields, to extract spectral features from the hyperspectral image. We design a deep neural network with a multi-scale convolution layer that contains three different convolution kernel sizes. Second, to avoid overfitting of the deep neural network, we use dropout, which randomly deactivates neurons and slightly improves the classification accuracy. In addition, recent deep learning techniques such as ReLU are used. We conduct experiments on the University of Pavia and Salinas datasets and obtain better classification accuracy than other methods.
In this paper, we summarize and compare two different approaches used by the authors to classify different natural textures. The first approach, which is simple and inexpensive in computing time, uses an image data bank and an expert system that classifies different textures from a number of rules established by discipline specialists. The second method uses the same database and a neural network approach.
GF-2 is the highest-spatial-resolution remote sensing satellite in the history of China's satellite development. In this study, three traditional fusion methods, Brovey, Gram-Schmidt, and Color Normalized (CN), were compared with a newer fusion method, NNDiffuse, using qualitative assessment and quantitative fusion quality indices, including information entropy, variance, mean gradient, deviation index, and spectral correlation coefficient. The analysis results show that the NNDiffuse method performed best in both the qualitative and the quantitative analysis. It is more effective for subsequent remote sensing information extraction and for forest and wetland resource monitoring applications.
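Two of the quality indices listed in this abstract, information entropy and mean gradient, can be sketched as follows. The abstract does not give the exact formulas used in the study; the definitions below (Shannon entropy of the gray-level distribution, and the mean of sqrt((dx² + dy²) / 2) over forward differences) are commonly used ones and are assumptions here, as are the function names.

```python
import math
from collections import Counter


def information_entropy(image):
    """Shannon entropy (in bits) of the gray-level distribution of one band."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())


def mean_gradient(image):
    """Mean of sqrt((dx^2 + dy^2) / 2) using forward differences.

    Larger values indicate sharper detail. The last row and column are
    skipped because they have no forward neighbour.
    """
    h, w = len(image), len(image[0])
    total = 0.0
    for y in range(h - 1):
        for x in range(w - 1):
            dx = image[y][x + 1] - image[y][x]
            dy = image[y + 1][x] - image[y][x]
            total += math.sqrt((dx * dx + dy * dy) / 2.0)
    return total / ((h - 1) * (w - 1))
```

Both indices would be computed per band on each fused image, so that higher entropy and a higher mean gradient suggest more information content and sharper spatial detail.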
A new algorithm for medical image fusion is proposed in this paper, which combines a gradient minimization smoothing filter (GMSF) with a nonsubsampled directional filter bank (NSDFB). To preserve more detail information, a multi-scale edge-preserving decomposition framework (MEDF) is used to decompose an image into a base image and a series of detail images. For the fusion of the base images, a local Gaussian membership function is applied to construct the fusion weighting factor. For the fusion of the detail images, the NSDFB is applied to decompose each detail image into multiple directional sub-images, each of which is fused by a pulse-coupled neural network (PCNN). The experimental results demonstrate that the proposed algorithm is superior to the compared algorithms in both visual effect and objective assessment.
Multiresolution-based methods, such as the wavelet and Contourlet transforms, are commonly used for image fusion. This work presents a new image fusion framework that uses area-based standard deviation in the dual-tree Contourlet transform domain. First, the pre-registered source images are decomposed with the dual-tree Contourlet transform to obtain low-pass and high-pass coefficients. Then, the low-pass bands are fused with a weighted average based on the area standard deviation rather than the simple “averaging” rule, while the high-pass bands are merged with the “max-absolute” fusion rule. Finally, the modified low-pass and high-pass coefficients are used to reconstruct the final fused image. The major advantage of the proposed fusion method over conventional fusion is the approximate shift invariance and multidirectional selectivity of the dual-tree Contourlet transform. The proposed method is compared with wavelet-based and Contourlet-based methods and other state-of-the-art methods on commonly used multi-focus images. Experiments demonstrate that the proposed fusion framework is feasible and effective and that it performs better in both subjective and objective evaluation.
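The two fusion rules named in this abstract can be sketched on plain coefficient arrays as follows. The dual-tree Contourlet decomposition and reconstruction are not shown, and the window size and exact weighting formula are not given in the abstract, so the 3×3 window and the weight `sa / (sa + sb)` are assumptions for illustration, as are the function names.

```python
import math


def local_std(band, y, x, radius=1):
    """Standard deviation of the window centred at (y, x), clipped at borders."""
    h, w = len(band), len(band[0])
    vals = [band[j][i]
            for j in range(max(0, y - radius), min(h, y + radius + 1))
            for i in range(max(0, x - radius), min(w, x + radius + 1))]
    mean = sum(vals) / len(vals)
    return math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))


def fuse_lowpass(a, b, eps=1e-12):
    """Weighted average of two low-pass bands, weighted by local standard deviation."""
    fused = []
    for y in range(len(a)):
        row = []
        for x in range(len(a[0])):
            sa, sb = local_std(a, y, x), local_std(b, y, x)
            # Fall back to plain averaging where both windows are flat.
            wa = 0.5 if sa + sb < eps else sa / (sa + sb)
            row.append(wa * a[y][x] + (1.0 - wa) * b[y][x])
        fused.append(row)
    return fused


def fuse_highpass(a, b):
    """Max-absolute rule: keep the coefficient with the larger magnitude."""
    return [[pa if abs(pa) >= abs(pb) else pb for pa, pb in zip(ra, rb)]
            for ra, rb in zip(a, b)]
```

The intuition behind the low-pass rule is that a larger local standard deviation suggests more structure in that source image at that location, so its coefficient receives the larger weight.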
The Bandelet transform can capture the geometric regularity direction and geometric flow, and sparse representation can represent signals with as few atoms as possible over an over-complete dictionary; both can be used for image fusion. Therefore, a new fusion method based on Bandelet and sparse representation is proposed to fuse the Bandelet coefficients of multi-source images and obtain high-quality fusion results. Tests are performed on remote sensing images and simulated multi-focus images. Experimental results show that the new method performs better than the compared methods according to objective evaluation indices and subjective visual effects.
Dictionary learning is the key process in sparse representation, which is one of the most widely used image representation theories in image fusion. Existing dictionary learning methods do not make good use of the group structure information and the sparse coefficients. In this paper, we propose a new adaptive structured dictionary learning algorithm and an ℓ1-norm maximum fusion rule that uses grouped sparse coefficients to merge the images. The dictionary learning algorithm needs no prior knowledge about any group structure of the dictionary. By exploiting the characteristics of the dictionary in expressing the signal, our algorithm can automatically find the desired latent structure information hidden in the dictionary. The fusion rule takes into account the physical meaning of the group-structured dictionary and makes activity-level judgements on the structure information when the images are merged. Therefore, the fused image can retain more significant information. Comparisons have been made with several state-of-the-art dictionary learning methods and fusion rules. The experimental results demonstrate that the dictionary learning algorithm and the fusion rule both outperform the others in terms of several objective evaluation metrics.
Light field cameras have drawn much attention because they allow post-capture adjustments such as refocusing after exposure. The depth of field in refocused images is always shallow because of the large equivalent aperture. As a result, a large number of multi-focus images are obtained and an all-in-focus image is required. However, most multi-focus image fusion algorithms are not specifically designed for large numbers of source images, and the traditional DWT-based fusion approach has serious problems when dealing with many multi-focus images, causing color distortion and ringing effects. To solve this problem, this paper proposes an efficient multi-focus image fusion method based on the stationary wavelet transform (SWT), which can handle a large number of multi-focus images with shallow depths of field. We compare the SWT-based approach with the DWT-based approach in various cases, and the results demonstrate that the proposed method performs much better both visually and quantitatively.
Based on an analysis of the Laplacian image fusion algorithm, this paper proposes a partially pipelined, modular processing architecture and implements a corresponding SoC-based acceleration system. Each module is fully pipelined, and the modules are connected in series to form a partial pipeline with a unified data format, which simplifies management and reuse. Integrating an ARM processor, DMA, and an embedded bare-metal program, the system implements a 4-layer Laplacian pyramid on the Zynq-7000 board. Experiments show that, with low resource consumption, a pair of 256×256 images can be fused within 1 ms while maintaining good fusion quality.
The optic nerve head (ONH), also called the optic disc, is the distal portion of the optic nerve that is located and clinically visible on the retinal surface. It is a three-dimensional, elliptically shaped structure with a central depression called the optic cup. The shape of the ONH and the size of the depression can vary due to different retinopathies or angiopathies; therefore, estimating the topography of the optic nerve head is important for assisting the diagnosis of these retina-related complications. This work describes a computer-vision-based method, shape from shading (SFS), to recover and visualize a 3D topographic map of the optic nerve head from a normal fundus image. The work is expected to be helpful for assessing complications associated with deformation of the optic nerve head, such as glaucoma and diabetes. The illumination is modelled as uniform over the area around the optic nerve head, and its direction is estimated from the available image. The Tsai discrete method is employed to recover the 3D topographic map of the optic nerve head. The initial experimental results demonstrate that our approach works on most fundus images and provides a cheap but good alternative for rendering and visualizing the topographic information of the optic nerve head for potential clinical use.
In this paper, we propose a method for accurate 3D reconstruction based on photometric stereo. Instead of applying the global least-squares solution to the entire over-determined system, we randomly sample the images to form a set of overlapping groups and recover the surface normal for each group using the least-squares method. We then employ four-dimensional Tensor Robust Principal Component Analysis (TenRPCA) to obtain an accurate 3D reconstruction. Our method outperforms global least squares in handling sparse noise such as shadows and specular highlights. Experiments demonstrate the reconstruction accuracy of our approach.
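The grouping step described in this abstract can be sketched for a single pixel as follows, assuming the standard Lambertian model in which each intensity equals the albedo times the dot product of the light direction and the surface normal. The TenRPCA stage that combines the per-group normals is not shown. All function names, the group sizes, and the use of Cramer's rule are illustrative assumptions; the sketch also assumes every sampled group contains three linearly independent light directions.

```python
import random


def solve3(m, v):
    """Solve a 3x3 linear system m * x = v with Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    sol = []
    for k in range(3):
        # Replace column k of m with v.
        mk = [[v[r] if c == k else m[r][c] for c in range(3)] for r in range(3)]
        sol.append(det(mk) / d)
    return sol


def normal_least_squares(lights, intensities):
    """Least-squares normal and albedo for one pixel (Lambertian assumption).

    Solves the normal equations (L^T L) g = L^T i, where g = albedo * normal.
    """
    ltl = [[sum(l[r] * l[c] for l in lights) for c in range(3)] for r in range(3)]
    lti = [sum(l[r] * i for l, i in zip(lights, intensities)) for r in range(3)]
    g = solve3(ltl, lti)
    albedo = sum(c * c for c in g) ** 0.5
    return [c / albedo for c in g], albedo


def grouped_normals(lights, intensities, n_groups, group_size, seed=0):
    """Estimate one normal per randomly sampled (possibly overlapping) group."""
    rng = random.Random(seed)
    normals = []
    for _ in range(n_groups):
        idx = rng.sample(range(len(lights)), group_size)
        normal, _ = normal_least_squares([lights[k] for k in idx],
                                         [intensities[k] for k in idx])
        normals.append(normal)
    return normals
```

With clean data every group returns the same normal; when a few images are corrupted by shadows or highlights, only the groups containing them deviate, which is the kind of sparse error that a robust low-rank step is designed to separate.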
With the improvement of 3D reconstruction theory and the rapid development of computer hardware, reconstructed 3D models are growing in scale and complexity. Models with tens of thousands of 3D points or triangular meshes are common in practical applications. Because of storage and computing power limitations, common 3D display software such as MeshLab has difficulty achieving real-time display of and interaction with large-scale 3D models. In this paper, we propose a display system for large-scale 3D scene models. We construct the LOD (Levels of Detail) model of the reconstructed 3D scene in advance and then use an out-of-core, view-dependent, multi-resolution rendering scheme to display the large-scale 3D model in real time. With the proposed method, our display system can render in real time while roaming in the reconstructed scene, and the 3D camera poses can also be displayed. Furthermore, memory consumption is significantly reduced by a mechanism that exchanges data between internal and external memory, making it possible to display a large-scale reconstructed scene with millions of 3D points or triangular meshes on a regular PC with only 4 GB of RAM.
Because the shape of a stratospheric airship is uncertain and this uncertainty causes safety problems, surface reconstruction and surface deformation monitoring of the airship were conducted using laser scanning technology, and a √3-subdivision scheme based on Shepard interpolation was developed. Our subdivision scheme was then compared with the original √3-subdivision scheme. The results show that our scheme reduces surface shrinkage and the number of narrow triangles. In addition, it preserves sharp features. Surface reconstruction and surface deformation monitoring of the airship can therefore be conducted precisely with our subdivision scheme.
This paper proposes a method for reconstructing a three-dimensional (3D) scene from two light field images captured by a Lytro Illum camera. The sub-aperture images are first extracted from the light field images, and the scale-invariant feature transform (SIFT) is used for feature registration on the selected sub-aperture images. A structure from motion (SFM) algorithm is then applied to the registered sub-aperture images to reconstruct the three-dimensional scene, yielding a sparse 3D point cloud. The method shows that 3D reconstruction can be achieved with only two light field camera captures, rather than the dozen or more captures required with traditional cameras. This effectively addresses the time-consuming and laborious nature of 3D reconstruction with traditional digital cameras and enables faster, more convenient, and accurate reconstruction.
This paper proposes a blockwise accelerated proximal gradient (BAPG) approach. It chooses a block diagonal Lipschitz matrix in the generalized APG algorithm so that the subproblems can be solved either by the fast Fourier transform (FFT) or in closed form. Experiments verify the substantial speed advantage of BAPG for total variation-based image restoration.
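For orientation, the following is a minimal sketch of the standard accelerated proximal gradient iteration that the abstract generalizes. It is not the authors' BAPG: it uses a single scalar Lipschitz bound rather than a block diagonal Lipschitz matrix, and it solves a small L1-regularized least squares problem instead of total variation restoration. The names `soft_threshold` and `apg_lasso` and the test problem are illustrative assumptions.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def apg_lasso(A, b, lam, iters=5000):
    """Minimise 0.5 * ||A x - b||^2 + lam * ||x||_1 by accelerated proximal gradient.

    A is a list of rows, b a list of floats. Illustrative only: a scalar
    Lipschitz constant is used, not a block diagonal Lipschitz matrix.
    """
    m, n = len(A), len(A[0])
    # The squared Frobenius norm is an upper bound on the largest
    # eigenvalue of A^T A, so it is a valid (conservative) step bound.
    lipschitz = sum(a * a for row in A for a in row)
    x = [0.0] * n
    y = x[:]
    t = 1.0
    for _ in range(iters):
        # Gradient of the smooth term at the extrapolated point y.
        residual = [sum(A[i][j] * y[j] for j in range(n)) - b[i] for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Proximal (soft-thresholding) step.
        x_new = [soft_threshold(y[j] - grad[j] / lipschitz, lam / lipschitz)
                 for j in range(n)]
        # Momentum update.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = [x_new[j] + (t - 1.0) / t_new * (x_new[j] - x[j]) for j in range(n)]
        x, t = x_new, t_new
    return x
```

In this sketch the proximal step has a closed form (soft thresholding), which mirrors the abstract's point that a suitable choice of Lipschitz matrix keeps each subproblem cheap to solve.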
This paper proposes a novel algorithm that removes noise from three-dimensional scattered point clouds and smooths the data without damaging sharp geometric features. A feature-preserving weight is added to the fuzzy c-means algorithm, yielding a curvature-weighted fuzzy c-means clustering algorithm. First, large-scale outliers are removed using statistics of the neighboring points within radius r. Then, the algorithm estimates the curvature of the point cloud data with a conicoid parabolic fitting method and calculates the curvature feature value. Finally, the proposed clustering algorithm is used to calculate the weighted cluster centers, which are taken as the new points. The experimental results show that the approach handles noise of different scales and intensities in point clouds with high precision while preserving features. It is also robust to different noise models.
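The first step, removing large-scale outliers from statistics of the neighbors within radius r, can be sketched as follows. This is one plausible reading under stated assumptions: the abstract does not give the exact statistic, so the sketch simply keeps points with at least `min_neighbors` other points within distance `r`. The function name and both parameters are hypothetical.

```python
import math


def remove_radius_outliers(points, r, min_neighbors):
    """Keep points that have at least `min_neighbors` other points within distance r.

    Brute-force O(n^2) search for clarity; a spatial index such as a
    k-d tree would be the usual choice for large point clouds.
    """
    kept = []
    for i, p in enumerate(points):
        count = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= r
        )
        if count >= min_neighbors:
            kept.append(p)
    return kept
```

Isolated points far from the surface have few neighbors inside the ball of radius r and are dropped, while densely sampled surface points are retained for the later curvature estimation and clustering steps.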
In this paper, we exploit prior information from global positioning systems and inertial measurement units to speed up large scene reconstruction from images acquired by unmanned aerial vehicles (UAVs). We use weak pose information and intrinsic parameters to obtain the projection matrix for each view. Because topographic relief is usually negligible compared with the UAV's flight altitude, we assume that the scene is flat and use a weak perspective camera model to obtain projective transformations between two views. Furthermore, we propose an overlap criterion and select potentially matching view pairs among the projectively transformed views. A robust global structure from motion method is used for image-based reconstruction. Our real-world experiments show that the approach is accurate, scalable, and computationally efficient. Moreover, the projective transformations between views can also be used to eliminate false matches.
Real-time processing on embedded systems will enhance the applicability of stereo imaging for LiDAR and hyperspectral sensors. Research on task partitioning and scheduling strategies for embedded multiprocessor systems started relatively late compared with that for PCs. In this paper, a parallel model for stereo imaging on an embedded multi-core processing platform is studied and verified. After analyzing the computational load, throughput capacity, and buffering requirements, a two-stage pipeline parallel model based on message passing is established. This model can be applied to fast stereo imaging for airborne sensors with various characteristics. To demonstrate the feasibility and effectiveness of the parallel model, parallel software was designed on the 8-core DSP processor TMS320C6678 and evaluated using test flight data. The results indicate that the design performed well in workload distribution and achieved a speed-up ratio of up to 6.4.
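The structure of a two-stage pipeline connected by messages can be sketched in a few lines. This is a structural illustration only, not the authors' DSP implementation: it uses Python threads and a bounded queue in place of cores and inter-core messages, and the names `run_pipeline`, `stage1`, `stage2`, and `buffer_size` are assumptions. Python threads will not reproduce the reported speed-up.

```python
import queue
import threading


def run_pipeline(frames, stage1, stage2, buffer_size=4):
    """Run stage1 and stage2 concurrently, linked by a bounded message queue."""
    channel = queue.Queue(maxsize=buffer_size)  # bounded buffer between stages
    results = []
    done = object()  # sentinel message marking the end of the stream

    def producer():
        for frame in frames:
            channel.put(stage1(frame))  # blocks when the buffer is full
        channel.put(done)

    def consumer():
        while True:
            item = channel.get()
            if item is done:
                break
            results.append(stage2(item))

    workers = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

The bounded queue plays the role of the buffering requirement mentioned in the abstract: if the second stage falls behind, the first stage blocks rather than accumulating unbounded intermediate data.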
The quick-response (QR) code technique is combined with ghost imaging (GI) to recover original information with high quality. An image is first transformed into a QR code, which is then treated as the input image in the input plane of a ghost imaging setup. After the measurements, the traditional correlation algorithm of ghost imaging is used to reconstruct a low-quality image (in QR code form). With this low-quality image as an initial guess, a Gerchberg-Saxton-like algorithm is applied as a post-processing step to improve its contrast. Thanks to the high error correction capability of QR codes, the original information can be recovered with high quality. Compared with the previous method, our method obtains a high-quality image with comparatively fewer measurements, which means that the time-consuming post-processing procedure can be avoided to some extent. In addition, in conventional ghost imaging, the larger the image, the more measurements are needed. In our method, however, images of different sizes can be converted by a QR generator into QR codes of the same small size. Hence, for larger images, the time required to recover the original information with high quality is dramatically reduced. Our method also makes it easy to recover a color image in a ghost imaging setup, because the color image need not be divided into three channels that are recovered separately.
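The traditional correlation reconstruction can be illustrated with a small simulation. This sketch assumes a computational ghost imaging setup with uniformly random illumination patterns and a noiseless bucket detector, which the abstract does not specify; the QR encoding and the Gerchberg-Saxton-like refinement are not shown. The names `ghost_image` and `binarize` are hypothetical.

```python
import random


def ghost_image(obj, n_measurements, seed=0):
    """Correlation reconstruction G(x) = <B * P(x)> - <B> * <P(x)>.

    `obj` is a flattened binary transmission mask, P are random
    illumination patterns, and B is the bucket (total intensity) signal.
    """
    rng = random.Random(seed)
    n_pix = len(obj)
    sum_bp = [0.0] * n_pix
    sum_p = [0.0] * n_pix
    sum_b = 0.0
    for _ in range(n_measurements):
        pattern = [rng.random() for _ in range(n_pix)]
        bucket = sum(p * o for p, o in zip(pattern, obj))  # single-pixel detector
        sum_b += bucket
        for k in range(n_pix):
            sum_bp[k] += bucket * pattern[k]
            sum_p[k] += pattern[k]
    n = float(n_measurements)
    return [sum_bp[k] / n - (sum_b / n) * (sum_p[k] / n) for k in range(n_pix)]


def binarize(image):
    """Threshold halfway between the darkest and brightest reconstructed pixel."""
    threshold = (min(image) + max(image)) / 2.0
    return [1 if value > threshold else 0 for value in image]
```

Because a QR code is binary and carries error correction, a reconstruction only needs to be good enough for thresholding to recover most modules, which is consistent with the abstract's claim that fewer measurements suffice.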
The 360-degree around view monitoring system is a key technology of advanced driver assistance systems. It helps the driver see into blind areas and has high application value. In this paper, we study the transformations between multiple coordinate systems in order to generate a panoramic image in a unified car coordinate system. First, the panoramic image is divided into four regions. Using the parameters obtained by calibration, the pixels of the four fisheye images corresponding to the four sub-regions are mapped to the constructed panoramic image. Building on the 2D around view monitoring system, a 3D version is realized by reconstructing the projection surface. We then compare the 2D and 3D around view schemes in the unified coordinate system. The 3D scheme overcomes shortcomings of the traditional 2D scheme, such as a small field of view and prominent deformation of ground objects. Finally, the images collected by the fisheye cameras installed around the car body can be stitched into a 360-degree panoramic image.
To obtain full-view images, we propose a fast camera-based method for generating them. The method relies on multi-camera equipment and applies a spherical transformation model to the acquired camera images. An improved SIFT algorithm is used to extract and match keypoints, and the transformation matrix is then calculated from the filtered keypoints. Finally, according to the projection matrix and transformation matrix of each image, the images from subsequent cameras are stitched together. The method generates full-view images quickly and in real time.
Aerial sensors are widely used to acquire imagery for photogrammetric and remote sensing applications. In general, the images have large overlapping regions, which provide a great deal of redundant geometric and radiometric information for matching. This paper presents a POS-supported dense matching procedure for automatic DSM generation from aerial imagery. The method uses a coarse-to-fine hierarchical strategy that combines several image matching algorithms and processing steps: image radiometric pre-processing, image pyramid generation, feature point extraction and grid point generation, multi-image geometrically constrained cross-correlation (MIG3C), global relaxation optimization, multi-image geometrically constrained least squares matching (MIGCLSM), TIN generation, and point cloud filtering. The radiometric pre-processing reduces the effects of inherent radiometric problems and optimizes the images. The approach essentially consists of three components: a feature point extraction and matching procedure, a grid point matching procedure, and a relational matching procedure. The MIGCLSM method is used to achieve potentially sub-pixel matching accuracy and to identify inaccurate and possibly false matches. The feasibility of the method has been tested on aerial images of different scales and with different land cover types. The accuracy evaluation is based on a comparison between the automatically extracted DSMs derived from the precise exterior orientation parameters (EOPs) and those derived from the POS.
Inverse synthetic aperture radar (ISAR) improves azimuth resolution by increasing the target rotation angle. Because the echo data points are fan-shaped in the wavenumber domain, the effective interpolation region in the Polar Format Algorithm (PFA) broadens in the azimuth direction and shrinks in the range direction as the target rotation angle increases. This reduces the range wavenumber bandwidth and thus degrades the range resolution. In this paper, an imaging algorithm based on the varying-parameter method is proposed to eliminate the change in range wavenumber caused by the rotation angle. The method makes the echo data points wedge-distributed in the wavenumber domain by adjusting the radar platform parameters in the azimuth direction at each azimuth sampling point. The compression of the range wavenumber width is thus effectively reduced. Finally, simulations verify that the varying-parameter method limits the deterioration of range resolution and improves image quality.
Range-gated three-dimensional imaging has been a research hotspot in recent years because of its high spatial resolution, high range accuracy, long range, and ability to capture target reflectivity information at the same time. Based on a study of the principle of the intensity-related method, this paper presents a theoretical analysis and experimental research. The experimental system uses a high-power pulsed semiconductor laser as the light source and a gated ICCD as the imaging device, and its imaging depth and distance can be adjusted flexibly to support different working modes. An imaging experiment with a small imaging depth was carried out on a building 500 m away, and 26 groups of images were obtained with a distance step of 1.5 m. The calculation of the 3D point cloud based on the triangle method is analyzed, and a 15 m depth slice of the target 3D point cloud is obtained from two image frames, with a distance precision better than 0.5 m. The influence of signal-to-noise ratio, illumination uniformity, and image brightness on distance accuracy is analyzed. Based on a comparison with the time-slicing method, a method for improving the linearity of the point cloud is proposed.
Semantic attributes are commonly used for texture description. They can describe texture information such as patterns, textons, distributions, and brightness. Generally speaking, semantic attributes are more concrete descriptors than perceptual features, so it is practical to generate texture images from them. In this paper, we propose to generate high-quality texture images from semantic attributes. Over the last two decades, a number of works have addressed texture synthesis and generation, most of them focusing on example-based texture synthesis and procedural texture generation. Texture generation based on semantic attributes still deserves more attention. Gan et al. proposed a useful joint model for perception-driven texture generation. However, perceptual features are nonobjective spatial statistics used by humans to distinguish different textures in pre-attentive situations. To convey more descriptive information about texture appearance, semantic attributes, which are more in line with how humans describe textures, are desirable. In this paper, we use a sigmoid cross entropy loss in an auxiliary model to provide enough information for the generator. Consequently, the discriminator is relieved of the relatively intractable task of figuring out the joint distribution of condition vectors and samples. To demonstrate the validity of our method, we compare it with Gan et al.'s method in texture generation experiments on PTD and DTD. All experimental results show that our model can generate textures from semantic attributes.
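The sigmoid cross entropy loss mentioned above treats each attribute as an independent binary label. The sketch below shows the standard numerically stable form of that loss. How the authors wire it into their auxiliary model is not stated in the abstract, so the function name and the assumption of one logit per semantic attribute are illustrative.

```python
import math


def sigmoid_cross_entropy(logits, labels):
    """Mean sigmoid cross entropy over independent binary attributes.

    For logit x and label z in {0, 1}, the per-attribute loss
    -[z * log(s(x)) + (1 - z) * log(1 - s(x))] with s the sigmoid
    is evaluated as max(x, 0) - x * z + log(1 + exp(-|x|)),
    which avoids overflow for large |x|.
    """
    total = 0.0
    for x, z in zip(logits, labels):
        total += max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
    return total / len(logits)
```

Unlike a softmax loss, this form lets several attributes be active at once, which fits textures described by multiple semantic attributes.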
A space object in a highly elliptical orbit always appears as a point image on ground-based imaging equipment, so it is difficult to resolve and identify its shape and attitude directly. In this paper, a novel algorithm is presented for estimating spacecraft shape. An apparent magnitude model suitable for inverting object information such as shape and attitude is established based on an analysis of photometric characteristics. A parallel adaptive shape inversion algorithm based on the UKF is designed after deriving the dynamic equations of the nonlinear Gaussian system, which include the influence of various dragging forces. The results of a simulation study demonstrate the viability and robustness of the new filter and its fast convergence rate. It inverts combined shapes with high accuracy, especially for cube and cylinder buses. Even with sparse photometric data, it maintains a relatively high inversion success rate.
This paper describes a portable optical scanning device designed to measure both the colour and the 3D geometry of the skin surface through a relatively easy and cost-effective multiple-light-source photometric stereo method. The recovered colour was validated in our earlier work through its application to skin lesion segmentation. This paper focuses on the reconstructed topographic data, which are subject to further evaluation and improvement. The evaluation takes in vitro skin as the application scenario and compares the experimental results with those obtained using a commercial product. The experiments show that the handheld device measures a skin profile significantly closer to the ground truth and has the additional function of skin colour recovery.
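The core of multiple-light-source photometric stereo can be sketched for a single pixel. The sketch assumes a Lambertian surface, known light directions, and no shadows or specular highlights; the abstract does not state which reflectance model or solver the device uses. The names `solve3` and `photometric_stereo_pixel` are hypothetical.

```python
import math


def solve3(M, v):
    """Solve the 3x3 linear system M g = v with Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det(M)
    solution = []
    for col in range(3):
        # Replace one column of M by v and take the determinant ratio.
        replaced = [[v[r] if k == col else M[r][k] for k in range(3)]
                    for r in range(3)]
        solution.append(det(replaced) / d)
    return solution


def photometric_stereo_pixel(lights, intensities):
    """Least-squares albedo and unit normal for one Lambertian pixel.

    Model: I_k = albedo * dot(l_k, n). With g = albedo * n, solve the
    normal equations (L^T L) g = L^T I using three or more lights.
    """
    LtL = [[sum(l[a] * l[b] for l in lights) for b in range(3)] for a in range(3)]
    LtI = [sum(l[a] * i for l, i in zip(lights, intensities)) for a in range(3)]
    g = solve3(LtL, LtI)
    albedo = math.sqrt(sum(c * c for c in g))
    normal = [c / albedo for c in g]
    return albedo, normal
```

Repeating this per pixel gives a normal map, which is then integrated to obtain surface topography; running it per colour channel yields a per-channel albedo, one common way to recover colour alongside geometry.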
The morphology of wear particles is a fundamental indicator by which wear-related machine health can be assessed. Previous research has shown that thorough measurement of particle shape allows a more reliable explanation of the wear mechanism that occurred. However, most current particle measurement techniques focus on extracting two-dimensional (2-D) morphology, while other critical particle features, including volume and thickness, are not available. A three-dimensional (3-D) shape measurement method is therefore developed to enable a more comprehensive description of particle features. The method is implemented in three steps: (1) particle profiles in multiple views are captured by a camera mounted above a microfluidic channel; (2) a preliminary reconstruction is obtained by the shape-from-silhouette approach from the collected particle contours; (3) an iterative re-projection process then produces the final 3-D measurement by minimizing the difference between the original and the re-projected contours. Results from real data are presented, demonstrating the feasibility of the proposed method.
Perceptual features, such as direction, contrast, and repetitiveness, are important visual factors in how humans perceive a texture. However, quantifying the scale of these perceptual features requires psychophysical experiments, which demand a large amount of human labor and time. This paper focuses on obtaining the perceptual feature scales of textures from a small number of textures whose perceptual scales were obtained through a rating psychophysical experiment (which we call labeled textures) and a large number of unlabeled textures. This is a scenario for which semi-supervised learning is naturally suited. The task is meaningful for texture perception research and helpful for expanding perceptual texture databases. A graph-based semi-supervised learning method called random multi-graphs (RMG) is proposed to deal with this task. We evaluate different kinds of features, including LBP, Gabor, and unsupervised deep features extracted by a PCA-based deep network. The experimental results show that our method achieves satisfactory results regardless of which texture features are used.
To address the problem of complicated dynamic scenes in visual target tracking, a multi-feature fusion tracking algorithm based on the covariance matrix is proposed to improve tracking robustness. Within the framework of the quantum genetic algorithm, this paper uses the region covariance descriptor to fuse color, edge, and texture features, and uses a fast covariance intersection algorithm to update the model. The low dimension of the region covariance descriptor, the fast convergence and strong global optimization ability of the quantum genetic algorithm, and the low computational cost of the fast covariance intersection algorithm improve the computational efficiency of the fusion, matching, and updating processes, so that the algorithm achieves fast and effective multi-feature fusion tracking. The experiments show that the proposed algorithm not only achieves fast and robust tracking but also effectively handles interference such as occlusion, rotation, deformation, and motion blur.
Building on previous research, this paper introduces wavelet theory, captures the scene through a video system, and computes the key information needed by the fire protection system. Specifically, the algorithm extracts flame color characteristics and smoke characteristics from the collected information and processes them as feature information. The alarm system sets a corresponding alarm threshold and raises an alarm when this threshold is exceeded. Combining flame color characteristics and smoke characteristics for fire detection improves both the accuracy and the efficiency of the judgment. Experiments show that the scheme is feasible.
The Dynamic Vision Sensor (DVS) is an event-based camera that captures changing pixels in the field of view and represents the scene as a stream of events. In this paper, we use a unique approach to visualize the events a DVS captures as "DVS images". A DVS is sensitive enough to capture objects moving at high speed, but it also captures noise. To improve quality, we remove the noise from these images. Unlike in traditional images, both the noise and the objects in "DVS images" are composed of scattered points, so traditional methods are hard to apply. This paper proposes an efficient approach for "DVS image" noise removal. It is based on the K-SVD algorithm, which we adapt to specific applications. The proposed framework can handle "DVS images" containing different amounts of noise. Experiments show that the proposed method works well with both a fixed DVS and a moving DVS.
Visual Question Answering (VQA) is one of the most popular research fields in machine learning; it aims to let a computer learn to answer natural language questions about images. In this paper, we propose a new method called hierarchical dynamic memory networks (HDMN), which takes both question attention and visual attention into consideration, inspired by the Co-Attention method, currently the best algorithm or among the best. Additionally, we replace the old unit with bi-directional LSTMs, which retain more information from the question and image, so that information from both past and future sentences can be captured and used. We then rebuild the hierarchical architecture for not only question attention but also visual attention. Moreover, we accelerate the algorithm with Batch Normalization, a technique that helps the network converge more quickly than other algorithms. The experimental results show that our model improves on the state of the art on the large COCO-QA dataset.
In this paper, we study the problem of person re-identification (re-id) and propose a cross-domain latent space projection (CDLSP) method to address absent or insufficient labeled data in the target domain. Under the assumption that visual features in the source and target domains share a similar geometric structure, we transform the visual features from both domains into a common latent space by optimizing the objective function defined in the manifold alignment method. Moreover, the proposed objective function takes into account knowledge specific to re-id in order to improve re-id performance in complex situations. Extensive experiments on four benchmark datasets show that the proposed CDLSP outperforms or is competitive with state-of-the-art methods for person re-identification.
Model drift is an important cause of tracking failure. In this paper, multiple discriminative models with object proposals are used to improve model discrimination and alleviate this problem. First, changes in target location and scale are captured by many high-quality object proposals, which are represented by deep convolutional features to capture target semantics. Then, by sharing a feature map obtained from a pre-trained network, ROI pooling is used to warp object proposals of various sizes into vectors of the same length, which makes it convenient to learn a discriminative model. Finally, these historical snapshot vectors are used to train models with different lifetimes. Based on an entropy decision mechanism, a model degraded by drift can be corrected by selecting the best discriminative model, which significantly improves the robustness of the tracker. We extensively evaluate our tracker on two popular benchmarks, OTB 2013 and UAV20L. On both benchmarks, our tracker achieves the best precision and success rate compared with state-of-the-art trackers.
With the increasing globalization of the printing industry, remote proofing will become an inevitable trend. With remote proofing technologies, cross-media color reproduction takes place between different color gamuts, which usually leads to gamut incompatibility. In this paper, to achieve equivalent color reproduction between a monitor and a printer, a frequency-based spatial gamut mapping algorithm is proposed to decrease the loss of visual color information. The algorithm is designed around contrast sensitivity functions (CSFs) and uses a CSF spatial filter to preserve luminance at high spatial frequencies and chrominance at low frequencies. First, we present a general framework for applying the CSF spatial filter to retain relevant visual information. Then we compare the proposed framework with HPMINDE, CUSP, and Bala's algorithm. The psychophysical experimental results indicate that the proposed algorithm performs well.
Pedestrian detection (PD) is an important application domain in computer vision and pattern recognition, and unmanned aerial vehicles (UAVs) have become a major field of research in recent years. In this paper, a robust pedestrian detection method based on the combination of the infrared HOG (IR-HOG) feature and an SVM is proposed for highly complex outdoor scenarios, using airborne IR image sequences from a UAV. The basic workflow is as follows. First, the thermal infrared imager (TAU2-336) installed on our Outdoor Autonomous Searching (OAS) UAV takes pictures of the designated outdoor area. Second, image sequences are collected and processed by a high-performance embedded system with the Samsung ODROID-XU4 as its core and Ubuntu as its operating system, and IR-HOG features are extracted. Finally, the SVM is used to train the pedestrian classifier. Experiments show that our method gives promising results under complex conditions, including strong noise corruption and partial occlusion.
With the development of information and communication technologies, the volume and variety of available data are increasing exponentially. In today's information society, data is one of the most important inputs for policy making, crisis management, research and education, and many other fields. An essential task for experts is to share high-quality data that provides the right information at the right time. The design of a data presentation can strongly influence user perception and the cognitive aspects of data interpretation. Large amounts of data can be visualised, so that one image can replace a considerable number of numeric tables and texts. This paper focuses on accurate data visualisation from the point of view of the colour schemes used. A poor choice of colours can easily confuse the user and lead to misinterpretation of the data. Conversely, well-designed visualisations can make information transfer much simpler and more efficient.
Detecting salient objects in images has been a fundamental problem in computer vision. In recent years, deep learning has shown its impressive performance in dealing with many kinds of vision tasks. In this paper, we propose a new method to detect salient objects by using Conditional Generative Adversarial Network (GAN). This type of network not only learns the mapping from RGB images to salient regions, but also learns a loss function for training the mapping. To the best of our knowledge, this is the first time that Conditional GAN has been used in salient object detection. We evaluate our saliency detection method on 2 large publicly available datasets with pixel accurate annotations. The experimental results have shown the significant and consistent improvements over the state-of-the-art method on a challenging dataset, and the testing speed is much faster.
Unreliable communication channels can cause packet losses and bit errors in transmitted video, which severely degrade video quality. The problem is even worse for HEVC, since more advanced and powerful motion estimation methods are introduced to further remove inter-frame redundancy and thus improve coding efficiency. Once a motion vector (MV) is lost or corrupted, the decoded frame is distorted. More importantly, due to motion compensation, the error propagates along the motion prediction path, accumulates over time, and significantly degrades the overall video presentation quality. To address this problem, we study encoder-side error-resilient coding for HEVC and propose a constrained motion estimation scheme that mitigates error propagation to subsequent frames. The approach cuts off MV dependencies and limits the block regions that are predicted by temporal motion vectors. The experimental results show that the proposed method effectively suppresses the error propagation caused by bit errors in motion vectors and improves the robustness of the stream over bit-error channels. When the bit error probability is 10^-5, the decoded video quality (PSNR) increases by up to 1.310 dB and by 0.762 dB on average, compared to the reference HEVC.
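The quality gains above are reported as PSNR. As a reminder of how that figure is computed, here is a minimal sketch, assuming 8-bit samples (peak value 255) and frames flattened into plain lists; the function name `psnr` is illustrative and not taken from the paper.

```python
import math

def psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(decoded):
        raise ValueError("frames must have the same number of samples")
    # Mean squared error between the reference and the decoded samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)
```

A gain of about 0.76 dB, as quoted in the abstract, corresponds to the difference between two such values computed for the proposed and the reference decoder outputs.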
To improve the efficiency of VP9 decoder, a novel parallel pipeline structure of VP9 decoder is presented in this paper. According to the decoding workflow, VP9 decoder can be divided into sub-modules which include entropy decoding, inverse quantization, inverse transform, intra prediction, inter prediction, deblocking and pixel adaptive compensation. By analyzing the computing time of each module, hotspot modules are located and the causes of low efficiency of VP9 decoder can be found. Then, a novel pipeline decoder structure is designed by using mixed parallel decoding methods of data division and function division. The experimental results show that this structure can greatly improve the decoding efficiency of VP9.
As the latest video coding standard, High Efficiency Video Coding (HEVC) achieves over 50% bit rate reduction at similar video quality compared with the previous standard, H.264/AVC. However, the higher compression efficiency comes at the cost of a significantly increased computational load. To reduce this complexity, this paper proposes a fast coding unit (CU) partition technique that speeds up the process. To detect the edge features of each CU, a more accurate improved Sobel filter is developed and applied. By analyzing the textural features of the CU, an early CU splitting termination is proposed to decide whether a CU should be decomposed into four smaller CUs. Compared with the reference software HM16.7, experimental results indicate that the proposed algorithm reduces the encoding time by 44.09% on average, with a negligible bit rate increase of 0.24% and a quality loss below 0.03 dB. In addition, the proposed algorithm achieves a better trade-off between complexity and rate-distortion performance than other previously proposed methods.
Simultaneous localization and mapping (SLAM) based on RGB-D sensors has been widely researched in recent years. However, the accuracy of RGB-D SLAM relies heavily on corresponding feature points, and the position can be lost in scenes with sparse textures. Therefore, many fusion methods that combine RGB-D information with inertial measurement unit (IMU) data have been investigated to improve the accuracy of SLAM systems. However, these fusion methods usually do not take into account the number of matched feature points, and the pose estimated from RGB-D information may be inaccurate when there are too few correct matches. Considering the impact of the number of matches and the problem of losing position in scenes with few textures, this paper proposes a loose fusion method that combines RGB-D with IMU. The strategy uses the IMU data for position estimation when there are few corresponding point matches; when there are many matches, the RGB-D information is still used to estimate position. The final pose is optimized with the General Graph Optimization (g2o) framework to reduce error. The experimental results show that the proposed method outperforms the RGB-D-only method and continues to work stably in indoor environments with sparse textures.
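The switching rule at the heart of the loose fusion strategy can be sketched as follows. This is only an illustration of the decision described above: the threshold value `min_matches=20` and the function names are assumptions, and the g2o optimization step is not shown.

```python
def choose_position_source(num_matches, min_matches=20):
    """Pick the sensor used for position estimation in the current frame.

    min_matches is a hypothetical threshold; the abstract does not give one.
    """
    return "rgbd" if num_matches >= min_matches else "imu"

def fused_position(rgbd_position, imu_position, num_matches, min_matches=20):
    """Return the RGB-D estimate when matches are plentiful, else the IMU one."""
    if choose_position_source(num_matches, min_matches) == "rgbd":
        return rgbd_position
    return imu_position
```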
In the simulation of natural terrain, the continuity of the sample points is not always consistent, and traditional interpolation methods often cannot faithfully reflect the shape information carried by the data points. A new method for constructing a polynomial interpolation surface on a triangular domain is therefore proposed. First, the scattered spatial data points are projected onto a plane and triangulated. Second, a C1 continuous piecewise quadric polynomial patch is constructed at each vertex, with every patch required to be as close as possible to the linear-interpolation patch. Finally, the unknown quantities are obtained by minimizing the objective functions, and the boundary points are treated specially. The resulting surfaces preserve as many properties of the data points as possible while satisfying given accuracy and continuity requirements and avoiding excessive convexity. The new method is simple to compute and has good locality, and it is applicable to shape fitting of mines, exploratory wells, and similar objects. Experimental results for the new surface are given.
In this paper, we propose a fast and accurate dihedral interpolation Loop subdivision scheme for subdivision surfaces based on triangular meshes. In order to solve the problem of surface shrinkage, we keep the limit condition unchanged, which is important. Extraordinary vertices are handled using modified Butterfly rules. Subdivision schemes are computationally costly as the number of faces grows exponentially at higher levels of subdivision. To address this problem, our approach is to use local surface information to adaptively refine the model. This is achieved simply by changing the threshold value of the dihedral angle parameter, i.e., the angle between the normals of a triangular face and its adjacent faces. We then demonstrate the effectiveness of the proposed method for various 3D graphic triangular meshes, and extensive experimental results show that it can match or exceed the expected results at lower computational cost.
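The adaptive refinement criterion described above, the angle between the normal of a face and the normals of its neighbours, can be sketched in a few lines. This is a minimal illustration under stated assumptions: triangles are given as three `(x, y, z)` tuples with consistent winding, and the names `face_normal`, `normal_angle_deg` and `needs_refinement` are hypothetical, not the authors' API.

```python
import math

def face_normal(a, b, c):
    """Unit normal of triangle (a, b, c) via the cross product of two edges."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(x * x for x in n))
    if length == 0:
        raise ValueError("degenerate triangle")
    return [x / length for x in n]

def normal_angle_deg(n1, n2):
    """Angle in degrees between two unit normals."""
    dot = sum(x * y for x, y in zip(n1, n2))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))

def needs_refinement(face, neighbours, threshold_deg):
    """Subdivide a face only if some neighbour bends away by more than the threshold."""
    n = face_normal(*face)
    return any(normal_angle_deg(n, face_normal(*nb)) > threshold_deg
               for nb in neighbours)
```

Raising `threshold_deg` leaves more nearly flat regions unrefined, which is how the face count can be kept down at higher subdivision levels.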
Synthetic aperture radar (SAR) imaging is independent of atmospheric conditions, which makes SAR images an ideal source for change detection. Existing methods directly analyze all regions of the difference image, which is contaminated by speckle noise, so their performance is easily affected by small noisy regions. In this paper, we propose a novel saliency-guided change detection framework based on pattern and intensity distinctiveness analysis. The saliency analysis step removes small noisy regions and therefore makes the proposed method more robust to speckle noise. In the proposed method, the log-ratio operator is first used to obtain a difference image (DI). Then, a saliency detection method based on pattern and intensity distinctiveness analysis is used to obtain the changed region candidates. Finally, principal component analysis and k-means clustering are employed to analyze the pixels in the changed region candidates, and the final change map is obtained by classifying these pixels as changed or unchanged. Experimental results on two real SAR image datasets demonstrate the effectiveness of the proposed method.
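The first step, the log-ratio operator, is simple enough to sketch. The version below is a common formulation, assumed here rather than taken from the paper: it adds a small offset `eps` to avoid division by zero and takes the absolute value so that increases and decreases in backscatter are treated alike. Images are plain nested lists of non-negative intensities.

```python
import math

def log_ratio(img1, img2, eps=1.0):
    """Difference image DI = |log((img2 + eps) / (img1 + eps))|, pixel by pixel.

    eps is an assumed offset that keeps the ratio defined for zero-valued pixels.
    """
    return [[abs(math.log((b + eps) / (a + eps))) for a, b in zip(row1, row2)]
            for row1, row2 in zip(img1, img2)]
```

Because the ratio turns multiplicative speckle into an additive term after the logarithm, unchanged areas stay close to zero even when their absolute intensities are noisy.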
In the geometric correction of remote sensing images, a large number of redundant control points can occasionally result in low correction accuracy. To solve this problem, a control point filtering algorithm based on RANdom SAmple Consensus (RANSAC) is proposed. The basic idea of RANSAC is to use the smallest possible data set to estimate the model parameters and then enlarge this set with consistent data points. In this paper, unlike traditional geometric correction methods that use Ground Control Points (GCPs), simulation experiments are carried out to correct remote sensing images using visible stars as control points. For comparison, the accuracy of geometric correction without Star Control Point (SCP) optimization is also shown. The experimental results show that the RANSAC-based SCP filtering method greatly improves the accuracy of remote sensing image correction.
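The RANSAC idea summarized above (fit a model from a minimal sample, then grow the set with consistent points) can be sketched on a deliberately simple case. The example below assumes a pure 2D translation between matched control points, so a single correspondence is a minimal sample; the paper's actual correction model is not stated in the abstract, and the function name, tolerance and iteration count are illustrative.

```python
import math
import random

def ransac_translation(src, dst, tol=1.0, iterations=50, seed=0):
    """Filter matched control points under an assumed translation model.

    src and dst are non-empty, equally long lists of (x, y) points.
    Returns ((dx, dy), inlier_indices).
    """
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        # Minimal sample: one correspondence fixes a translation.
        i = rng.randrange(len(src))
        dx, dy = dst[i][0] - src[i][0], dst[i][1] - src[i][1]
        # Consensus set: points whose residual under this model is within tol.
        inliers = [j for j, (s, d) in enumerate(zip(src, dst))
                   if math.hypot(d[0] - s[0] - dx, d[1] - s[1] - dy) <= tol]
        if len(inliers) > len(best):
            best = inliers
    # Re-estimate the model from the whole consensus set.
    dx = sum(dst[j][0] - src[j][0] for j in best) / len(best)
    dy = sum(dst[j][1] - src[j][1] for j in best) / len(best)
    return (dx, dy), best
```

Control points outside the returned inlier list would be discarded before the correction model is fitted.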
A raw synthetic aperture radar (SAR) image usually has a 16-bit or higher bit depth, which cannot be directly visualized on 8-bit displays. In this study, we propose a pseudo-color coding method for high-dynamic single-polarization SAR images. The method considers the characteristics of both SAR images and human perception. In HSI (hue, saturation and intensity) color space, the method carries out high-dynamic range tone mapping and pseudo-color processing simultaneously in order to avoid loss of details and to improve object identifiability. It is a highly efficient global algorithm.
Detecting changes in multi-temporal remote sensing images is important in many imaging applications. Because of the influence of climate and illumination, the texture of ground objects in high-resolution remote sensing images is more stable than their gray levels. The texture features Local Binary Patterns (LBP) and Speeded Up Robust Features (SURF) stand out for their extraction speed and illumination invariance. A change detection method for matched remote sensing image pairs is presented: after the images are divided into blocks, the similarity of corresponding blocks is compared using LBP and SURF to label each block as changed or unchanged. Region growing is then applied to process the block edge zones. The experimental results show that the method tolerates some illumination change and slight texture change of the ground objects.
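The illumination invariance attributed to LBP follows directly from how the code is built, which a short sketch makes concrete. This is the basic 8-neighbour LBP on a nested-list grayscale image; the neighbour ordering is one common convention and is an assumption, as the abstract does not specify the variant used.

```python
def lbp_code(img, r, c):
    """8-bit Local Binary Pattern code of the interior pixel at (r, c).

    Each neighbour contributes a 1 bit if it is at least as bright as the centre.
    """
    center = img[r][c]
    # Clockwise from the top-left neighbour (an assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code
```

Since only the comparisons with the centre pixel matter, adding a constant brightness offset to the whole block leaves the code unchanged.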
The land surface temperature (LST) derived from thermal infrared satellite images is a meaningful variable in many remote sensing applications. However, the spatial resolution of current satellite thermal infrared sensors is too coarse to meet application needs. In this study, LST images were downscaled using a random forest model relating LST to multiple predictors in an arid region with an oasis-desert ecotone. The proposed downscaling approach was evaluated using LST derived from the MODIS LST product for Zhangye City in the Heihe Basin. The primary result shows that the distribution of downscaled LST matches that of the oasis and desert ecosystems. A sensitivity analysis showed that the predictors to which downscaled LST was most sensitive were the modified normalized difference water index (MNDWI)/normalized multi-band drought index (NMDI) for water; the soil adjusted vegetation index (SAVI)/shortwave infrared reflectance (SWIR)/normalized difference vegetation index (NDVI) for vegetation; the normalized difference building index (NDBI)/SAVI for buildings; and SWIR/NDBI/MNDWI/NDWI for desert. The corresponding maximum LST variations were 0.20/-0.22 K, 0.92/0.62/0.46 K, 0.28/-0.29 K, and 3.87/-1.53/-0.64/-0.25 K under predictor perturbations of ±0.02.
The processing of satellite imagery depends on the quality of the imagery. At low resolution, it is difficult to extract information as accurately as applications require. For vehicle detection in shadow regions, we use HOG for feature extraction and an SVM for classification; HOG proves to be a worthwhile tool for complex environments. Shadow images were examined and found to be very complex for detection, with very low detection rates observed, so our work is directed towards improving the detection rate in shadow regions through appropriate preprocessing. Vehicles in non-shadow regions are detected with a higher detection rate than those in shadow regions.
In the information age, digital processing of Li brocade patterns is an important means of presenting the Li ethnic style and passing on the national culture. In this paper, Adobe Illustrator CS3 and the Java language were used to apply "variation" processing to Li brocade patterns and to generate "Li brocade pattern mutant genes". The generation of pattern mutant genes includes color mutation, shape mutation, adding and missing transforms, twisted transforms, and so on. The research shows that Li brocade pattern mutant genes can be generated using Adobe Illustrator CS3 together with the image processing tools of the Java language.
This paper describes a novel image processing method based on irregular triangular meshes. The triangular mesh adapts to the image content, and least mean square linear approximation is proposed for the basic interpolation within each triangle. Triangular numbers are used to simplify the use of local (barycentric) coordinates for further analysis: each triangular element of the initial irregular mesh is represented by a set of four equilateral triangles. This allows fast and simple pixel indexing in local coordinates, e.g. using “for” or “while” loops to access the pixels. Moreover, the proposed representation allows the discrete cosine transform to be used in its simple “rectangular” symmetric form without additional pixel reordering (as is required for shape-adaptive DCT forms). Furthermore, this approach leads to a simple form of the wavelet transform on a triangular mesh. Results of applying the method are presented. The advantage of the proposed method is that it combines the flexibility of image-adaptive irregular meshes with simple pixel indexing in local triangular coordinates and the use of common forms of discrete transforms on triangular meshes. The method is proposed for image compression, pattern recognition, image quality improvement, and image search and indexing. It may also be used as part of video coding (intra-frame or inter-frame coding, motion detection).
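To illustrate why triangular numbers make loop-based pixel indexing simple, consider a triangle whose pixels are stored row by row, with row r holding r + 1 pixels. The sketch below shows one such layout; it is an assumption chosen for illustration, since the abstract does not give the paper's exact indexing scheme, and the function names are hypothetical.

```python
import math

def triangular(r):
    """r-th triangular number: the count of pixels in rows 0 .. r-1."""
    return r * (r + 1) // 2

def to_linear(row, col):
    """Linear index of the pixel at (row, col), where 0 <= col <= row."""
    return triangular(row) + col

def from_linear(k):
    """Inverse mapping: recover (row, col) from a linear index k >= 0."""
    row = (math.isqrt(8 * k + 1) - 1) // 2
    return row, k - triangular(row)

def iterate_triangle(rows):
    """Visit every pixel of a triangle with the given number of rows using plain loops."""
    for row in range(rows):
        for col in range(row + 1):
            yield row, col
```

With this layout a triangular patch can be held in a flat array of `triangular(rows)` samples and traversed with two nested loops.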
Due to improper preservation, traditional films often show frame loss after digitization. To deal with this problem, this paper presents a new adaptive patch-based frame interpolation method guided by motion paths. The method has three steps. First, we compute motion paths between two reference frames using optical flow estimation. Then, adaptive bidirectional interpolation with hole filling is applied to generate pre-intermediate frames. Finally, patch matching is used to interpolate the intermediate frames from the most similar patches. Since the patch matching is based on pre-intermediate frames that carry the motion path constraint, the interpolated frames look natural. We test different types of old film sequences and compare with other methods; the results show that our method performs well, without hole or ghost artifacts.
Running is becoming one of the most popular forms of exercise, and monitoring steps can help users better understand their running and improve exercise efficiency. In this paper, we design and implement a robust and unobtrusive position-independent algorithm for step detection in real environments. It applies a Butterworth filter to suppress high-frequency interference and then uses a projection-based coordinate transformation to handle the unknown position of the smartphone. Finally, a sliding window is used to suppress false peaks. The algorithm was tested with eight participants on the Android 7.0 platform. The experimental results show that the proposed algorithm achieves the desired effect regardless of device pose.
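The final stage, sliding-window suppression of false peaks, can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it uses the acceleration magnitude as a stand-in for the paper's projection step (the magnitude is also independent of device orientation), it omits the Butterworth filter, and the `window` and `threshold` parameters are assumed values to be tuned.

```python
import math

def magnitude(samples):
    """Orientation-independent magnitude of (x, y, z) accelerometer samples."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]

def detect_steps(signal, window, threshold):
    """Return indices of accepted peaks, at most one per `window` samples.

    A local maximum above `threshold` is a candidate; a candidate closer than
    `window` samples to the previous accepted peak replaces it only if higher.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        is_peak = (signal[i] > threshold
                   and signal[i] >= signal[i - 1]
                   and signal[i] > signal[i + 1])
        if not is_peak:
            continue
        if peaks and i - peaks[-1] < window:
            if signal[i] > signal[peaks[-1]]:
                peaks[-1] = i  # keep the stronger of the two nearby peaks
        else:
            peaks.append(i)
    return peaks
```

The number of returned indices is the step count for the analysed segment.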
The slot angle of a fiber fixed chip has a significant impact on the performance of photoelectric devices. To solve this practical engineering problem, this paper puts forward a detection method based on image processing. Because the images have very low contrast and are hard to segment, an image segmentation method based on edge characteristics is proposed. The slope k2 of the fixed chip edge line and the slope k1 of the fiber fixed slot line are then obtained and used to calculate the slot angle. Finally, the repeatability and accuracy of the system are tested, showing that the method is fast and robust and meets the practical requirements of fiber fixed chip slot angle detection.
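Given the two slopes k1 and k2, the angle between the lines follows from the standard identity tan θ = |(k1 − k2) / (1 + k1·k2)|. A minimal sketch, assuming this is the relation used (the abstract does not state the formula) and that neither line is vertical:

```python
import math

def angle_between_slopes(k1, k2):
    """Acute angle in degrees between two lines with finite slopes k1 and k2."""
    denom = 1 + k1 * k2
    if denom == 0:
        return 90.0  # perpendicular lines
    return math.degrees(math.atan(abs((k1 - k2) / denom)))
```

For example, a slot line with slope 1 measured against a horizontal chip edge (slope 0) gives an angle of 45 degrees.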
In modern industry, nondestructive testing of printed circuit boards (PCBs) can effectively prevent system failure and is becoming more and more important. To detect vias in a PCB automatically, accurately, and reliably from CT images, a novel via extraction algorithm is designed that is based on weighted stacking combined with the morphological characteristics of vias. The slice data along the vertical direction of the PCB are superimposed to enhance the via targets. The Otsu algorithm is used to segment the slice image; Otsu thresholding of gray-level images is efficient for separating an image into two classes when two fairly distinct classes exist in the image. The Randomized Hough Transform is used to locate the via regions in the segmented binary image. The 3D reconstruction of the vias from the sequence of slice images is then done by volume rendering. The proposed algorithm was shown to position and detect vias accurately in CT images of PCBs, and the method was found to be accurate and stable for detecting vias in three dimensions.
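The Otsu step picks the gray level that maximizes the between-class variance of the two resulting pixel classes. A minimal sketch of the standard algorithm on a flat list of integer gray levels is shown below; the function name and the 256-level default are assumptions for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class and pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray levels in the lower class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t
```

A binary slice image is then obtained with `[p > t for p in pixels]`, which is the input a Hough-type circle or region locator would work on.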
For the random hopping frequency signal, the modulated frequencies are randomly distributed over given bandwidth. The randomness of modulated frequency not only improves the electronic counter countermeasure capability for radar systems, but also determines its performance of range compression. In this paper, the range ambiguity function of RHF signal is firstly derived. Then, a design method of frequency hopping pattern based on stationary phase principle to improve the peak to side-lobe ratio is proposed. Finally, the simulated experiments show a good effectiveness of the presented design method.
Depth cameras currently play an important role in many areas. However, most of them can only obtain low-resolution (LR) depth images, whereas color cameras can easily provide high-resolution (HR) color images. Using a color image as a guide image is an efficient way to obtain an HR depth image. In this paper, we propose a depth image super-resolution (SR) algorithm that uses an HR color image as the guide image and an LR depth image as input. We fuse a guided filter with an edge-based joint bilateral filter to obtain the HR depth image. Our experimental results on the Middlebury 2005 datasets show that our method provides better-quality HR depth images both numerically and visually.
Efficient scene management of virtual environments is an important research topic in real-time computer visualization and has a decisive influence on rendering efficiency. However, traditional scene management methods are not suitable for complex virtual battlefield environments. This paper combines the advantages of traditional scene graph technology and spatial data structures. Following the idea of separating management from rendering, a loose object-oriented scene graph structure is established to manage the entity model data in the scene, and a performance-based quad-tree structure is created for traversal and rendering. In addition, a collaborative update relationship between the two tree structures is designed to achieve efficient scene management. Compared with previous scene management methods, this method is more efficient and meets the needs of real-time visualization.
Because traditional tactical information display technology suffers from a large volume of data to be transferred and low plotting efficiency in an interactive virtual cockpit, a GID protocol-based simulation has been designed. The method decomposes complex tactical information screens into basic plotting units, whose display is controlled via plotting commands. This resolves the incompatibility between the tactical information display in traditional simulation and the desktop-based virtual simulation training system. Having been used in desktop systems for helicopters, fighters, and transporters, the method has proved to be sound in design and simple and efficient in use, and it is of significant value for building aviation equipment technical support training products.
In this paper, a sparse feature matching method based on a modified RANSAC algorithm is proposed to improve precision and speed. First, the feature points of the images are extracted using the SIFT algorithm. Then, the image pair is roughly matched using the generated SIFT feature descriptors. Finally, the precision of the image matching is optimized by the modified RANSAC algorithm. The RANSAC algorithm is improved in three respects: instead of the homography matrix, the fundamental matrix generated by the 8-point algorithm is used as the model; the sample is selected by a random block selection method, which ensures uniform distribution and accuracy; and a sequential probability ratio test (SPRT) is added to standard RANSAC, which cuts down the overall running time of the algorithm. The experimental results show that this method not only achieves higher matching accuracy but also greatly reduces the computation and improves the matching speed.
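The abstract does not spell out the random block selection step, so the following is only a sketch of one plausible reading: the image is split into a grid, and each of the 8 correspondences needed by the 8-point algorithm is drawn from a different occupied block so that the sample is spread over the image. The function name `sample_from_blocks`, the grid size, and the error handling are assumptions made for illustration, not details taken from the paper.

```python
import random


def sample_from_blocks(points, width, height, grid=4, k=8, rng=None):
    """Return indices of k points, each taken from a different grid block.

    points: list of (x, y) keypoint positions in one image.
    grid:   hypothetical number of blocks per image side.
    """
    rng = rng or random.Random()
    blocks = {}
    for idx, (x, y) in enumerate(points):
        col = min(int(x * grid / width), grid - 1)
        row = min(int(y * grid / height), grid - 1)
        blocks.setdefault((row, col), []).append(idx)
    if len(blocks) < k:
        raise ValueError("not enough occupied blocks for a spread-out sample")
    # Choose k distinct blocks first, then one random point inside each block.
    chosen_blocks = rng.sample(sorted(blocks), k)
    return [rng.choice(blocks[b]) for b in chosen_blocks]
```

Each RANSAC iteration would call this once and fit the fundamental matrix to the returned correspondences; drawing from distinct blocks avoids samples that cluster in one corner of the image.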
Super-pixel extraction techniques group pixels into over-segmented image blocks according to the similarity among pixels. Compared with traditional pixel-based methods, super-pixel-based image description requires less computation and is easier to perceive, and it has been widely used in image processing and computer vision applications. The pulse coupled neural network (PCNN) is a biologically inspired model that stems from the phenomenon of synchronous pulse release in the visual cortex of cats. Each PCNN neuron can correspond to a pixel of an input image, and the dynamic firing pattern of each neuron contains both the pixel feature information and its contextual spatial structure information. In this paper, a new color super-pixel extraction algorithm based on a multi-channel pulse coupled neural network (MPCNN) is proposed. The algorithm adopts the block-dividing idea of the SLIC algorithm: the image is first divided into blocks of the same size. Then, within each image block, the pixels adjacent to each seed and of similar color are grouped together as a super-pixel. Finally, post-processing is applied to the pixels or pixel blocks that have not been grouped. Experiments show that the proposed method can adjust the number of super-pixels and the segmentation precision by setting parameters, and has good potential for super-pixel extraction.
Lung cancer is the leading cause of cancer-related deaths among men. In this paper, we propose a pulmonary nodule detection method for early screening of lung cancer based on an improved AlexNet model. First, to maintain the same image quality as the existing B/S architecture PACS system, we parse the DICOM file and convert the original CT image into a JPEG image. Second, in view of the large size and complex background of chest CT images, we design the convolutional neural network on the basis of the AlexNet model and a sparse convolution structure. Finally, we train our models with DIGITS, a software tool provided by NVIDIA. The main contribution of this paper is to apply the convolutional neural network to the early screening of lung cancer and to improve the screening accuracy by combining the AlexNet model with the sparse convolution structure. We carry out a series of experiments on chest CT images using the proposed method; the sensitivity and specificity indicate that the method can effectively improve the accuracy of early screening of lung cancer and has clinical significance.
Recently, several models based on CNN architectures have achieved great results on the Single Image Super-Resolution (SISR) problem. In this paper, we propose an image super-resolution (SR) method using a light inception layer in a convolutional network (LICN). Because our inception layer has strong representation ability and can learn richer representations with fewer parameters, we can build our model with a shallow architecture that reduces the effect of the vanishing gradient problem and saves computational cost. Our model strikes a balance between computational speed and the quality of the result. Compared with state-of-the-art methods, we produce comparable or better results with faster computation.
A recent study improves descriptor performance by accumulating stability votes over all scale pairs to compose the local descriptor. We argue that the stability of a bin depends more on the differences across adjacent scale pairs than on the differences across all scale pairs, and we compose a new local descriptor based on this hypothesis. First, a series of SIFT descriptors is extracted at multiple scales. Then the difference in each bin's value across adjacent scales is calculated, and the stability value of the bin is computed from it and accumulated to compose the final descriptor. The performance of the proposed method is evaluated on two popular matching datasets and compared with other state-of-the-art methods. Experimental results show that the proposed method performs satisfactorily.
Hash coding is a widely used technique in approximate nearest neighbor (ANN) search, especially in document search and multimedia (such as image and video) retrieval. Based on the distance measure used, hash methods are generally classified into two categories: Hamming hashing and Manhattan hashing. Benefiting from better preservation of the neighborhood structure, Manhattan hashing methods outperform earlier methods in search effectiveness. However, because it uses decimal arithmetic operations instead of bit operations, Manhattan hashing is more time-consuming, which significantly decreases the overall search efficiency. To solve this problem, we present an intuitive hash scheme that uses a Flat Binary Code (FBC) to encode the data points. As a result, the decimal arithmetic used in previous Manhattan hashing can be replaced by the more efficient XOR operator. The experiments show that, with a reasonable growth in memory use, our FBC achieves an average speedup of more than 80% without any loss of search accuracy compared with state-of-the-art Manhattan hashing methods.
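The abstract does not define the Flat Binary Code, so the sketch below uses a unary (thermometer) code as a stand-in to illustrate the general idea: when each quantized coordinate is stored with its lowest bits set, the Manhattan distance between two points equals the number of set bits in the XOR of their codes, at the cost of longer codes. The names `unary_code`, `encode`, and `xor_distance` are hypothetical, and the actual FBC may differ.

```python
def unary_code(value, levels):
    """Thermometer code: the lowest `value` bits are set (levels - 1 bits in total)."""
    if not 0 <= value < levels:
        raise ValueError("value must lie in [0, levels)")
    return (1 << value) - 1


def encode(point, levels):
    """Concatenate the unary codes of all quantized coordinates into one integer."""
    width = levels - 1
    code = 0
    for v in point:
        code = (code << width) | unary_code(v, levels)
    return code


def xor_distance(code_a, code_b):
    """Hamming distance of two codes, computed with XOR and a bit count."""
    return bin(code_a ^ code_b).count("1")


def manhattan(p, q):
    """Reference distance computed with ordinary integer arithmetic."""
    return sum(abs(a - b) for a, b in zip(p, q))
```

For example, with 4 quantization levels the points (3, 0, 2) and (1, 2, 2) are at Manhattan distance 4, and `xor_distance(encode((3, 0, 2), 4), encode((1, 2, 2), 4))` gives the same value using only bit operations.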
In this paper, we propose a robust digital image watermarking method based on feature extraction and Distortion-Compensated Dither Modulation (DC-DM). Our proposed local watermarking method provides stronger robustness and better flexibility than traditional global watermarking methods. We improve robustness by introducing feature extraction and the DC-DM method. To extract robust feature points, we propose a DAISY-based Robust Feature Extraction (DRFE) method that employs the DAISY descriptor and applies entropy-based filtering. The experimental results show that, compared with existing methods, the proposed method achieves satisfactory robustness while preserving watermark imperceptibility.
To increase the accuracy of cloud detection in remote sensing satellite imagery, we propose an efficient cloud detection method for panchromatic satellite images. The method includes three main steps. First, an adaptive intensity threshold combined with a median filter is used to extract coarse cloud regions. Second, guided filtering is applied to strengthen the differences in textural features, and texture detection is then performed on the resulting texture detail image via the gray-level co-occurrence matrix. Finally, the candidate cloud regions are obtained as the intersection of the two coarse cloud regions above, and an adaptive morphological dilation is applied to refine them for thin clouds at the boundaries. The experimental results demonstrate the effectiveness of the proposed method.
A typical texture retrieval system performs feature comparison and might not be able to make human-like judgments of image similarity. Meanwhile, it is well known that perceptual texture similarity is difficult to describe with traditional image features. In this paper, we propose a new texture retrieval scheme based on perceptual texture similarity. The key to the proposed scheme is that perceptual similarity is predicted by learning a non-linear mapping from the image feature space to the perceptual texture space using Random Forest. We test the method on a natural texture dataset and apply it to a new wallpaper dataset. Experimental results demonstrate that the proposed texture retrieval scheme with perceptual similarity improves retrieval performance over traditional image features.
In an iris recognition system, the clarity of the iris image is an important factor that influences recognition performance. During recognition, a blurred image may be rejected by the automatic iris recognition system, which leads to identification failure. It is therefore necessary to evaluate the definition (sharpness) of the iris image before recognition. Taking the existing methods for evaluating iris image definition into account, we propose a fast algorithm to evaluate the definition of iris images. In our algorithm, the ROI (Region of Interest) is first extracted based on a reference point determined from the features of the light spots within the pupil; then the Tenengrad operator is used to evaluate the definition of the iris image. Experimental results show that the proposed algorithm can accurately distinguish iris images of different clarity and has the merits of low computational complexity and greater effectiveness.
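As a rough illustration of the Tenengrad measure mentioned above, the sketch below averages the squared Sobel gradient magnitude over the interior pixels of a grayscale region given as a list of rows. It is a generic textbook form, not the paper's implementation: the ROI extraction is not shown, the optional gradient threshold used in some Tenengrad variants is omitted, and the function name `tenengrad` is an assumption.

```python
def tenengrad(img):
    """Mean squared Sobel gradient magnitude of a 2D grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    if h < 3 or w < 3:
        raise ValueError("image must be at least 3x3")
    total = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal and vertical Sobel responses at pixel (y, x).
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            total += gx * gx + gy * gy
    return total / ((h - 2) * (w - 2))
```

A sharply focused region has strong local gradients and therefore a high score, while a blurred version of the same region scores lower, which is what allows a simple threshold on the score to reject out-of-focus captures.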
Image quality assessment is needed in many image processing areas, and blur is one of the key causes of image deterioration. Although strong full-reference image quality assessment metrics have been proposed in the past few years, no-reference assessment is still an area of active research. To address this problem, this paper proposes a no-reference sharpness assessment method based on the wavelet transform that focuses on the edge areas of the image. Based on two simple characteristics of the human visual system, weights are introduced to calculate the weighted log-energy of each wavelet sub-band. The final score is given by the ratio of high-frequency energy to total energy. The algorithm is tested on multiple databases. Compared with several state-of-the-art metrics, the proposed algorithm has better performance and lower runtime.
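The energy-ratio idea can be sketched with a single-level orthonormal Haar transform. The example below is a deliberate simplification: it uses one decomposition level, plain (unweighted, non-logarithmic) energies, and no edge-area selection, all of which differ from the weighted log-energy scheme the abstract describes. The function name `haar_sharpness` is hypothetical.

```python
def haar_sharpness(img):
    """Share of one-level Haar detail energy (LH + HL + HH) in the total energy."""
    rows = len(img) // 2 * 2  # ignore a trailing odd row or column
    cols = len(img[0]) // 2 * 2
    low = high = 0.0
    for y in range(0, rows, 2):
        for x in range(0, cols, 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            ll = (a + b + c + d) / 2.0  # approximation coefficient
            lh = (a - b + c - d) / 2.0  # detail along the horizontal direction
            hl = (a + b - c - d) / 2.0  # detail along the vertical direction
            hh = (a - b - c + d) / 2.0  # diagonal detail
            low += ll * ll
            high += lh * lh + hl * hl + hh * hh
    total = low + high
    return high / total if total > 0 else 0.0
```

A flat image yields 0 because all energy stays in the approximation band, while fine alternating patterns push energy into the detail bands and raise the score.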
Metal corrosion can cause many problems, so quickly and effectively assessing the grade of metal corrosion and remediating it in time is a very important issue. Typically, this is done by trained surveyors at great cost. Assisting them in the inspection process with computer vision and artificial intelligence would decrease the inspection cost. In this paper, we propose a dataset of metal surface corrosion for computer vision detection and, on this dataset, compare standard computer vision techniques implemented with OpenCV against a deep learning method for automatic estimation of the metal surface corrosion grade from a single image. The test is performed by classifying images and calculating the accuracy of the two approaches.
We develop a no-reference image quality assessment metric to evaluate the quality of synthesized views rendered from the Multi-view Video plus Depth (MVD) format. Our metric, named Synthesized View Comparison (SVC), is designed for real-time quality monitoring at the receiver side of a 3D-TV system. The metric uses the intermediate virtual views that are warped from the left and right views by the depth-image-based rendering (DIBR) algorithm, and measures the difference between the virtual views rendered from the different cameras with Structural SIMilarity (SSIM), a popular 2D full-reference image quality assessment metric. The experimental results indicate that our no-reference quality assessment metric for synthesized images has competitive prediction performance compared with some classic full-reference image quality assessment metrics.
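The comparison step can be illustrated with the SSIM formula itself. The sketch below evaluates SSIM once over the whole image (a single global window) for two views given as flat lists of pixel values; the standard metric instead averages the same expression over local windows, and the DIBR warping that produces the two virtual views is not shown. The function name `global_ssim` and its arguments are assumptions for illustration.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """SSIM computed over a single window covering both flattened images."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Stabilizing constants with the usual K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mean_x * mean_y + c1) * (2 * cov + c2)) / (
        (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2))
```

In a setting like the one described, `x` would be the middle view warped from the left camera and `y` the one warped from the right camera; a score near 1 means the two renderings agree, and lower scores point to synthesis distortions.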
With the development of wireless networks and intelligent terminals, video traffic has increased dramatically. Adaptive video streaming has become one of the most promising video transmission technologies. For this type of service, good QoS (Quality of Service) in the wireless network does not always guarantee that all customers have a good experience. Thus, new quality metrics have been widely studied recently. Taking this into account, the objective of this paper is to investigate quality metrics for wireless adaptive video streaming. A wireless video streaming simulation platform with a DASH mechanism and a multi-rate video generator is established. Based on this platform, a PSNR model, an SSIM model, and a Quality Level model are implemented. The Quality Level model considers QoE (Quality of Experience) factors such as image quality, stalling, and switching frequency, while the PSNR and SSIM models mainly consider the quality of the video. To evaluate the performance of these QoE models, three performance metrics (SROCC, PLCC, and RMSE), which compare subjective and predicted MOS (Mean Opinion Score), are calculated. From these performance metrics, the monotonicity, linearity, and accuracy of the quality metrics can be observed.
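The three evaluation measures can be computed directly from paired lists of subjective and predicted MOS values. The following is a minimal sketch using their standard definitions (Pearson correlation for PLCC, Pearson correlation of average ranks for SROCC); the function names are hypothetical, and the nonlinear regression step that is often applied before PLCC and RMSE in quality assessment studies is left out.

```python
import math


def plcc(x, y):
    """Pearson linear correlation coefficient (undefined for constant input)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return plcc(ranks(x), ranks(y))


def rmse(x, y):
    """Root mean squared error between subjective and predicted scores."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))
```

SROCC reflects monotonicity (it equals 1 whenever the predictions preserve the subjective ordering), PLCC reflects linearity, and RMSE reflects absolute accuracy, which matches the three properties named in the abstract.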
The guided image filter has been widely used in image processing. Because the non-local model is an excellent method for accumulating global information, the non-local image guided filter was proposed and has shown good performance in many image processing tasks by exploiting the non-local similarity of the guidance image. In this paper, we introduce a shadowed non-local image guided filter derived from the concept of shadowed sets. The shadowed non-local model uses more reliable non-local information by suppressing the low similarity values of the guidance image to zero and boosting the high similarity values to the maximum of the non-local similarity set. The thresholds for suppression and boosting are determined automatically based on the concept of shadowed sets. Experimental results on several image processing tasks, including image denoising, depth super-resolution, and image dehazing, demonstrate the superiority of the shadowed-set-based approach.
Visual tracking is a challenging problem, especially when using a single model. In this paper, we propose a discriminative correlation filter (DCF) based tracking approach, named LSTDCF, that exploits both the long-term and short-term information of the target to improve tracking performance. In addition to a long-term filter learned over the whole sequence, a short-term filter is trained using only features extracted from the most recent frames. The long-term filter tends to capture more semantics of the target as more frames are used for training. However, since the target may undergo large appearance changes, features extracted around the target in non-recent frames prevent the long-term filter from locating the target accurately in the current frame. In contrast, the short-term filter learns more spatial details of the target from recent frames but overfits easily. The short-term filter is therefore less robust to cluttered backgrounds and prone to drift. We take advantage of both filters and fuse their response maps to make the final estimation. We evaluate our approach on a widely used benchmark with 100 image sequences and achieve state-of-the-art results.
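The abstract does not state how the two response maps are fused, so the sketch below assumes the simplest option, a fixed linear weighting followed by a peak search. The function name `fuse_and_locate` and the `weight` parameter are hypothetical; the actual LSTDCF fusion rule may be different (for example, adaptive per frame).

```python
def fuse_and_locate(long_map, short_map, weight=0.5):
    """Blend two equally sized response maps and return (peak position, peak value).

    weight is the share given to the long-term response; (1 - weight) goes
    to the short-term response. Position is (row, col) in the map.
    """
    best_value = float("-inf")
    best_pos = None
    for r, (long_row, short_row) in enumerate(zip(long_map, short_map)):
        for c, (lv, sv) in enumerate(zip(long_row, short_row)):
            fused = weight * lv + (1.0 - weight) * sv
            if fused > best_value:
                best_value, best_pos = fused, (r, c)
    return best_pos, best_value
```

With a weight close to 1 the estimate follows the more stable long-term filter, and with a weight close to 0 it follows the more detailed but drift-prone short-term filter.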
The Direct Position Determination (DPD) algorithm has been demonstrated to achieve better accuracy when the signal waveforms are known. However, the signal waveform is rarely known completely in practical positioning. To solve this problem, we propose a DPD method for digitally modulated signals based on an improved particle swarm optimization algorithm. First, a DPD model is established for signals of known modulation, and a cost function based on symbol estimation is obtained. Second, as optimizing the cost function is a nonlinear integer optimization problem, an improved Particle Swarm Optimization (PSO) algorithm is used to search for the optimal symbols. Simulations show the higher positioning accuracy of the proposed DPD method and the convergence of the fitness function under different inertia weights and population sizes. On the one hand, the proposed algorithm takes full advantage of the signal features to improve positioning accuracy. On the other hand, the improved PSO algorithm improves the efficiency of the symbol search by nearly one hundred times while reaching a global optimal solution.
Quadrature and multi-channel amplitude-phase errors have to be compensated in I/Q quadrature sampling and when the signal passes through multiple channels. This paper presents a new method that needs neither a filter nor a standard reference signal and that jointly estimates the quadrature and multi-channel amplitude-phase errors. The method uses the cross-correlation and amplitude ratio between the signals to estimate the two amplitude-phase errors simply and effectively. The advantages of the method are verified by computer simulation, and its superiority is also verified with measured data from field experiments.
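To show how an amplitude ratio and a cross-correlation can reveal I/Q errors, the sketch below uses a common textbook model rather than the paper's own estimator: I = A cos(θ) and Q = g A sin(θ + φ), where g is the gain imbalance and φ the quadrature phase error. Under this model the power ratio gives g and the normalized cross-correlation gives sin(φ). The model, the function name `estimate_iq_imbalance`, and the assumption of zero-mean samples covering whole signal periods are all illustrative assumptions; the multi-channel part of the method is not covered.

```python
import math


def estimate_iq_imbalance(i_samples, q_samples):
    """Estimate (gain imbalance g, phase error phi in radians) from I/Q samples."""
    n = len(i_samples)
    power_i = sum(v * v for v in i_samples) / n
    power_q = sum(v * v for v in q_samples) / n
    cross = sum(a * b for a, b in zip(i_samples, q_samples)) / n
    gain = math.sqrt(power_q / power_i)          # amplitude ratio Q / I
    sin_phi = cross / math.sqrt(power_i * power_q)  # normalized cross-correlation
    phase = math.asin(max(-1.0, min(1.0, sin_phi)))
    return gain, phase
```

For an ideal quadrature pair the cross-correlation is zero and the two powers are equal, so the estimator returns a gain of 1 and a phase error of 0; any departure from those values is the error to be compensated.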
A KST (Kolmogorov–Smirnov test and T statistic) method is used to construct a correlation network based on the fluctuation of each time series within multivariate time signals. In this method, each time series is divided equally into multiple segments, and the maximal data fluctuation in each segment is calculated by a KST change detection procedure. Connections between the time series are derived from the data fluctuation matrix and are used to construct the fluctuation correlation network (FCN). The method was tested with synthetic simulations, and the results were compared with those obtained using the KS test or the T statistic alone to detect data fluctuation. The novelty of this study is that the correlation analysis is based on the data fluctuation in each segment of each time series rather than on the original time signals, which would be more meaningful for many real-world applications and for the analysis of large-scale time signals where prior knowledge is uncertain.
An omnidirectional mobile platform is designed for building point clouds, based on an improved filtering algorithm that is used to process the depth image. First, the mobile platform can move flexibly, and its control interface is convenient to use. Then, because the traditional bilateral filtering algorithm is time-consuming and inefficient, a novel method called local bilateral filtering (LBF) is proposed. LBF is applied to the depth images obtained by the Kinect sensor. The results show that noise removal is improved compared with bilateral filtering. The color images and the processed depth images are then used offline to build point clouds. Finally, experimental results demonstrate that our method improves both the processing speed for depth images and the quality of the resulting point cloud.
The operation of a disconnector in a gas insulated substation (GIS) may produce very fast transient overvoltage (VFTO), which is characterized by short rise time, short duration, high amplitude, and rich frequency content. VFTO can damage the GIS and secondary equipment, and the frequency components contained in the VFTO can cause resonance overvoltage inside the transformer, so it is necessary to study the spectral characteristics of VFTO. From the perspective of signal processing, VFTO is a non-stationary signal whose frequency content changes with time, which the traditional Fourier transform has difficulty describing; time-frequency analysis is therefore needed to analyze its spectral characteristics. In this paper, we analyze the performance of the short-time Fourier transform (STFT), the Wigner-Ville distribution (WVD), the pseudo Wigner-Ville distribution (PWVD), and the smoothed pseudo Wigner-Ville distribution (SPWVD). The results show that the SPWVD performs best: its time-frequency concentration is higher than that of the STFT, and it is free of cross-interference terms, so it meets the requirements of VFTO spectrum analysis.
Astronomical time series analysis is one of the most active and important research problems, and it has become a suitable way to study the underlying dynamical behavior of the nonlinear systems considered. The quasi-periodic analysis of solar magnetic activity has been carried out by various authors during the past fifty years. In this work, the novel Hilbert-Huang transform approach is applied to investigate the yearly numbers of polar faculae in the time interval from 1705 to 1999. The detected periodicities can be allocated to three components: short-term variations with periods shorter than 11 years, mid-term variations with classical periods from 11 to 50 years, and long-term variations with periods longer than 50 years. The results improve our knowledge of the quasi-periodic variations of solar magnetic activity and could provide valuable constraints for solar dynamo theory. Furthermore, they could be useful for understanding the long-term variations of solar magnetic activity, providing crucial information for describing and forecasting solar magnetic activity indicators.
Combined with digital beamforming (DBF) processing, azimuth multichannel synthetic aperture radar (SAR) systems are promising for high-resolution, wide-swath imaging, but conventional processing methods do not take the nonuniformity of the scattering coefficient into consideration. This paper presents a robust adaptive multichannel SAR processing method. It first uses the Capon spatial spectrum estimator to obtain the spatial spectrum distribution over all ambiguous directions, and then reconstructs the interference-plus-noise covariance matrix from its definition to obtain the multichannel SAR processing filter. The method improves processing performance under a nonuniform scattering coefficient and is robust against array errors. Experiments with real measured data demonstrate the effectiveness and robustness of the proposed method.
Neurodegenerative diseases (NDs) usually cause gait and postural disorders, which provide an important basis for ND diagnosis. Medical specialists reach a diagnosis by observing and analyzing these clinical manifestations, which is inefficient and easily affected by the doctor's subjectivity. In this paper, we propose a two-layer Long Short-Term Memory (LSTM) model to learn the gait patterns exhibited in three NDs. The model was trained and tested using temporal data recorded by force-sensitive resistors, including time series such as stride interval and swing interval. Our proposed method outperforms other methods in the literature in terms of the accuracy of the predicted diagnosis. Our approach aims to provide a quantitative assessment to inform the clinical diagnosis and treatment of these neurodegenerative diseases.
Target feature extraction plays an important role in pattern recognition. It is also the most complicated activity in the brain mechanisms of biological vision. Inspired by the strong ability of the primary visual cortex (V1) to extract dynamic and static features, a visual perception model is proposed. First, 28 spatio-temporal filters with different orientations, a half-squaring operation, and divisive normalization are used to obtain the responses of V1 simple cells; then, an adjustable parameter is added to the output weights to obtain the responses of complex cells. Experimental results indicate that the proposed V1 model perceives motion information well and also has good edge detection capability. The V1-inspired model performs well in feature extraction and effectively combines brain-inspired intelligence with computer vision.
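The half-squaring and divisive normalization stages mentioned above can be sketched as follows. This is a minimal illustration that assumes the linear filter outputs are already available; the gain `k` and the semi-saturation constant `sigma` are hypothetical parameters, not values from the paper.

```python
import numpy as np

def simple_cell_responses(linear, k=1.0, sigma=0.1):
    """Half-square the linear filter outputs, then divide by the pooled activity.

    linear: array of shape (n_filters, n_positions) with signed filter outputs.
    """
    half_squared = np.maximum(linear, 0.0) ** 2          # half-wave rectify, then square
    pool = half_squared.sum(axis=0, keepdims=True)       # activity pooled over all filters
    return k * half_squared / (pool + sigma ** 2)        # divisive normalization

# Three filters at one position: the negative output is rectified to zero.
responses = simple_cell_responses(np.array([[1.0], [-2.0], [3.0]]))
```

Because each response is divided by the pooled activity of all filters, the outputs stay bounded by `k` regardless of input contrast.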
The blood echo signal obtained by medical ultrasound Doppler devices always includes a vascular wall pulsation signal. The traditional way to remove the wall signal is a high-pass filter, which also removes the low-frequency part of the blood flow signal. Some researchers have proposed a method based on region-selective reduction, which first estimates the wall pulsation signal and then removes it from the mixed signal. This method appears to use the correlation between wavelet coefficients to distinguish the blood signal from the wall signal, but it is in fact a form of wavelet threshold de-noising, and its results are far from ideal. To obtain better results, this paper proposes an improved method based on wavelet coefficient correlation to separate the blood signal from the wall signal, and verifies its validity by computer simulation.
Computer Information Engineering and Weather Forecasting
Convective storm nowcasting refers to the prediction of the initiation, development, and decay of convective weather over a very short term (typically 0 to 2 h). Despite marked progress in recent years, nowcasting severe convective storms remains a challenge. Machine learning, and the convolutional neural network (CNN) in particular, has been applied successfully in many fields. In this paper, we build a severe convective weather nowcasting system based on a CNN and a hidden Markov model (HMM) using reanalysis meteorological data. The goal is to predict whether there will be a convective storm in 30 min. We use the CNN to compress the VDRAS reanalysis data into low-dimensional data that serve as the observation vectors of the HMM, and then obtain the development trend of strong convective weather as a time series. The results show that our method extracts robust features without any manual feature selection and captures the development trend of strong convective storms.
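To illustrate how low-dimensional feature vectors can act as HMM observations, the sketch below runs a normalized forward pass over a short sequence. It is not the paper's system: the two hidden states, the isotropic Gaussian emission model, and all numbers are assumptions, and the feature vectors stand in for CNN outputs.

```python
import numpy as np

def gaussian_likelihood(x, means, var):
    """Isotropic Gaussian density of observation x under each state's mean."""
    d = x.shape[0]
    sq = ((x - means) ** 2).sum(axis=1)
    return np.exp(-0.5 * sq / var) / (2 * np.pi * var) ** (d / 2)

def forward_filter(obs, trans, init, means, var=1.0):
    """Return P(state_t | obs_1..t) for every time step (normalized forward pass)."""
    belief = init * gaussian_likelihood(obs[0], means, var)
    belief = belief / belief.sum()
    out = [belief]
    for x in obs[1:]:
        # Predict with the transition matrix, then weight by the emission likelihood.
        belief = (trans.T @ belief) * gaussian_likelihood(x, means, var)
        belief = belief / belief.sum()
        out.append(belief)
    return np.array(out)

# Hypothetical setup: state 0 = "no storm", state 1 = "storm"; 2-D feature vectors.
means = np.array([[0.0, 0.0], [3.0, 3.0]])
trans = np.array([[0.9, 0.1], [0.1, 0.9]])      # trans[i, j] = P(next = j | current = i)
init = np.array([0.5, 0.5])
obs = np.array([[0.1, -0.2], [0.3, 0.1], [2.8, 3.1], [3.2, 2.9]])
posterior = forward_filter(obs, trans, init, means)
```

Each row of `posterior` is the filtered state distribution at one time step, which gives the kind of time-series trend the abstract describes.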
Nowcasting, or very short-term forecasting, of convective storms is still a challenging problem because convective weather is highly nonlinear and insufficiently observed. Because the physical mechanism of convective weather is also not fully understood, numerical weather models cannot predict convective storms well. Machine learning approaches offer a potential way to nowcast convective storms from various meteorological data. In this study, a deep belief network (DBN) is proposed to nowcast convective storms using real-time reanalysis meteorological data. The nowcasting problem is formulated as a classification problem. The 3D meteorological variables are fed directly to the DBN, whose input layer has dimension 6 × 6 × 80. The DBN uses three hidden layers, and the output layer has dimension two. A box-moving method is presented to provide input features that contain temporal and spatial information. The results show that the DBN can generate reasonable predictions of the movement and growth of convective storms.
GL Studio cannot display Chinese characters when used to develop an airborne multifunction display (MFD). This paper proposes a method that builds a Chinese character font with GB2312 encoding and establishes the font table and the Chinese character display unit in GL Studio. The method abstracts a storage and display data model for Chinese characters, parses the GB encoding of the Chinese characters received by the MFD, finds the coordinates of those characters in the font table, and establishes a dynamic control model and a dynamic display model based on the Chinese character display unit. In the GL Studio and VC++ .NET environment, this model has been successfully applied to develop airborne MFDs in a variety of mission simulators. The method solves the problem that GL Studio cannot be used to develop MFD software for Chinese domestic aircraft, and it can also be used with other professional airborne MFD development tools such as IDATA. Experiments show that it is a fast, effective, scalable, and reconfigurable method for developing both actual equipment and simulators.
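The step of parsing a GB2312 code and locating the character in a font table can be sketched as below. The abstract does not describe the layout of the paper's font table, so this assumes the conventional 94 × 94 GB2312 grid stored row by row; the function names and the 32-byte glyph size (a 16 × 16 bitmap) are hypothetical.

```python
def gb2312_cell(ch):
    """Return the 1-based (row, column) of a character in the 94 x 94 GB2312 grid."""
    hi, lo = ch.encode("gb2312")      # two bytes per Chinese character, each offset by 0xA0
    return hi - 0xA0, lo - 0xA0

def font_table_offset(ch, glyph_bytes=32):
    """Byte offset of the glyph in a table stored row by row (assumed layout)."""
    row, col = gb2312_cell(ch)
    return ((row - 1) * 94 + (col - 1)) * glyph_bytes

cell = gb2312_cell("啊")              # encoded as bytes 0xB0 0xA1
offset = font_table_offset("啊")
```

A texture-based font table would map the same (row, column) pair to texture coordinates instead of a byte offset.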
In this paper, we propose a visual interactive analysis approach for tropical cyclone trajectory prediction based on support vector machine (SVM) regression. We design a visual analysis interface that supports training data selection, model parameter adjustment, and visual assessment of model quality. This approach facilitates the prediction process and enables users to predict tropical cyclone trajectories easily. A case study with real data demonstrates the effectiveness of our approach.
Detecting clusters helps in understanding the organization and functions of networks. Clusters, or communities, are usually groups of nodes that are densely interconnected but sparsely linked to other clusters. To identify communities, an efficient and effective agglomerative community detection algorithm based on node similarity is proposed. The method first calculates the similarity between each pair of nodes and forms pre-partitions according to the principle that each node belongs to the same community as its most similar neighbor. It then checks whether each partition satisfies the community criterion. Pre-partitions that do not satisfy the criterion are merged with the partition that attracts them most strongly, until no further changes occur. To measure the attraction of a partition, we propose an attraction index based on the importance of the linked nodes in the network. The proposed method can therefore better exploit the properties of the nodes and the structure of the network. The algorithm is tested on both synthetic and empirical networks of different scales. Simulation results show that it obtains superior clustering results compared with six other widely used community detection algorithms.
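The pre-partition step described above (each node joins the community of its most similar neighbor) can be sketched in a few lines. The abstract does not specify the similarity measure, so this example assumes Jaccard similarity of closed neighborhoods; the later merging step and the attraction index are not shown.

```python
def jaccard(adj, u, v):
    """Jaccard similarity of the closed neighborhoods of u and v (assumed measure)."""
    nu, nv = adj[u] | {u}, adj[v] | {v}
    return len(nu & nv) / len(nu | nv)

def pre_partition(adj):
    """Group nodes so that each node shares a group with its most similar neighbor."""
    parent = {u: u for u in adj}

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for u in adj:
        if not adj[u]:
            continue                                   # isolated node stays alone
        best = max(sorted(adj[u]), key=lambda v: jaccard(adj, u, v))
        parent[find(u)] = find(best)

    groups = {}
    for u in adj:
        groups.setdefault(find(u), set()).add(u)
    return sorted(sorted(g) for g in groups.values())

# Two triangles joined by the single edge 2-3.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
partitions = pre_partition(graph)
```

On this small graph the two triangles come out as separate pre-partitions, because the bridge endpoints are each more similar to their own triangle than to each other.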
Smart automation has become very important in industry because it can improve the reliability and efficiency of systems. The use of smart technologies in agriculture has increased over the years to ensure and control crop production and to address food security. However, it is important to use proper irrigation systems to avoid water wastage and overfeeding of the plant. In this paper, a Smart Rule-based Automated Fertilization and Irrigation System is proposed and evaluated. We propose a rule-based decision-making algorithm to monitor and control the food supply to the plant and the soil quality. A built-in alert system is also used to update the farmer by text message. The system is developed and evaluated on real hardware.
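A rule-based controller of the kind described above can be as simple as a few threshold checks. The sketch below is purely illustrative: the sensor quantities, the threshold values, and the action names are hypothetical, since the abstract does not list the system's actual rules.

```python
def decide_actions(soil_moisture, nutrient_level,
                   moisture_low=30.0, moisture_high=70.0, nutrient_low=40.0):
    """Map one pair of sensor readings to a list of actions (all thresholds hypothetical)."""
    actions = []
    if soil_moisture < moisture_low:
        actions.append("irrigate")
    elif soil_moisture > moisture_high:
        # Too wet: do not add water, and notify the farmer instead.
        actions.append("alert: soil too wet, irrigation paused")
    if nutrient_level < nutrient_low and soil_moisture <= moisture_high:
        actions.append("fertilize")
    return actions

dry_and_fed = decide_actions(20.0, 50.0)
wet_and_hungry = decide_actions(80.0, 30.0)
```

In a deployed system the alert string would be handed to whatever messaging service sends the farmer's text message.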
In any working environment, people have to make sure that all of their work is completed within a given time and to the required quality. To apply process mining in practice, one needs to understand all of these processes in detail. Personal information and communication has always been a prominent issue on the Internet. Today, information and communication tools in real life cover daily schedules, location analysis, and environmental analysis; more generally, social media applications support these systems and make data generated through event logs available both for data analysis and for process analysis that combines environmental and location analysis. Process mining can exploit these real-life processes with the help of the event logs already available in such datasets, whether as user-censored or user-labeled data. The mined processes can be used to redesign a user's flow and to understand the processes in more detail. To improve the quality of the processes we go through in our daily lives, one should look closely at each process, analyze it, and then make changes to get better results. In this work, we applied process mining techniques to seven different subjects combined in a single dataset collected in Korea. In particular, this paper comments on the efficiency of the processes in the event logs with respect to time management.
Enterprises and organizations worldwide increasingly apply new technologies to improve their operations. Because traditional servers are more costly and less flexible to build and manage, server virtualization has become the mainstream choice. However, organizations do not necessarily obtain the expected benefits from these new technologies, because each has its own level of organizational complexity and its own ability to accept change. The researcher investigated the key factors affecting the adoption of virtualization technology in two phases. In phase I, the researcher reviewed the literature and applied the dimensions of the “Information Systems Success Model” (ISSM) to generalize the factors affecting adoption into a preliminary theoretical framework, from which a questionnaire was developed. In phase II, a three-round Delphi method was used to integrate the opinions of experts from related fields, which gradually converged into a stable and objective questionnaire of key factors. The results are expected to serve as a reference for organizations adopting server virtualization technology and for future studies.
With the wide application of deep learning in computer vision, deep learning has become a mainstream direction in object tracking. The tracking algorithm in this paper is based on an improved multi-domain convolutional neural network, which is pre-trained on the VOT video set using a multi-domain training strategy. During online tracking, the network evaluates candidate targets sampled with a Gaussian distribution from the vicinity of the target predicted in the previous frame, and the candidate with the highest score is taken as the predicted target for the current frame. A bounding box regression model is introduced to bring the predicted target closer to the ground-truth target box of the test set. A grouping-update strategy extracts and selects useful update samples in each frame, which effectively prevents overfitting and adapts to changes in both the target and the environment. To improve the speed of the algorithm while maintaining performance, the number of candidate targets is adjusted dynamically by a self-adaptive parameter strategy. Finally, the algorithm is tested on the OTB set and compared with other high-performance tracking algorithms; the success rate and accuracy plots illustrate the outstanding performance of the proposed tracking algorithm.
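The online step described above (sample candidates around the previous target with a Gaussian distribution, score them, keep the best) can be sketched as follows. The scoring function here is a stand-in for the network, and the box format `(cx, cy, w, h)`, the standard deviations, and the candidate count are assumptions for illustration only.

```python
import numpy as np

def sample_candidates(prev_box, n, rng, pos_std=0.1, scale_std=0.05):
    """Draw n candidate boxes (cx, cy, w, h) around the previous target box."""
    cx, cy, w, h = prev_box
    size = np.sqrt(w * h)
    dx = rng.normal(0.0, pos_std * size, n)          # Gaussian shift in x
    dy = rng.normal(0.0, pos_std * size, n)          # Gaussian shift in y
    s = np.exp(rng.normal(0.0, scale_std, n))        # log-normal scale change
    return np.stack([cx + dx, cy + dy, w * s, h * s], axis=1)

def track_step(prev_box, score_fn, n=256, seed=0):
    """Return the candidate with the highest score for the current frame."""
    rng = np.random.default_rng(seed)
    candidates = sample_candidates(prev_box, n, rng)
    scores = score_fn(candidates)
    return candidates[np.argmax(scores)]

# Stand-in scorer: prefers boxes whose center is close to a hypothetical true position.
true_center = np.array([52.0, 48.0])
score_fn = lambda boxes: -np.linalg.norm(boxes[:, :2] - true_center, axis=1)
best_box = track_step((50.0, 50.0, 20.0, 20.0), score_fn)
```

Lowering `n` when the tracker is confident is one way a dynamic candidate count could trade accuracy for speed, although the abstract does not say how the paper's strategy sets it.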