With the increasing application of unmanned aerial vehicles (UAVs) across industries, the demand for autonomous UAV navigation has become increasingly urgent, especially in areas where the GPS signal is unavailable or jammed. This paper proposes an integrated navigation method that combines visual navigation and inertial navigation. First, a satellite map is used as the reference map. During flight, the onboard camera photographs the ground at intervals, and each photograph is matched against the reference map; in this way, the visual navigation system localizes the UAV and evaluates the reliability of the localization. Finally, a Particle Filter is introduced to fuse the positioning results. To expedite the matching process, the INS is used to narrow down the region of the satellite map to be searched. Considering the differences between the camera photographs and the reference map, this paper introduces the SuperPoint and SuperGlue algorithms for feature extraction and matching, respectively. Both algorithms use deep neural networks to extract and match image features, capturing deep semantic features rather than hand-crafted ones. Experimental results demonstrate the superior matching effectiveness of the introduced image matching algorithm, and simulation results show that the cumulative error of the INS is greatly reduced after fusion with the visual navigation results. Because it navigates autonomously, the integrated navigation method offers robust anti-interference capability, high autonomy, and good adaptability.
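The INS/vision fusion step described above can be sketched as a minimal 2-D particle filter: particles are propagated with the (drifting) INS displacement increments and reweighted whenever a visual position fix arrives. All parameters (particle count, noise levels, fix interval) are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_fuse(ins_steps, visual_fixes, n_particles=500, sigma_vis=5.0):
    """Fuse INS displacement increments with absolute visual fixes.

    ins_steps    : (T, 2) per-step displacement from the INS (may drift)
    visual_fixes : (T, 2) positions from image matching; NaN rows mean no fix
    """
    particles = np.zeros((n_particles, 2))
    weights = np.full(n_particles, 1.0 / n_particles)
    estimates = []
    for step, fix in zip(ins_steps, visual_fixes):
        # Predict: propagate particles with the INS increment plus process noise.
        particles += step + rng.normal(0.0, 1.0, particles.shape)
        if not np.isnan(fix).any():
            # Update: weight particles by a Gaussian visual-fix likelihood.
            d2 = np.sum((particles - fix) ** 2, axis=1)
            weights *= np.exp(-0.5 * d2 / sigma_vis**2)
            weights /= weights.sum()
            # Resample to avoid weight degeneracy.
            idx = rng.choice(n_particles, n_particles, p=weights)
            particles = particles[idx]
            weights = np.full(n_particles, 1.0 / n_particles)
        estimates.append(weights @ particles)
    return np.array(estimates)
```

Even with a biased INS, occasional visual fixes pull the particle cloud back toward the true trajectory, which is the mechanism by which the cumulative INS error is bounded.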
Most edge devices, such as ASICs, FPGAs, and other embedded systems, have restricted computational resources, which makes it difficult to run neural network models efficiently on these hardware platforms. Model quantization is an effective optimization technique for the convolutional-layer inference of a neural network, at the cost of a small accuracy loss. However, most quantization methods only accelerate the computation of the convolutional layers; the other layers of a model are still inferred with floating-point arithmetic, and an FPGA is not a suitable platform for floating-point calculation. In this paper, a fully quantized method is proposed for neural network inference on an FPGA platform, in which every calculation of model inference is performed on quantized values. More quantization generally means more accuracy loss, so to preserve accuracy several techniques are applied to the different functional layers of the model: for example, the activation layer uses bitwise operations instead of multiplication, and the concatenation layer uses separate quantization parameters for each input layer. To evaluate the effectiveness and efficiency of the proposed method, we implement a quantized lightweight detection network and deploy it on an FPGA platform. The experimental results demonstrate that our quantized method incurs very low accuracy loss and is highly efficient for neural network inference on FPGAs. The proposed quantized inference method is highly beneficial for deploying neural models on low-power devices.
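Two of the integer-only tricks mentioned above can be illustrated in a few lines: an activation whose negative slope is a power of two (so the multiplication becomes an arithmetic right shift), and requantization of an int32 accumulator with an integer multiplier and shift. The slope 1/8 and the multiplier/shift values are hypothetical choices for illustration, not the paper's actual parameters.

```python
import numpy as np

def leaky_relu_shift(x, shift=3):
    """Integer leaky ReLU with negative slope 2**-shift (1/8 here).
    The shift floors toward negative infinity, so no float multiply is needed."""
    x = x.astype(np.int32)
    return np.where(x >= 0, x, x >> shift)

def requantize(acc, mult, shift):
    """Map an int32 accumulator to int8 using an integer multiplier and right
    shift, approximating acc * scale entirely in fixed point."""
    return np.clip((acc.astype(np.int64) * mult) >> shift, -128, 127).astype(np.int8)
```

With `mult = round(scale * 2**shift)`, the `requantize` step reproduces a floating-point rescale to within one least significant bit, which is why all layers, not only convolutions, can stay in the integer domain on an FPGA.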
The prevalent deep learning approaches have achieved great success in many detection tasks. However, due to limited features and complicated backgrounds, applying them to small target detection in infrared images remains a challenge. In this paper, a novel method based on a convolutional neural network is proposed to solve the small target detection problem. First, the images fed to the neural network are preprocessed to enhance the target characteristics by encompassing spatial and temporal information. The spatio-temporal data are then used to train a custom-designed lightweight network dedicated to small target detection. Finally, the trained model is used for inference on infrared video. Furthermore, several tricks are employed to improve the efficiency of the network so that it can operate in real time. The experimental results demonstrate that the presented method achieves decent performance on the small target detection task.
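One plausible reading of the spatio-temporal preprocessing is to stack a short window of consecutive frames as input channels, so a dim, slowly moving target leaves a trace across channels that the network can exploit. The window size and stacking axis here are assumptions for illustration only.

```python
import numpy as np

def spatio_temporal_stack(frames, idx, window=2):
    """Stack 2*window+1 consecutive frames as channels so the network input
    encodes both spatial and temporal information around frame `idx`."""
    clip = frames[idx - window : idx + window + 1]
    return np.stack(clip, axis=-1)
```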
Prevailing object detection algorithms such as R-CNN, YOLO, and SSD are usually unsuitable for high-definition surveillance systems because of their fixed network input size and the mass of object candidate regions produced by the selective search process. This paper proposes a fast moving-object detection and recognition method for video surveillance systems, which applies background extraction and frame differencing to fulfill the selective search step, followed by inference with a pretrained CNN model to complete object recognition. The proposed method proved fast and effective in our experiments and, compared with other current object detection algorithms, is better suited to moving-object detection in video surveillance systems.
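The candidate-region step above can be sketched with plain numpy: pixels that change both between frames and against the background model are flagged, and their bounding box becomes the region handed to the CNN. For brevity this sketch returns one box for all changed pixels; the threshold and the single-box simplification are illustrative assumptions.

```python
import numpy as np

def motion_proposals(prev, curr, background, thresh=25):
    """Candidate moving-object region via frame differencing combined with
    background subtraction. Returns (x0, y0, x1, y1) or None if nothing moves."""
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16)) > thresh
    fg = np.abs(curr.astype(np.int16) - background.astype(np.int16)) > thresh
    mask = diff & fg
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return (xs.min(), ys.min(), xs.max(), ys.max())
```

Because only the (usually small) proposed regions are sent to the classifier, the per-frame cost stays far below running a full detector over a high-definition frame.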
Visual object tracking is one of the most attractive problems in computer vision. Recently, deep neural networks have been widely applied to object tracking and have shown great accuracy. In general, however, tracking accuracy decreases dramatically when the background becomes complex or the target is occluded. Thus, a robust tracking method based on a convolutional neural network and an anti-occlusion mechanism is presented. Thanks to an adaptive tracking confidence parameter T, the tracking quality is evaluated during tracking; once the target is occluded, the location of the target object is corrected immediately. Experimental results demonstrate that the proposed framework achieves state-of-the-art performance on the popular OTB50 and OTB100 benchmarks.
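A minimal sketch of a confidence-gated occlusion check, loosely following the abstract's parameter T: when the current response peak drops below T times the running mean of past peaks, the frame is flagged as occluded and the last confident location is held instead of the (unreliable) prediction. The specific rule and default T are hypothetical, not the paper's exact mechanism.

```python
def track_step(peak_score, scores_history, predicted_loc, last_good_loc, T=0.5):
    """Flag occlusion when the response peak falls below T * mean(past peaks);
    otherwise accept the prediction and record the new score."""
    mean_score = sum(scores_history) / len(scores_history)
    if peak_score < T * mean_score:
        return last_good_loc, True   # occluded: keep last reliable location
    scores_history.append(peak_score)
    return predicted_loc, False
```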
Pedestrian detection is the major task of many infrared surveillance systems. Due to the technical limitations of sensors or the high cost of advanced hardware, the resolution of infrared images is usually low, which cannot meet the quality requirements of many applications. Compressed sensing, which captures and represents compressible signals at a sampling rate significantly below the Nyquist rate, is considered a new framework for signal reconstruction based on sparsity and compressibility. The compressed sensing theory thus suggests a computational way to reconstruct a high-resolution image from a sparse signal, i.e., the low-resolution image. The proposed method uses low-resolution and high-resolution infrared pedestrian images to train an over-complete dictionary with the K-SVD algorithm, under which pedestrians are sparsely well represented. Two infrared cameras at different distances in the same scene are used to capture high- and low-resolution images, ensuring that each pedestrian pair is sparsely represented under the same over-complete dictionary; the similarities between input low-resolution image patches and high-resolution image patches are thereby learned. The popular greedy algorithm Orthogonal Matching Pursuit (OMP) is utilized for sparse reconstruction, providing good performance while guaranteeing low computational cost and storage. We evaluate the quality of the reconstructed images using root mean square error (RMSE) and peak signal-to-noise ratio (PSNR). The experimental results show that the reconstructed images preserve rich pedestrian detail, with low RMSE and high PSNR, outperforming traditional super-resolution methods.
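The OMP step above greedily selects, at each iteration, the dictionary atom most correlated with the current residual, then re-fits the coefficients over the selected support by least squares. A compact sketch (assuming unit-norm dictionary columns; the sparsity level k would come from the trained K-SVD dictionary in practice):

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal Matching Pursuit: greedy k-sparse coding of y over dictionary D.

    D : (m, n) dictionary with unit-norm columns
    y : (m,) signal to encode
    k : target sparsity
    """
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the residual.
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    return support, coef
```

In the super-resolution pipeline, the sparse code found for a low-resolution patch is applied to the coupled high-resolution dictionary to synthesize the high-resolution patch.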
This paper proposes a fast human action recognition algorithm that utilizes two features, which can be described as iconic posture and fast motion. First, a human detection algorithm is used to detect human objects in every frame. The regions marked as human are then fed into a trained deep classification network to match trained iconic postures in key frames. Several frames before and after each key frame are examined by frame differencing, which is used to compensate for background movement and to judge the speed of human motion. After key-frame pinning and speed judgment, the final recognition result is determined.
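The speed-judgment step might be approximated as follows: after background compensation, the fraction of pixels that change between consecutive frames around the key frame serves as a rough motion-speed score. The threshold and the score definition are illustrative assumptions.

```python
import numpy as np

def motion_speed(frames, thresh=20):
    """Rough motion-speed score: mean fraction of pixels that change between
    consecutive (background-compensated) frames."""
    ratios = []
    for a, b in zip(frames[:-1], frames[1:]):
        changed = np.abs(b.astype(np.int16) - a.astype(np.int16)) > thresh
        ratios.append(changed.mean())
    return float(np.mean(ratios))
```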
Infrared spectral imaging has been used in many fields, such as gas identification, environmental monitoring, and target detection. In practical applications, it is difficult to separate the target spectrum from the background due to cluttered backgrounds and instrument noise. This article introduces the design of a modular FTIR imaging spectrometer based on interference optics and an accurate control module. Based on this instrument, a spectral feature analysis and gas identification method is proposed and verified experimentally. The steps and algorithms include radiometric calibration, spectral pre-processing, and spectral matching. First, multiple-point linear radiometric calibration is applied to improve the calibration accuracy. Second, spectral pre-processing methods are used to decrease the noise and enhance the spectral difference between target and background. Third, spectral matching based on similarity calculation is introduced to realize gas identification; three measures, Euclidean distance (ED), spectral angle mapping (SAM), and spectral information divergence (SID), are derived. Finally, an experimental test is designed to verify the method proposed in this article, with SF6 taken as the target. According to the results, the algorithms differ in time consumption and accuracy, and the proposed method is verified to be reliable and accurate in practical field tests.
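The three similarity measures named above have standard textbook forms, sketched below for two spectra treated as vectors. Note the different behaviors: SAM and SID are insensitive to overall brightness scaling, while ED is not, which is one reason the measures trade off differently in accuracy.

```python
import numpy as np

def euclidean_distance(x, y):
    """ED: straight L2 distance between two spectra."""
    return np.linalg.norm(x - y)

def spectral_angle(x, y):
    """SAM: angle between the spectra as vectors; invariant to scaling."""
    cos = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def spectral_information_divergence(x, y, eps=1e-12):
    """SID: symmetric KL divergence between the spectra normalized to
    probability distributions."""
    p = x / x.sum() + eps
    q = y / y.sum() + eps
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))
```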