In this work, we consider the problem of detecting quadrilateral document borders in images captured by a mobile device's camera. State-of-the-art algorithms for quadrilateral document border detection are not designed for cases in which one of the document borders is completely out of the frame, obscured, or of low contrast. We propose an algorithm that correctly processes images in such cases. It builds on the classical contour-based algorithm, which we modify using the document's aspect ratio, known a priori. We demonstrate that this modification reduces the number of incorrect detections by 34% on the open dataset MIDV-500.
In this paper we explore ways to reduce memory cost without a significant loss of classification accuracy in the problem of ID document type recognition on mobile devices. The classic approach under study represents images as constellations of feature points and their descriptors. The distortion parameters are estimated by applying RANSAC. Experimental data detail the limitations of the approach (memory, speed, and accuracy) depending on the descriptor type. To maintain accuracy when using low-dimensional descriptors, we suggest modifying the basic approach with additional features characteristic of documents, such as straight lines and quadrangles. In addition, we propose early filtering of the samples and hypotheses used in RANSAC. The proposed modifications are shown to make a positive contribution for all descriptor types considered. The suggested algorithm was tested on the open dataset MIDV-500. The modified approach achieves an accuracy improvement and a significant speedup of distortion parameter estimation in RANSAC. Using compact descriptors in conjunction with the presented method reduces the required memory by more than 7 times with a near-zero (0.2%) loss of accuracy, and by more than 14 times with an accuracy loss of about 18%.
KEYWORDS: Distortion, Image processing, Detection and tracking algorithms, Image classification, Mobile devices, Cameras, Data modeling, Computing systems
In this paper we explore the impact of geometrical restrictions in RANSAC sampling on the accuracy of ID document type recognition in images, as well as on the accuracy of projective distortion parameter estimation. The studied method represents images as constellations of keypoints and their descriptors. The distortion parameters are estimated by applying RANSAC to the matched keypoints. We study cases in which the base algorithm can yield an erroneous or insufficiently accurate solution. A RANSAC scheme with geometrical restrictors is presented, and several restrictions are proposed that limit the samples and the computed transform parameters. An experiment was conducted on the open dataset MIDV-500, and data are presented on how classification and localization accuracy depend on the restrictors considered. The introduction of restrictors is shown to yield an accuracy improvement and a significant speedup.
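The idea of restricting RANSAC samples and computed transforms can be illustrated with a minimal sketch. The abstract does not list the actual restrictors, so the two checks below (rejecting nearly collinear or orientation-flipping 4-point samples, and requiring the projected template corners to form a convex quadrilateral) are assumptions chosen for illustration; the function names and the `min_area` threshold are hypothetical.

```python
import numpy as np
from itertools import combinations


def signed_area(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def sample_is_admissible(src, dst, min_area=1.0):
    """Sample restrictor (assumed, not taken from the paper).

    src, dst: arrays of shape (4, 2) holding matched points.
    Rejects a sample if any three points are nearly collinear
    (twice the triangle area below min_area) or if any point
    triple changes orientation between src and dst.
    """
    for i, j, k in combinations(range(4), 3):
        s = signed_area(src[i], src[j], src[k])
        d = signed_area(dst[i], dst[j], dst[k])
        if abs(s) < min_area or abs(d) < min_area:
            return False  # degenerate: nearly collinear triple
        if (s > 0) != (d > 0):
            return False  # orientation flipped between the two images
    return True


def quad_is_convex(quad):
    """Hypothesis restrictor (assumed): the projected template corners,
    given in order as a (4, 2) array, must form a convex quadrilateral."""
    turns = [
        signed_area(quad[i], quad[(i + 1) % 4], quad[(i + 2) % 4]) > 0
        for i in range(4)
    ]
    return all(turns) or not any(turns)
```

In a RANSAC loop, `sample_is_admissible` would be called before fitting a homography to a sample, and `quad_is_convex` after projecting the template corners with the fitted transform, so that inadmissible samples and hypotheses are discarded before the comparatively expensive inlier counting.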
In this paper we consider the computational optimization of a recognition system on a Very Long Instruction Word (VLIW) architecture. Such an architecture is aimed at wide parallel execution and low energy consumption. We discuss VLIW features using an Elbrus-based computational platform as an example, and we take a system for 2D art recognition as the case study. This system identifies a painting in an acquired image as a painting from the database, using local image features constructed from YACIPE keypoints and their RFD-based binary color descriptors, which are created as a concatenation of RFD-like descriptors for each channel. The descriptors are fast to compute, but the 2D art database is quite large, so in our case descriptor comparison using the Hamming distance during image matching consumes more than half of the execution time. This operation can be accelerated by low-level optimization that takes architecture-specific features into account. We show efficient usage of intrinsic functions for the Elbrus-4C processor and of memory access with the array prefetch buffer, which is specific to the Elbrus platform. We demonstrate a speedup of up to 11.5 times for large arrays and an overall speedup of about 1.5 times for the system, without any changes in intermediate computations.
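The operation that dominates the running time, Hamming-distance comparison of binary descriptors, can be sketched in portable form as follows. This is only an illustration of the computation itself, assuming descriptors packed into bytes; it does not use the Elbrus intrinsics or the array prefetch buffer discussed in the paper, and the names `POPCOUNT` and `hamming_distances` are hypothetical.

```python
import numpy as np

# Lookup table: number of set bits for every possible byte value.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)


def hamming_distances(query, database):
    """Hamming distance from one binary descriptor to many.

    query: uint8 array of shape (n_bytes,), a descriptor packed into bytes.
    database: uint8 array of shape (n, n_bytes), one descriptor per row.
    Returns an int64 array of shape (n,) with the number of differing bits.
    """
    xored = np.bitwise_xor(database, query)  # differing bits, byte by byte
    return POPCOUNT[xored].sum(axis=1, dtype=np.int64)
```

The XOR followed by a population count is the part that low-level, architecture-specific optimization would target; here a byte-wise lookup table stands in for a hardware popcount instruction.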
The paper considers the problem of 2D art identification in photos acquired with mobile devices under museum exhibition conditions. The proposed approach is based on a compact description of an image as a constellation of keypoints and corresponding local descriptors. A two-step comparison scheme is described for finding the reference image that best matches the query. A bag-of-features approach is used as the first step; then the mutual arrangement of points is analyzed. The query is rejected if no suitable matches are found. Geometrical normalization of the query image is proposed to achieve higher robustness against scale and viewpoint variations. After normalization, the mutual arrangement of points is estimated using a simplified geometric model. Advantages of the described approach over state-of-the-art solutions are considered. The results of experiments conducted on the open WikiArt dataset are presented along with processing times for different hardware platforms.
An important part of a system for analyzing planar rectangular objects is localization: the estimation of the projective transform from the template image of an object to its photograph. The system also includes subsystems such as the selection and recognition of text fields, the usage of contexts, etc. In this paper three localization algorithms are described. All of them use feature points, and two of them also analyze near-horizontal and near-vertical lines in the photograph. The algorithms and their combinations are tested on a dataset of real document photographs. A method of localization quality estimation is also proposed that allows the localization subsystem to be configured independently of the quality of the other subsystems.
In this work we describe an approach to real-time image search in large databases that is robust to a variety of query distortions such as lighting alterations, projective distortions, or digital noise. The approach is based on the extraction of keypoints and their descriptors, random hierarchical clustering trees for preliminary search, and RANSAC for search refinement and result scoring. The algorithm is implemented in the Snapscreen system, which determines a TV channel and a TV show from a picture acquired with a mobile device. The implementation is enhanced by a preceding localization of the screen region. Results on real-world data with different modifications of the system are presented.
KEYWORDS: Video, Detection and tracking algorithms, Mobile devices, Image filtering, Machine vision, Image quality, Sensors, Current controlled current source, Patents, Internet
In this paper we consider the task of finding information fields within a document with a flexible form, using the credit card expiration date field as an example. We discuss the main difficulties and suggest possible solutions. In our case the task has to be solved on mobile devices, so the computational complexity has to be as low as possible. We provide the results of an analysis of the suggested algorithm. The error distribution of the recognition system shows that the suggested algorithm solves the task with the required accuracy.
In this paper we propose an algorithm for real-time detection of rectangular document borders in mobile device applications. The proposed algorithm is based on the combinatorial assembly of possible quadrangle candidates from a set of line segments and on projective document reconstruction using the known focal length. The Fast Hough Transform is used for line detection. A 1D modification of the edge detector is proposed for the algorithm.