In document analysis and recognition with mobile capture devices, and in object recognition in a video stream, it is important to combine the information obtained from different frames, since text recognition quality depends on how effectively the maximum amount of information about the target object is collected. This paper examines and compares the effectiveness of two combination approaches: combining images before recognition, and combining the recognition results. The combination methods are briefly described. The quality of the combined results obtained with the different methods was measured and compared on the MIDV-500 dataset. The results show that combining the recognition results of text strings is more effective than combining the images beforehand. It can be concluded that simple image stacking with projective alignment does not achieve quality comparable to that of combining recognition results, and thus more sophisticated image combination algorithms are needed in order to make use of the information about per-frame changes of the text images.
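To illustrate what combining recognition results across frames can look like, the following is a minimal sketch that takes a per-position majority vote over per-frame strings. It is not the combination method evaluated in the paper: the function name `combine_strings`, the equal-length requirement, and the sample strings are assumptions made only for this illustration.

```python
from collections import Counter


def combine_strings(frame_results):
    """Combine per-frame recognition results by per-position majority vote.

    Assumption for this sketch: all strings are already aligned and have
    the same length. Real per-frame results usually need an alignment step.
    """
    if not frame_results:
        return ""
    length = len(frame_results[0])
    if any(len(s) != length for s in frame_results):
        raise ValueError("this sketch expects equal-length strings")

    combined = []
    for position in range(length):
        # Count how often each character was recognized at this position.
        votes = Counter(s[position] for s in frame_results)
        combined.append(votes.most_common(1)[0][0])
    return "".join(combined)


# Hypothetical per-frame results, each with a different single-character error.
frames = ["MIDV-5O0", "MIDV-500", "M1DV-500"]
print(combine_strings(frames))
```

In this toy case each frame's error is outvoted by the other two frames, which is the intuition behind combining results rather than relying on a single frame.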
In document analysis and recognition with mobile capture devices, and in object recognition in a video stream, an important problem is deciding when the capturing process should be stopped. Efficient stopping influences not only the total time spent on recognition and data entry, but also the expected accuracy of the result. This paper extends the stopping method based on modelling the next integrated recognition result, so that it can be used within a string recognition result model with per-character alternatives. The stopping method and notes on its extension are described, and an experimental evaluation is performed on the open MIDV-500 dataset. The method was compared with previously published methods based on clustering of input observations. The obtained results indicate that the stopping method based on modelling the next integrated result achieves higher accuracy, even when compared with the best achievable configuration of the competing methods.
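The idea of stopping based on a model of the next integrated result can be sketched as follows. This is only one plausible reading of the abstract, not the published algorithm: the sketch treats each past observation as a candidate for the next one, measures how much the combined result would change, and stops when the average change falls below a threshold. The helper names (`majority`, `normalized_hamming`, `should_stop`), the majority-vote combination, the distance function, and the threshold value are all assumptions.

```python
from collections import Counter


def majority(strings):
    """Per-position majority vote over equal-length strings (assumed aligned)."""
    return "".join(
        Counter(chars).most_common(1)[0][0] for chars in zip(*strings)
    )


def normalized_hamming(a, b):
    """Fraction of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b)) / max(len(a), 1)


def should_stop(observations, threshold):
    """Decide whether to stop capturing, given per-frame results so far.

    Each past observation is tried as a hypothetical next observation;
    the expected change of the combined result is the mean distance
    between the current combined result and each hypothetical next one.
    """
    current = majority(observations)
    distances = [
        normalized_hamming(current, majority(observations + [candidate]))
        for candidate in observations
    ]
    expected_change = sum(distances) / len(distances)
    return expected_change <= threshold


# Hypothetical per-frame results: consistent frames suggest stopping,
# conflicting frames suggest continuing to capture.
print(should_stop(["AB", "AB", "AB"], threshold=0.1))
print(should_stop(["AB", "CD"], threshold=0.1))
```

The threshold trades capture time against expected accuracy: a lower value keeps capturing until the combined result has become stable.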
This paper proposes an improvement to an existing and widely used approach to panorama stitching for images of planar objects. The proposed method is based on adjusting a graph of projective transformations. The evaluation is presented on a heterogeneous dataset that contains images of the surfaces of Earth and Mars, images taken with a microscope, and handwritten and printed text documents. The improvement in the quality of the panorama stitching method is illustrated on this dataset, and shows a more than twofold reduction in the accumulated computation error of projective transformations.
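To show what accumulated error of projective transformations means, here is a minimal sketch that chains frame-to-frame homographies and measures the drift of a mapped point. It does not implement the graph adjustment proposed in the paper; it only illustrates the kind of error such an adjustment aims to reduce. The helper names and the numbers (a true shift of 10 pixels per frame, estimated as 10.1) are assumptions chosen for illustration.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def apply_homography(h, point):
    """Map a 2D point through a 3x3 homography, with perspective division."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


def chain(homographies):
    """Compose frame-to-frame homographies into a first-to-last mapping."""
    total = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for h in homographies:
        total = matmul3(h, total)
    return total


def translation(tx, ty):
    """A pure translation, the simplest special case of a homography."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


# Assumed setup: the true shift between frames is 10 px, but every
# pairwise estimate is off by 0.1 px. Chaining five estimates adds up
# the individual errors.
estimated = [translation(10.1, 0.0) for _ in range(5)]
x, _ = apply_homography(chain(estimated), (0.0, 0.0))
drift = abs(x - 50.0)
print(round(drift, 6))
```

Because each pairwise error is carried into every later frame, long chains drift even when individual estimates are good, which is why a global adjustment over the whole set of transformations can help.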
In this paper we describe a stitching protocol that makes it possible to obtain high-resolution images of long monochromatic objects with a periodic structure. This protocol can be used for long documents or for human-made objects in satellite images of uninhabitable regions such as the Arctic. Such objects can be very long, while modern camera sensors have limited resolution and cannot provide an image of the whole object that is good enough for further processing, e.g. for use in an OCR system. The idea of the proposed method is to acquire a video stream that covers the full object in high resolution and to use image stitching. We expect the scanned object to have straight boundaries and a periodic structure, which allows us to introduce regularization into the stitching problem and to adapt the algorithm to the limited computational power of mobile and embedded CPUs. Using the detected boundaries and structure, we estimate the homography between frames and use this information to reduce the complexity of stitching. We demonstrate our algorithm on a mobile device and show an image processing speed of 2 fps on a Samsung Exynos 5422 processor.