17 August 2000
Automatic document processing system with learning capability
Xuhong Li, Peter A. Ng
Abstract
This automatic document processing system scans a given paper document into the system, automatically recognizes its layout structure, classifies it as a particular document type, and extracts the pertinent information to form the corresponding frame instance. Each document type is characterized by a set of attributes that form a frame template, and the frame instance is an effective digital form of the original document. The key feature of the system is that it is general-purpose and can be adapted easily to any application domain. A segmentation method based on 'logical closeness' is proposed. A novel and natural representation of document layout structure, the Labeled Directed Weighted Graph (LDWG), is described, together with a methodology for transforming a document segmentation into its LDWG representation. To classify a given document, we first compare its layout structure with the sample layout structures of the various document types prestored in the knowledge base, and then use the logical structure to verify this initial match. A weight is associated with each component of the layout structure. During the learning stage, the system adjusts these weights automatically based on a human operator's corrections. A modified Perceptron Learning Algorithm (PLA) is applied.
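The weighted matching and correction-driven learning described in the abstract can be illustrated with a small sketch. It is only a toy under stated assumptions, not the authors' method: the abstract does not define the LDWG in detail or say how the PLA is modified, so the code below swaps in a plain multiclass perceptron update. The names `LayoutGraph`, `score`, `classify`, `learn_from_correction`, the starting weight of 1.0, the learning rate, and the "memo" and "letter" sample layouts are all hypothetical.

```python
from dataclasses import dataclass

DEFAULT_WEIGHT = 1.0  # assumed starting weight for every layout component


@dataclass(frozen=True)
class LayoutGraph:
    """Toy stand-in for an LDWG: labeled blocks plus labeled directed relations."""
    nodes: frozenset  # block labels, e.g. "header"
    edges: frozenset  # (source label, relation, target label) triples

    def components(self):
        """Every node and edge, tagged so the two kinds cannot collide."""
        return {("node", n) for n in self.nodes} | {("edge", *e) for e in self.edges}


def score(doc, sample, weights):
    """Sum the weights of the sample's components that also occur in the document."""
    shared = sample.components() & doc.components()
    return sum(weights.get(c, DEFAULT_WEIGHT) for c in shared)


def classify(doc, samples, weights):
    """Return the document type whose sample layout scores highest."""
    return max(samples, key=lambda t: score(doc, samples[t], weights[t]))


def learn_from_correction(doc, samples, weights, predicted, corrected, rate=0.5):
    """Plain perceptron-style update, applied only when the human disagrees."""
    if predicted == corrected:
        return
    # Reward matched components of the correct type, penalize those of the wrong one.
    for sign, doc_type in ((+1, corrected), (-1, predicted)):
        for c in samples[doc_type].components() & doc.components():
            old = weights[doc_type].get(c, DEFAULT_WEIGHT)
            weights[doc_type][c] = old + sign * rate


def graph(nodes, edges):
    """Convenience constructor for the hypothetical examples below."""
    return LayoutGraph(frozenset(nodes), frozenset(edges))


# Hypothetical knowledge base: one sample layout per document type.
samples = {
    "memo": graph(
        {"header", "to", "from", "body"},
        {("header", "above", "to"), ("to", "above", "from"), ("from", "above", "body")},
    ),
    "letter": graph(
        {"header", "date", "body", "signature"},
        {("header", "above", "date"), ("date", "above", "body"),
         ("body", "above", "signature")},
    ),
}
weights = {doc_type: {} for doc_type in samples}

# An ambiguous scanned document, then one round of human correction.
doc = graph({"header", "to", "body", "signature"}, {("header", "above", "to")})
first_guess = classify(doc, samples, weights)
learn_from_correction(doc, samples, weights, first_guess, corrected="letter")
second_guess = classify(doc, samples, weights)
```

With these toy values the first guess is "memo", because four of its components match against three for "letter". After the correction, the matched "letter" components carry more weight and the matched "memo" components less, so the same document is classified as "letter". A real system would also need the logical-structure verification step mentioned in the abstract, which this sketch omits.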
© (2000) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xuhong Li and Peter A. Ng "Automatic document processing system with learning capability", Proc. SPIE 4050, Automatic Target Recognition X, (17 August 2000); https://doi.org/10.1117/12.395571
KEYWORDS
Image segmentation
Optical character recognition
Virtual colonoscopy
Classification systems
Computing systems
Computer science
Detection and tracking algorithms