7 March 1996 Genetic approach to the analysis of complex text formatting
Proceedings Volume 2660, Document Recognition III; (1996) https://doi.org/10.1117/12.234698
Event: Electronic Imaging: Science and Technology, 1996, San Jose, CA, United States
Abstract
Traditional document analysis systems often adopt a top-down framework, i.e., they are composed of various locally interacting functional components, guided by a central control mechanism. The design of each component is determined by a human expert and is optimized for a given class of inputs. Such a system can fail when confronted by an input that falls outside its anticipated domain. This paper investigates the use of a genetic-based adaptive mechanism in the analysis of complex text formatting. Specifically, we explore a genetic approach to the binarization problem. As opposed to a single, pre-defined, 'optimal' thresholding scheme, the genetic-based process applies various known methods and evaluates their effectiveness on the input image. Individual regions are treated independently, while the genetic algorithm attempts to optimize the overall result for the entire page. Advantages and disadvantages of this approach are discussed.
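The abstract does not specify the thresholding methods, the chromosome encoding, or the fitness function, so the following is only a minimal sketch of the general idea under stated assumptions: each gene selects one of three illustrative thresholding rules for one region, and the page-level fitness is the sum of a per-region between-class variance score. The names `METHODS`, `region_score`, `page_fitness`, and `evolve`, along with all parameter values, are hypothetical and are not taken from the paper.

```python
import random

# Candidate thresholding rules (illustrative choices, not the paper's list).
def fixed_threshold(pixels):
    return 128

def mean_threshold(pixels):
    return sum(pixels) / len(pixels)

def midrange_threshold(pixels):
    return (min(pixels) + max(pixels)) / 2

METHODS = [fixed_threshold, mean_threshold, midrange_threshold]

def region_score(pixels, method):
    """Between-class variance of the split (an assumed quality measure)."""
    t = method(pixels)
    dark = [p for p in pixels if p <= t]
    light = [p for p in pixels if p > t]
    if not dark or not light:
        return 0.0  # everything fell on one side: no useful binarization
    w_dark = len(dark) / len(pixels)
    w_light = len(light) / len(pixels)
    mean_dark = sum(dark) / len(dark)
    mean_light = sum(light) / len(light)
    return w_dark * w_light * (mean_dark - mean_light) ** 2

def page_fitness(chromosome, regions):
    """Sum of region scores; gene i picks the method applied to region i."""
    return sum(region_score(r, METHODS[g]) for r, g in zip(regions, chromosome))

def evolve(regions, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    n = len(regions)
    # Start from the single-method baselines, then fill with random mixes.
    population = [[m] * n for m in range(len(METHODS))]
    while len(population) < pop_size:
        population.append([rng.randrange(len(METHODS)) for _ in range(n)])

    def fitness(chromosome):
        return page_fitness(chromosome, regions)

    for _ in range(generations):
        next_gen = [max(population, key=fitness)[:]]  # elitism: keep the best
        while len(next_gen) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            # One-point crossover followed by per-gene mutation.
            cut = rng.randrange(1, n) if n > 1 else 0
            child = p1[:cut] + p2[cut:]
            child = [rng.randrange(len(METHODS)) if rng.random() < mutation_rate
                     else g for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)

# Three toy "regions" given as flat lists of grayscale values (made-up data).
regions = [
    [10, 12, 200, 210, 15, 205],     # high contrast
    [90, 95, 100, 120, 125, 118],    # dark, low contrast
    [140, 150, 240, 245, 235, 145],  # bright background
]
best = evolve(regions)
print([METHODS[g].__name__ for g in best])
```

Because the initial population contains every single-method baseline and the best individual is always carried forward, the returned assignment scores at least as well as any one fixed scheme under this particular fitness. With a purely additive score like this one, picking the best method per region independently would do just as well; a genetic search is more plausibly useful when the page-level evaluation couples regions, which this sketch does not attempt to model.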
© (1996) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Jiangying Zhou, Daniel P. Lopresti, and Jeffrey Zhou "Genetic approach to the analysis of complex text formatting", Proc. SPIE 2660, Document Recognition III, (7 March 1996); https://doi.org/10.1117/12.234698
KEYWORDS
Genetic algorithms
Optical character recognition
Image processing
Genetics
Computer programming
Analytical research
Binary data