Paper
13 January 2003 Text extraction via an edge-bounded averaging and a parametric character model
Jian Fan
Author Affiliations +
Proceedings Volume 5010, Document Recognition and Retrieval X; (2003) https://doi.org/10.1117/12.472839
Event: Electronic Imaging 2003, 2003, Santa Clara, CA, United States
Abstract
We present a deterministic text extraction algorithm that relies on three basic assumptions: color/luminance uniformity of the interior region, closed boundaries of sharp edges and the consistency of local contrast. The algorithm is basically independent of the character alphabet, text layout, font size and orientation. The heart of this algorithm is an edge-bounded averaging for the classification of smooth regions that enhances robustness against noise without sacrificing boundary accuracy. We have also developed a verification process to clean up the residue of incoherent segmentation. Our framework provides a symmetric treatment for both regular and inverse text. We have proposed three heuristics for identifying the type of text from a cluster consisting of two types of pixel aggregates. Finally, we have demonstrated the advantages of the proposed algorithm over adaptive thresholding and block-based clustering methods in terms of boundary accuracy, segmentation coherency, and capability to identify inverse text and separate characters from background patches.
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Jian Fan "Text extraction via an edge-bounded averaging and a parametric character model", Proc. SPIE 5010, Document Recognition and Retrieval X, (13 January 2003); https://doi.org/10.1117/12.472839
Lens.org Logo
CITATIONS
Cited by 7 scholarly publications.
Advertisement
Advertisement
RIGHTS & PERMISSIONS
Get copyright permission  Get copyright permission on Copyright Marketplace
KEYWORDS
Image segmentation

Image processing algorithms and systems

Algorithm development

Image compression

Optical character recognition

Raster graphics

Sensors

RELATED CONTENT


Back to Top