24 January 2011
Scientific challenges underlying production document processing
Eric Saund
Proceedings Volume 7874, Document Recognition and Retrieval XVIII; 787402 (2011) https://doi.org/10.1117/12.876948
Event: IS&T/SPIE Electronic Imaging, 2011, San Francisco Airport, California, United States
Abstract
The Field of Document Recognition is bipolar. On one end lies the excellent work of academic institutions engaging in original research on scientifically interesting topics. On the other end lies the document recognition industry which services needs for high-volume data capture for transaction and back-office applications. These realms seldom meet, yet the need is great to address technical hurdles for practical problems using modern approaches from the Document Recognition, Computer Vision, and Machine Learning disciplines. We reflect on three categories of problems we have encountered which are both scientifically challenging and of high practical value. These are Doctype Classification, Functional Role Labeling, and Document Sets. Doctype Classification asks, "What is the type of page I am looking at?" Functional Role Labeling asks, "What is the status of text and graphical elements in a model of document structure?" Document Sets asks, "How are pages and their contents related to one another?" Each of these has ad hoc engineering approaches that provide 40-80% solutions, and each of them begs for a deeply grounded formulation both to provide understanding and to attain the remaining 20-60% of practical value. The practical need is not purely technical but also depends on the user experience in application setup and configuration, and in collection and groundtruthing of sample documents. The challenge therefore extends beyond the science behind document image recognition and into user interface and user experience design.
© (2011) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Eric Saund "Scientific challenges underlying production document processing", Proc. SPIE 7874, Document Recognition and Retrieval XVIII, 787402 (24 January 2011); https://doi.org/10.1117/12.876948
KEYWORDS
Image processing
Optical character recognition
Human-machine interfaces
Machine vision
Visualization
Machine learning
Classification systems
