24 June 2005 Automatic video caption detection and extraction in the DCT compressed domain
Chin-Fu Tsao, Yu-Hao Chen, Jin-Hau Kuo, Chia-wei Lin, Ja-Ling Wu
Proceedings Volume 5960, Visual Communications and Image Processing 2005; 59602N (2005) https://doi.org/10.1117/12.631588
Event: Visual Communications and Image Processing 2005, 2005, Beijing, China
Abstract
Text in a video frame can help us understand the semantics of video content directly. Although many approaches can automatically detect and localize text in a video, most of them use the original pixels of an image to find the text regions. In this paper, we present an approach to automatically localize captions in MPEG compressed videos. Caption regions are segmented from the background by using their distinguishing texture characteristics. Unlike previously published methods, which either fully decompress the video sequence before extracting the caption regions or extract text regions only in Intra- (I-) frames, our approach detects and localizes caption regions directly in the DCT compressed domain. Therefore, only a very small amount of decoding is required. Experiments show that a good caption detection rate can be obtained: the average recalls of Intra- and Inter-frame detection are 97.77% and 97.84%, respectively.
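The abstract does not give the details of the texture measure, so the following is only a minimal sketch of the general idea of flagging texture-rich 8x8 blocks from their DCT coefficients. It is not the authors' method. The AC-energy measure, the function names, and the threshold value are all assumptions made for illustration, and the coefficients are computed here from pixels to keep the example self-contained, whereas a compressed-domain detector would read them from the partially decoded MPEG bitstream.

```python
import numpy as np

BLOCK = 8  # MPEG codes luminance in 8x8 DCT blocks


def dct_matrix(n=BLOCK):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


def block_dct(frame):
    """Per-block 2-D DCT; returns an array of shape (rows, cols, 8, 8).

    Stand-in for coefficients that a real system would take directly
    from the compressed stream (assumption for this sketch).
    """
    h, w = frame.shape
    rows, cols = h // BLOCK, w // BLOCK
    blocks = (frame[:rows * BLOCK, :cols * BLOCK]
              .astype(float)
              .reshape(rows, BLOCK, cols, BLOCK)
              .swapaxes(1, 2))
    d = dct_matrix()
    return d @ blocks @ d.T


def texture_energy(coeffs):
    """Sum of absolute AC coefficients per block (DC term excluded)."""
    total = np.abs(coeffs).sum(axis=(2, 3))
    return total - np.abs(coeffs[:, :, 0, 0])


def caption_mask(coeffs, threshold):
    """Mark blocks whose AC energy exceeds a hypothetical threshold."""
    return texture_energy(coeffs) > threshold


# Synthetic 32x32 frame: flat gray background with a high-contrast
# striped band in the bottom block row, standing in for caption text.
frame = np.full((32, 32), 128, dtype=np.uint8)
frame[24:32, 0::2] = 0
frame[24:32, 1::2] = 255

mask = caption_mask(block_dct(frame), threshold=100.0)
print(mask.astype(int))
```

In this toy frame only the bottom row of blocks is flagged, because flat blocks carry essentially no AC energy. A practical detector would also need spatial grouping of flagged blocks and a way to handle Inter-coded frames, neither of which is attempted here.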
© (2005) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Chin-Fu Tsao, Yu-Hao Chen, Jin-Hau Kuo, Chia-wei Lin, and Ja-Ling Wu "Automatic video caption detection and extraction in the DCT compressed domain", Proc. SPIE 5960, Visual Communications and Image Processing 2005, 59602N (24 June 2005); https://doi.org/10.1117/12.631588
CITATIONS
Cited by 1 scholarly publication.
KEYWORDS
Video

Video compression

Detection and tracking algorithms

Semantic video

Image compression

Binary data

Image segmentation
