Data reduction procedure for principal cast and other talking head detection
19 December 2001
Proceedings Volume 4676, Storage and Retrieval for Media Databases 2002; (2001) https://doi.org/10.1117/12.451089
Event: Electronic Imaging, 2002, San Jose, California, United States
Abstract
We describe a technique for reducing the data set for principal cast and other talking head detection in broadcast news content using the spatial attributes of the MPEG-7 Motion Activity descriptor. These descriptors are easy to extract in the compressed domain and work well for matching talking head sequences, which motivated us to use them to rapidly prune the data set before applying more sophisticated face detection techniques. We are thus able to speed up the process of finding the principal cast in broadcast news content by reducing the number of segments on which computationally more expensive face detection and recognition is employed. We present the experimental results of two clustering procedures. The first is based on the distance from the centroid of the ground truth set and is computationally less expensive. The second clustering procedure is based on multiple templates, which are the mean feature vectors of the component Gaussians of a Gaussian Mixture Model (GMM) trained to best fit the training data. We are able to save 50% on computation, measured as the ratio of the number of rejected shots to the total number of shots, while missing 25% of the talking head shots in the news program. We also observe that the second clustering procedure, while slightly more computationally intensive, allows for higher pruning factors with more accuracy.
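The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance, the threshold value, the two-dimensional toy feature vectors, and the names `centroid`, `euclidean`, and `prune_shots` are all assumptions made for illustration. A shot is kept for later face detection only if its feature vector lies close enough to at least one template. With a single template (the centroid of the ground truth set) this corresponds to the first procedure. Passing several templates corresponds to the second, where the templates would be the means of the GMM components (GMM training itself is not shown, and the means below are made up).

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance (an assumed metric; the abstract does not name one)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def prune_shots(shots, templates, threshold):
    """Keep shots whose distance to the nearest template is <= threshold.

    Returns (indices of kept shots, fraction of shots rejected).
    """
    kept = [
        i for i, shot in enumerate(shots)
        if min(euclidean(shot, t) for t in templates) <= threshold
    ]
    rejected_fraction = 1 - len(kept) / len(shots)
    return kept, rejected_fraction


# Toy, made-up feature vectors standing in for motion activity descriptors.
ground_truth = [[1.0, 2.0], [3.0, 2.0], [2.0, 2.0]]
shots = [[2.0, 2.5], [9.0, 9.0], [2.5, 1.5], [0.0, 8.0]]

# Procedure 1: a single template, the centroid of the ground truth set.
single_template = [centroid(ground_truth)]
kept_single, rejected_single = prune_shots(shots, single_template, threshold=1.0)

# Procedure 2: multiple templates (hypothetical values standing in for GMM means).
multi_templates = [[2.0, 2.0], [0.0, 8.0]]
kept_multi, rejected_multi = prune_shots(shots, multi_templates, threshold=1.0)

print(kept_single, rejected_single)
print(kept_multi, rejected_multi)
```

In this sketch the rejected fraction plays the role of the computational saving described in the abstract, since rejected shots never reach the more expensive face detection stage.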
© (2001) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ajay Divakaran and Regunathan Radhakrishnan "Data reduction procedure for principal cast and other talking head detection", Proc. SPIE 4676, Storage and Retrieval for Media Databases 2002, (19 December 2001); https://doi.org/10.1117/12.451089
KEYWORDS
Head
Video
Facial recognition systems
Data modeling
Video compression
Distance measurement
Feature extraction