KEYWORDS: Video, Analytics, Video compression, Clouds, IP cameras, Video surveillance, Video acceleration, Video processing, Facial recognition systems
The HEVC Annotated Regions (AR) SEI message supports object tracking by carrying parameters that define rectangular bounding boxes with unique object identifiers, time-aligned within a video bitstream. An end-to-end distributed video analytics pipeline using the AR SEI message has been implemented within the GStreamer framework, with an edge node and a cloud server node. At the edge, lightweight face detection is performed, and the face region parameters are used to create the AR SEI message syntax within an HEVC bitstream. At the cloud server, face regions are extracted from the decoded video and age and gender classification is performed. The HEVC bitstream is then updated to carry this additional metadata in the AR SEI message.
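The edge-to-cloud metadata flow above can be sketched as a simplified serialization of per-object bounding boxes and identifiers. The real AR SEI syntax in Rec. ITU-T H.265 is considerably more elaborate; the field layout below is purely illustrative.

```python
import struct

# Hypothetical, simplified serialization of annotated-region metadata:
# each object is (object_id, left, top, width, height). This only
# illustrates the idea of carrying time-aligned bounding boxes and
# identifiers; it is NOT the normative AR SEI payload syntax.

def pack_annotated_regions(objects):
    """Pack a list of (object_id, left, top, width, height) tuples."""
    payload = struct.pack(">B", len(objects))  # number of objects
    for obj_id, left, top, width, height in objects:
        payload += struct.pack(">HHHHH", obj_id, left, top, width, height)
    return payload

def unpack_annotated_regions(payload):
    """Inverse of pack_annotated_regions."""
    (count,) = struct.unpack_from(">B", payload, 0)
    objects, offset = [], 1
    for _ in range(count):
        objects.append(struct.unpack_from(">HHHHH", payload, offset))
        offset += 10  # five big-endian 16-bit fields per object
    return objects

# Two tracked faces with illustrative coordinates.
regions = [(1, 100, 50, 64, 64), (2, 300, 80, 48, 48)]
decoded = unpack_annotated_regions(pack_annotated_regions(regions))
```

A payload of this shape could be refreshed per access unit at the edge and re-emitted, augmented with the cloud's classification results, on the return path.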
Immersive video enables end users to experience video interactively and more naturally, with viewer motion from any position and orientation within a supported viewing space. MPEG Immersive Video (MIV) is an upcoming standard being developed to handle the compression and delivery of immersive media content. It extracts only the needed information, in the form of patches, from a collection of cameras capturing the scene, and compresses it with video codecs so that the scene can be reconstructed at the decoder side from any pose. A MIV bitstream is composed of non-video components carrying view parameters and patch information, in addition to multiple video data sub-bitstreams carrying texture and geometry information. In this paper, we describe a simplified MIV carriage method, using an SEI message within a single-layer HEVC bitstream, to take advantage of existing video streaming infrastructure, including legacy video servers. The Freeport player is built on the open-source VLC video player, a GPU DirectX implementation of a MIV renderer, and a face tracking tool for viewer motion. A prerecorded demonstration of the Freeport player is provided.
Omnidirectional (or "360 degree") video, representing a panoramic view of a spherical 360°×180° scene, can be encoded using conventional video compression standards once it has been projection-mapped to a 2D rectangular format. The equirectangular projection format is currently used to map 360 degree video to a rectangular representation for coding with HEVC/JEM. However, video in the top and bottom regions of the image, corresponding to the "north pole" and "south pole" of the spherical representation, is significantly warped. We propose to perform a spherical rotation of the input video prior to HEVC/JEM encoding in order to improve coding efficiency, and to signal parameters in a supplemental enhancement information (SEI) message that describe the inverse rotation recommended to be applied after HEVC/JEM decoding, prior to display. Experimental results show that a bitrate gain of up to 17.8% (using the WS-PSNR end-to-end metric) can be achieved for the Chairlift sequence using HM16.15, and 11.9% using JEM6.0, with average gains of 2.9% for HM16.15 and 2.2% for JEM6.0.
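The pre-encoding rotation can be illustrated with a minimal sketch that rotates a spherical sample direction by yaw and pitch angles. The angle convention and rotation order here are assumptions for illustration, not the paper's exact parameterization.

```python
import math

# Illustrative sketch: rotate a (longitude, latitude) direction on the
# unit sphere by yaw (about the vertical axis) then pitch. Applying
# such a rotation before equirectangular mapping can move detailed
# content away from the heavily warped pole regions; the inverse
# rotation would be signalled in an SEI message and applied after
# decoding.

def rotate_sphere(lon, lat, yaw, pitch):
    """Rotate a direction given in radians; returns (lon, lat)."""
    # direction vector on the unit sphere
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    # yaw about the z axis
    x, y = (x * math.cos(yaw) - y * math.sin(yaw),
            x * math.sin(yaw) + y * math.cos(yaw))
    # pitch about the y axis
    x, z = (x * math.cos(pitch) + z * math.sin(pitch),
            -x * math.sin(pitch) + z * math.cos(pitch))
    return math.atan2(y, x), math.asin(z)

# A point at the "north pole" (lat = 90°), pitched by 90° so that it
# lands on the equator, where equirectangular warping is far milder.
lon, lat = rotate_sphere(0.0, math.pi / 2, 0.0, math.pi / 2)
```

In a full pipeline, this per-direction rotation would drive a resampling of the equirectangular frame before encoding, with the inverse angles carried in the SEI message.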
For many broadcast video applications, it is highly desirable to support diverse user devices, such as devices with different resolutions, without incurring the bitrate penalty of simulcast encoding. Video decoding, however, is a complex operation whose cost depends strongly on the resolution of the coded video. Low-power portable devices typically have strict complexity restrictions and reduced-resolution displays. For such environments, the total bitrate efficiency of the combined layers is an important requirement, whereas the bitrate efficiency of an individual lower layer, although desirable, is not. In this paper, we propose a complexity-constrained scalable system, based on the Reduced Resolution Update mode, that enables low decoding complexity while achieving better rate-distortion performance than an equivalent simulcast-based system. Our system targets broadcast environments in which some terminals have very limited computational and power resources.
The new H.264 video coding standard supports picture- and macroblock-level adaptive frame/field coding, which can improve coding efficiency when coding interlaced sequences. A well-designed encoder needs to support all of these modes and to decide which one is most appropriate for encoding a given macroblock or picture. It could be argued that the optimal solution is a multi-pass strategy: encode each macroblock or picture using all possible coding modes and select the one that yields the best coding performance. Unfortunately, the computational complexity of such a multi-pass encoder is relatively high. In this paper, we propose a novel single-pass algorithm based on motion activity detection. The proposed scheme runs in a pre-analysis stage and reduces complexity by approximately 40%-60% compared to the two-pass frame/field encoder, while maintaining similar coding efficiency.
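The intuition behind a motion-activity-based frame/field decision can be sketched as follows. The metric (mean absolute difference between the two fields of a macroblock) and the threshold are illustrative assumptions, not the paper's actual detector.

```python
# Hedged sketch of a single-pass frame/field decision: in interlaced
# material, motion shows up as a large difference between the top and
# bottom fields of a macroblock, which favours field coding; static
# content favours frame coding. Threshold value is hypothetical.

def mean_abs_field_difference(macroblock):
    """macroblock: list of rows of luma samples (e.g. 16 rows of 16)."""
    top = macroblock[0::2]     # even rows form the top field
    bottom = macroblock[1::2]  # odd rows form the bottom field
    diffs = [abs(a - b)
             for top_row, bot_row in zip(top, bottom)
             for a, b in zip(top_row, bot_row)]
    return sum(diffs) / len(diffs)

def choose_coding_mode(macroblock, threshold=12.0):
    """Return 'field' for high inter-field motion activity, else 'frame'."""
    return "field" if mean_abs_field_difference(macroblock) > threshold else "frame"

# A static macroblock (identical fields) and a "moving" one in which
# the fields, captured at different instants, differ strongly.
static_mb = [[128] * 16 for _ in range(16)]
moving_mb = [([128] * 16 if r % 2 == 0 else [0] * 16) for r in range(16)]
```

Because the metric needs only one pass over the source samples, it can run in a pre-analysis stage before any mode is actually encoded.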
In this paper, an H.264 encoder is proposed that incorporates a noise pre-filter with little additional complexity. The motion estimation process in the H.264 encoder, applied to multiple reference pictures, is reused for temporal noise filtering. Significant objective and subjective improvement is observed for the proposed system versus a standalone H.264 encoder, with greater improvement at higher bitrates. For sequences with artificially generated noise, average PSNR improvements of 0.46 to 1.96 dB were obtained. The optimal number and type of pictures to use in the temporal noise filter were also studied. The proposed single-stage encoder with an integrated noise filter incurs only a slight performance reduction compared to a more computationally complex two-stage system with a separate noise filter and encoder.
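The reuse of motion-compensated predictions for temporal filtering can be sketched as a weighted average of the current block with its predictions from multiple reference pictures. The weighting scheme below is a hypothetical illustration, not the paper's filter design.

```python
# Hedged sketch of motion-compensated temporal noise filtering. The
# motion-compensated predictions come "for free" from the encoder's
# multi-reference motion estimation; averaging them with the current
# block attenuates uncorrelated noise. Weights are hypothetical.

def temporal_filter(current_block, predicted_blocks, current_weight=0.5):
    """Blend a block (flat list of samples) with its motion-compensated
    predictions; the remaining weight is split evenly among them."""
    n = len(predicted_blocks)
    w = (1.0 - current_weight) / n
    return [
        current_weight * c + w * sum(p[i] for p in predicted_blocks)
        for i, c in enumerate(current_block)
    ]

# A noisy flat block (true value 100) and two clean motion-compensated
# predictions of the same area from previous reference pictures.
noisy = [104, 96, 102, 98]
preds = [[100, 100, 100, 100], [100, 100, 100, 100]]
filtered = temporal_filter(noisy, preds)
```

Each filtered sample moves halfway toward the prediction average, halving the noise amplitude in this toy case; in practice the weights would depend on the reliability of each reference.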