KEYWORDS: Video coding, Super resolution, Video, High dynamic range imaging, Visualization, Tunable filters, Design and modelling, Artificial intelligence, Video processing, Scalable video coding
Super-resolution video coding describes the process of coding video at a lower resolution and upsampling the result. This process is included in the AV1 standard, which ensures that the same super-resolution process is employed on all receiving devices. Regrettably, the design is limited to horizontal scaling with a maximum scale factor of two. In this paper, we analyze the benefit of enabling two-dimensional upsampling with larger scale factors. Additionally, we consider the value of sending residual information to correct the super-resolution output. Results show a 6.3% and 5.6% improvement in coding efficiency for UHD SDR and UHD HDR content, respectively.
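As a rough illustration of the residual-correction idea, the following Python sketch computes an encoder-side residual against the upsampled low-resolution reconstruction and adds it back at the decoder. It is a minimal sketch under stated assumptions: the nearest-neighbour upsample() is a placeholder for the codec's normative interpolation filter, an 8-bit sample depth is assumed, and the function names are illustrative rather than taken from the paper.

```python
import numpy as np

def upsample(frame, scale):
    """Nearest-neighbour 2D upsampling placeholder (not the normative AV1 filter)."""
    h, w = frame.shape
    ys = np.clip((np.arange(int(h * scale)) / scale).astype(int), 0, h - 1)
    xs = np.clip((np.arange(int(w * scale)) / scale).astype(int), 0, w - 1)
    return frame[np.ix_(ys, xs)]

def encoder_side_residual(original, decoded_lowres, scale):
    """Residual that corrects the super-resolved (upsampled) picture toward the original."""
    sr = upsample(decoded_lowres, scale)
    return original.astype(np.int16) - sr.astype(np.int16)

def decoder_side_reconstruction(decoded_lowres, decoded_residual, scale):
    """Decoder adds the (possibly lossy-coded) residual to the upsampled picture."""
    sr = upsample(decoded_lowres, scale).astype(np.int16)
    return np.clip(sr + decoded_residual, 0, 255).astype(np.uint8)
```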
We explore the use of separate partitioning structures for luma and chroma channels in the design of next generation video codecs. The proposed methods are evaluated relative to the Quad-Tree, Ternary-Tree and Binary-Tree (QTTTBT) partitioning framework currently implemented in the BenchMark Set (BMS-1.0) software being used in the development of the Versatile Video Coding (VVC) project. VVC is the next generation video compression standard under development by the Joint Video Experts Team (JVET), which is a joint collaboration between MPEG and ITU-T. In the paper, the performance of using shared or separate partitioning tree structures for luma and chroma channels is measured for several sequences, including those used for the Joint Call for Proposals on video compression with capability beyond HEVC issued by MPEG/ITU-T, and trends are analyzed. The use of separate partitioning tree structures is restricted to intra coded regions. Objective performance is reported using the Bjøntegaard Delta (BD) bitrate, and visual observations are also provided. To demonstrate the efficacy of using different partition structures, bitrate savings are computed using simulations and show an average improvement of 0.46%(Y)/7.83%(Cb)/7.96%(Cr) relative to the state-of-the-art. It is asserted that the coding efficiency improvement is especially pronounced in sequences with occlusions/emergence of objects or dynamically changing content (e.g. fire, water, smoke). In the tests conducted, the Campfire sequence, in which a large portion of the picture exhibits a burning fire, shows the largest BD bitrate saving of 1.79%(Y)/5.45%(Cb)/1.82%(Cr).
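A minimal Python sketch of the separate-tree idea follows. It models only quadtree splits (not the full QTTTBT split set), and rd_cost() is a hypothetical callback standing in for a real rate-distortion search; the point is simply that, in intra regions, luma and chroma may carry independent split decisions.

```python
def build_tree(block, rd_cost, min_size=4):
    """Recursively decide quad-split vs. no-split for one channel.

    rd_cost(block) is a hypothetical callback returning (cost_no_split, cost_split);
    a real encoder would derive these from a full rate-distortion search.
    """
    h, w = block.shape
    cost_leaf, cost_split = rd_cost(block)
    if h <= min_size or w <= min_size or cost_leaf <= cost_split:
        return {"split": False}
    half_h, half_w = h // 2, w // 2
    return {"split": True,
            "children": [build_tree(block[y:y + half_h, x:x + half_w], rd_cost, min_size)
                         for y in (0, half_h) for x in (0, half_w)]}

def partition_intra_region(luma, chroma, rd_cost):
    """Separate trees: luma and chroma of an intra region are partitioned independently."""
    return build_tree(luma, rd_cost), build_tree(chroma, rd_cost)
```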
KEYWORDS: Video coding, High dynamic range imaging, Video, Video compression, Computer programming, Televisions, Semantic video, Distortion, CRTs, RGB color model
Displays capable of showing a greater range of luminance values can render content containing high dynamic range information in a way that gives viewers a more immersive experience. This paper introduces the design aspects of a high dynamic range (HDR) system and examines the performance of the HDR processing chain in terms of compression efficiency. Specifically, it examines the relation between the recently introduced Society of Motion Picture and Television Engineers (SMPTE) ST 2084 transfer function and the High Efficiency Video Coding (HEVC) standard. SMPTE ST 2084 is designed to cover the full range of an HDR signal from 0 to 10,000 nits; however, in many situations the valid signal range of the actual video is smaller than the range supported by SMPTE ST 2084. This restricted signal range results in a restricted range of code values for the input video data and adversely impacts compression efficiency. In this paper, we propose a code value remapping method that extends the restricted-range code values to the full range of code values so that existing standards such as HEVC may better compress the video content. The paper also identifies the related non-normative, encoder-only changes that are required by the remapping method for a fair comparison with the anchor. Results are presented comparing the efficiency of the current approach versus the proposed remapping method for HM-16.2.
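A minimal sketch of one plausible form of such a remapping is shown below, assuming a simple linear stretch of a 10-bit restricted code-value range to the full code-value range and its inverse after decoding. The exact remapping function and parameters used in the paper may differ; in_min and in_max here are simply the smallest and largest code values present in the ST 2084 coded signal.

```python
import numpy as np

def remap_to_full_range(code_values, in_min, in_max, bit_depth=10):
    """Linearly stretch restricted-range code values to the full code-value range."""
    full_max = (1 << bit_depth) - 1
    cv = code_values.astype(np.float64)
    out = (cv - in_min) / (in_max - in_min) * full_max
    return np.clip(np.round(out), 0, full_max).astype(np.uint16)

def inverse_remap(remapped, in_min, in_max, bit_depth=10):
    """Inverse mapping applied after decoding, before the ST 2084 EOTF."""
    full_max = (1 << bit_depth) - 1
    out = remapped.astype(np.float64) / full_max * (in_max - in_min) + in_min
    return np.clip(np.round(out), 0, full_max).astype(np.uint16)
```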
We present a novel technique for the problem of super-resolution of facial data. The method uses a patch-based technique, and for each low-resolution input image patch, we seek the best matching patches from a database of face images using the Coherency Sensitive Hashing technique. Coherency Sensitive Hashing relies on hashing to combine image coherence cues and image appearance cues to effectively find matching patches in images. This differs from existing methods that apply a high-pass filter on input patches to extract local features. We then perform a weighted sum of the best matching patches to get the enhanced image. We compare with state-of-the-art techniques and observe that the approach provides better performance in terms of both visual quality and reconstruction error.
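The patch aggregation step can be pictured with the short sketch below. Note the assumptions: a brute-force L2 search over a patch database stands in for the Coherency Sensitive Hashing matcher used in the paper, and the Gaussian weighting of patch distances is one plausible choice rather than the paper's exact weighting.

```python
import numpy as np

def enhance_patch(lr_patch, lr_patch_db, hr_patch_db, k=5, sigma=10.0):
    """Weighted sum of the k best-matching high-resolution patches.

    lr_patch_db / hr_patch_db are corresponding low/high-resolution patch pairs
    (each row a flattened patch) drawn from a face-image database. A brute-force
    L2 search stands in for Coherency Sensitive Hashing in this sketch.
    """
    d2 = np.sum((lr_patch_db - lr_patch.ravel()) ** 2, axis=1)
    idx = np.argsort(d2)[:k]                      # k nearest low-resolution patches
    w = np.exp(-d2[idx] / (2.0 * sigma ** 2))     # similarity weights
    w /= w.sum()
    return (w[:, None] * hr_patch_db[idx]).sum(axis=0)
```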
The high efficiency video coding (HEVC) standard being developed by ITU-T VCEG and ISO/IEC MPEG achieves a compression goal of reducing the bitrate by half for the same visual quality when compared with earlier video compression standards such as H.264/AVC. It achieves this goal with the use of several new tools, such as quad-tree based partitioning of data, larger block sizes, improved intra prediction, sophisticated prediction of motion information, and an in-loop sample adaptive offset process. This paper describes an approach where the HEVC framework is extended to achieve spatial scalability using a multi-loop approach. The enhancement-layer inter-predictive coding efficiency is improved by including within the decoded picture buffer multiple up-sampled versions of the decoded base layer picture. This approach has the advantage of achieving significant coding gains with a simple extension of the base layer tools such as inter-prediction and motion information signaling. Coding efficiency of the enhancement layer is further improved using an adaptive loop filter and internal bit-depth increment. The performance of the proposed scalable video coding approach is compared to simulcast transmission of video data using high efficiency model version 6.1 (HM-6.1). The bitrate savings are measured using the Bjøntegaard Delta (BD) rate for spatial scalability factors of 2 and 1.5 when compared with simulcast anchors. It is observed that the proposed approach provides average luma BD rate gains of 33.7% and 50.5%, respectively.
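A minimal sketch of how such an enhancement-layer reference picture list might be assembled is given below. The helper names and the bilinear_x2/lanczos_x2 filters in the usage comment are hypothetical; a real implementation would use the codec's interpolation filters at the chosen scale factor and the normative reference-list construction rules.

```python
def build_enhancement_reference_list(temporal_refs, decoded_base_picture, upsampling_filters):
    """Enhancement-layer reference list: temporal references plus several up-sampled
    versions of the decoded base-layer picture (one per up-sampling filter variant).

    Each entry of upsampling_filters is a callable picture -> up-sampled picture.
    """
    inter_layer_refs = [f(decoded_base_picture) for f in upsampling_filters]
    return list(temporal_refs) + inter_layer_refs

# Hypothetical usage: two up-sampling variants at a scale factor of 2.
# refs = build_enhancement_reference_list(
#     temporal_refs=[prev_el_picture],
#     decoded_base_picture=bl_picture,
#     upsampling_filters=[bilinear_x2, lanczos_x2],
# )
```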
In this paper, we consider the problem of video decoding for very high resolution image content. Our focus is on future applications, and our emphasis is on the specific problem of entropy decoding. Here, we introduce the concept of an "entropy slice" that partitions a bit-stream into units that can be individually entropy decoded without affecting the reconstruction process. This allows us to parallelize the entropy decoding process of an H.264/AVC decoder with little impact on coding performance. We compare the entropy slice technique to the standard H.264/AVC slice method for parallelization and observe that the proposed method improves coding efficiency. Specifically, compared to the standard slice method, results show an average bit-rate savings of 5.5%. As an additional contribution of this "entropy slice" concept, we also propose the use of a transcoder to convert an H.264/AVC compliant bit-stream to the parallelized entropy slice format. The transcoding operation has the desirable property of allowing highly parallel decoding of current, standards-compliant material without affecting the reconstructed image data.
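The decoding structure this enables can be sketched as follows. It is illustrative only: entropy_decode() and reconstruct() are placeholders for the CABAC/CAVLC parsing stage and the prediction/transform reconstruction stage, and a thread pool merely stands in for whatever parallel hardware the decoder has available.

```python
from concurrent.futures import ThreadPoolExecutor

def decode_frame(entropy_slices, entropy_decode, reconstruct):
    """Entropy-decode all entropy slices in parallel, then reconstruct the picture.

    entropy_decode(slice_bits) parses one entropy slice into syntax elements;
    reconstruct(all_syntax) performs prediction, inverse transform and filtering.
    """
    with ThreadPoolExecutor() as pool:
        # Each entropy slice resets the entropy-coder state, so slices parse independently.
        syntax_per_slice = list(pool.map(entropy_decode, entropy_slices))
    # Reconstruction still proceeds across slice boundaries, as in a single-slice picture.
    return reconstruct(syntax_per_slice)
```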
KEYWORDS: Computer programming, Video, Video compression, Image compression, Video coding, Image quality, Super resolution, Matrices, Systems modeling, Image filtering
In this paper, we consider the compression of high-definition video sequences for bandwidth-sensitive applications. We show that down-sampling the image sequence prior to encoding and then up-sampling the decoded frames increases compression efficiency. This is particularly true at lower bit-rates, as direct encoding of the high-definition sequence requires a large number of blocks to be signaled. We survey previous work that combines a resolution change with a compression mechanism. We then illustrate the success of our proposed approach through simulations. Both MPEG-2 and H.264 scenarios are considered. Given the benefits of the approach, we also interpret the results within the context of traditional spatial scalability.
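The basic pipeline can be summarised by the sketch below, under stated assumptions: encode/decode are placeholders for an MPEG-2 or H.264 codec, upsample is a placeholder interpolation filter, and the box-filter downsampler is a simplification of what an encoder would actually use.

```python
import numpy as np

def downsample(frame, factor):
    """Simple box-filter downsampling (a real system would use a better anti-alias filter)."""
    h, w = frame.shape
    h2, w2 = h // factor * factor, w // factor * factor
    blocks = frame[:h2, :w2].reshape(h2 // factor, factor, w2 // factor, factor)
    return blocks.mean(axis=(1, 3))

def coded_at_reduced_resolution(frame, factor, encode, decode, upsample):
    """Down-sample, code with a standard codec, then up-sample the decoded frame.

    encode/decode/upsample are placeholder callables; returns the reconstructed
    high-resolution frame and the size of the coded representation.
    """
    low = downsample(frame, factor)
    bitstream = encode(low)
    return upsample(decode(bitstream), factor), len(bitstream)
```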
Pre-processing algorithms improve the performance of a video compression system by removing spurious noise and insignificant features from the original images. This increases compression efficiency and attenuates coding artifacts. Unfortunately, determining the appropriate amount of pre-filtering is a difficult problem, as it depends on both the content of an image and the target bit-rate of the compression algorithm. In this paper, we explore a pre-processing technique that is loosely coupled to the quantization decisions of a rate control mechanism. This technique results in a pre-processing system that operates directly on the Displaced Frame Difference (DFD) and is applicable to any standard-compatible compression system. Results explore the effect of several standard filters on the DFD. An adaptive technique is then considered.
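A minimal sketch of the coupling idea follows. The dead-zone style threshold and its dependence on the quantization parameter are illustrative assumptions, not the paper's filters; the point is only that the pre-filter acts on the DFD and grows stronger as the rate controller chooses coarser quantization.

```python
import numpy as np

def prefilter_dfd(frame, prediction, qp, max_qp=51):
    """Attenuate small displaced-frame-difference values in proportion to the quantizer.

    frame and prediction are 8-bit luma pictures; qp is supplied by the rate controller.
    """
    dfd = frame.astype(np.int16) - prediction.astype(np.int16)
    threshold = 1 + 4 * qp / max_qp              # heavier filtering at coarser quantization
    filtered_dfd = np.where(np.abs(dfd) < threshold, 0, dfd)
    # The pre-processed frame handed to the encoder is prediction + filtered DFD.
    return np.clip(prediction.astype(np.int16) + filtered_dfd, 0, 255).astype(np.uint8)
```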
Multi-scale, multi-resolution image decompositions are efficacious for real-time target tracking applications. In these real-time systems, objects are initially located using coarse descriptions of the original image. These coarse-scale results then guide and refine further inspection, with queries of higher-resolution image representations restricted to regions of potential object occurrence. The result is the classical coarse-to-fine search. In this paper, we describe a method for generating an adaptive template within the coarse-to-fine framework. Causality properties between image representations are directly exploited and lead to a template mechanism that is resilient to noise and occlusion. With minimal computational requirements, the method is well suited for real-time application.
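The coarse-to-fine search that frames the method can be sketched as below; the adaptive-template construction itself is not modelled here, and downsample2/ssd_match are illustrative helpers rather than the paper's algorithm.

```python
import numpy as np

def downsample2(img):
    """2x decimation with a 2x2 mean (a crude pyramid level)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def ssd_match(img, tmpl, y0, y1, x0, x1):
    """Best SSD position of tmpl inside img, restricted to rows [y0,y1) and cols [x0,x1)."""
    th, tw = tmpl.shape
    best, best_pos = np.inf, (y0, x0)
    for y in range(y0, min(y1, img.shape[0] - th + 1)):
        for x in range(x0, min(x1, img.shape[1] - tw + 1)):
            d = np.sum((img[y:y + th, x:x + tw].astype(np.float64) - tmpl) ** 2)
            if d < best:
                best, best_pos = d, (y, x)
    return best_pos

def coarse_to_fine_track(img, tmpl, radius=4):
    """Locate tmpl coarsely at half resolution, then refine at full resolution near that hit."""
    cy, cx = ssd_match(downsample2(img), downsample2(tmpl), 0, 10**9, 0, 10**9)
    return ssd_match(img, tmpl,
                     max(0, 2 * cy - radius), 2 * cy + radius + 1,
                     max(0, 2 * cx - radius), 2 * cx + radius + 1)
```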
Multi-resolution image analysis utilizes subsampled image representations for applications such as image coding, hierarchical image segmentation and fast image smoothing. An anti-aliasing filter may be used to ensure that the sampled signals adequately represent the frequency components/features of the higher resolution signal. Sampling theories associated with linear anti-aliasing filtering are well defined, and conditions for nonlinear filters are emerging. This paper analyzes sampling conditions associated with anisotropic diffusion, an adaptive nonlinear filter implemented by partial differential equations (PDEs). Sampling criteria will be defined within the context of edge causality, and conditions will be prescribed that guarantee removal of all features unsupported in the sample domain. Initially, sampling definitions will utilize a simple, piecewise linear approximation of the anisotropic diffusion mechanism. Results will then demonstrate the viability of the sampling approach through the computation of reconstruction errors. Extension to more practical diffusion operators will also be considered.
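For concreteness, a standard Perona-Malik discretization of anisotropic diffusion followed by decimation is sketched below; it is not necessarily the exact operator or sampling criterion analysed in the paper, and the parameter values are illustrative.

```python
import numpy as np

def perona_malik_step(img, kappa=15.0, dt=0.2):
    """One explicit Perona-Malik diffusion iteration (4-neighbour discretization).

    The edge-stopping function g suppresses smoothing across gradients larger than kappa,
    which is why anti-aliasing behaviour before subsampling is non-trivial to analyse.
    Periodic boundaries (np.roll) are used for brevity.
    """
    n = np.roll(img, -1, axis=0) - img
    s = np.roll(img, 1, axis=0) - img
    e = np.roll(img, -1, axis=1) - img
    w = np.roll(img, 1, axis=1) - img
    g = lambda d: np.exp(-(d / kappa) ** 2)
    return img + dt * (g(n) * n + g(s) * s + g(e) * e + g(w) * w)

def diffuse_then_subsample(img, iterations=20, factor=2):
    """Diffuse until small-scale features are removed, then decimate."""
    out = img.astype(np.float64)
    for _ in range(iterations):
        out = perona_malik_step(out)
    return out[::factor, ::factor]
```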