KEYWORDS: 3D modeling, Visualization, 3D acquisition, Visual compression, 3D image processing, Video, Process modeling, Performance modeling, Visual process modeling, Quantization, Neural networks, Deep learning
This paper provides insight into compression of 3D scenes represented by a neural implicit function. The goal of this paper is to introduce implicit neural representations for 3D scenes, such as NeRF (Neural Radiance Fields), and to propose a novel compression method for neural implicit 3D scene representations. We also analyze the compression performance of 3D scene representation using a neural implicit function.
KEYWORDS: Volume rendering, Video, Video compression, High efficiency video coding, Video coding, Voxels, Image fusion, Image compression, Discontinuities
The Versatile Video Coding (VVC) [1] standard doubles the number of intra prediction modes and most probable mode (MPM) candidates compared to the previous standard, High Efficiency Video Coding (HEVC) [2]. The MPM list is used to efficiently encode the intra prediction mode based on neighboring intra-coded blocks. VVC improves compression performance by increasing the number of intra prediction modes and MPM candidates as video resolution increases, but this may be inefficient for texture maps, whose characteristics differ from those of natural images. In this paper, we propose an efficient MPM candidate derivation for the texture maps of meshes built from Truncated Signed Distance Field (TSDF) [3] volumes for multi-view images. The proposed method shows a 0.92% BD-rate gain for the luma component in the random-access configuration [4].
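To make the MPM mechanism concrete, the following is a minimal sketch of how an MPM candidate list can be derived from the intra modes of the left and above neighboring blocks. It loosely follows the spirit of the VVC 6-entry MPM list (neighbor modes first, then Planar/DC and angular offsets); it is not the exact standard derivation and not the texture-map-specific method proposed in the paper.

```python
# Toy MPM list derivation from the intra modes of the left and above
# neighbors. Mode numbering follows VVC: 0 = Planar, 1 = DC, 2..66 angular.
# This is an illustrative simplification, not the normative VVC process.

PLANAR, DC = 0, 1


def derive_mpm_list(left_mode, above_mode):
    """Return up to 6 candidate intra modes, most probable first."""
    mpm = []

    def push(mode):
        if mode not in mpm:
            mpm.append(mode)

    # Neighbor modes are the strongest predictors of the current mode.
    push(left_mode)
    push(above_mode)
    # Fill with the non-angular modes.
    push(PLANAR)
    push(DC)
    # For angular neighbor modes, add their +/-1 angular neighbors
    # (wrapping within the angular range 2..66).
    for base in (left_mode, above_mode):
        if base >= 2:
            push(2 + (base - 2 - 1) % 65)
            push(2 + (base - 2 + 1) % 65)
    return mpm[:6]
```

An encoder would signal an index into this list when the chosen intra mode is among the candidates, which is cheaper than coding the mode directly.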
KEYWORDS: Video, Video compression, Machine vision, Distortion, Video coding, Signal processing, Image compression, Visual process modeling, Networks, Image processing
We previously trained the compression network by optimizing bit rate and distortion (feature-domain MSE) [1]. In this paper, we propose a feature map compression method for video coding for machines (VCM) based on a deep learning-based compression network that is jointly trained to optimize both the compressed bit rate and machine vision task performance. We use the bmshj2018-hyperprior model from CompressAI [2] as the compression network and compress the feature map output by the stem layer of the Faster R-CNN X101-FPN network in Detectron2 [3]. We evaluated the proposed method with the MPEG VCM evaluation framework; it outperforms the VVC anchor of MPEG VCM.
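The joint training objective described above can be sketched as a rate-distortion-task Lagrangian. The sketch below stubs out the networks (the actual codec is CompressAI's bmshj2018-hyperprior and the task network is Faster R-CNN); the function names and lambda values are illustrative assumptions, not the paper's settings.

```python
# Sketch of a joint rate-distortion-task loss for feature-map compression
# for machines: L = R + lambda_d * D + lambda_t * T, where R is the bit
# rate of the compressed feature map, D the feature-domain MSE, and T the
# machine vision task loss. Lambda values here are arbitrary placeholders.

def joint_vcm_loss(bits_per_pixel, feature_mse, task_loss,
                   lmbda_dist=0.01, lmbda_task=1.0):
    """Rate-distortion-task Lagrangian used for joint training."""
    return bits_per_pixel + lmbda_dist * feature_mse + lmbda_task * task_loss
```

In rate-distortion-only training the `task_loss` term is absent; adding it lets the codec spend bits on the feature structure the downstream detector actually needs.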
The Joint Video Experts Team (JVET) is developing a new video coding standard beyond High Efficiency Video Coding (HEVC), named Versatile Video Coding (VVC). VVC adopts various new prediction modes compared to HEVC, and Combined Inter-Intra Prediction (CIIP) is one of them. CIIP combines inter prediction and intra prediction with derived weights to form the final prediction. In the existing CIIP, the weights are derived from the prediction modes of the two adjacent blocks to the left and above, and only the planar mode is used as the intra prediction mode. In this paper, we propose methods to enhance CIIP with more accurate combination weights and with extended intra prediction modes based on the coding modes of adjacent blocks. According to empirical observations, the below-left and above-right blocks are correlated with the left and above blocks, respectively, in terms of prediction mode. The first proposed method therefore derives finer weight values using the prediction modes of up to three adjacent blocks (left, above, and above-left) among the five adjacent blocks used to derive regular merge candidates. The second proposed method combines the intra-coded modes of the left and above adjacent blocks, which are used to derive MPM candidates, instead of the planar mode used in the current CIIP. Experimental results show that the proposed methods slightly improve the performance of CIIP in the VVC Test Model (VTM).
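The existing CIIP blending that the proposal refines can be sketched as follows: a weight is chosen from how many of the left/above neighbors are intra-coded, then the two predictions are averaged with integer rounding, mirroring the VVC scheme of weights {1, 2, 3} out of a total of 4. Prediction blocks are plain lists of sample values here; this is an illustration of the baseline, not the proposed finer-weight method.

```python
# Baseline CIIP-style blending: the intra weight grows with the number of
# intra-coded neighbors (left, above), and the final prediction is
# (w_intra * P_intra + w_inter * P_inter + 2) >> 2, as in VVC CIIP.

def ciip_weight(left_is_intra, above_is_intra):
    """Return (w_intra, w_inter) summing to 4."""
    n_intra = int(left_is_intra) + int(above_is_intra)
    w_intra = {0: 1, 1: 2, 2: 3}[n_intra]
    return w_intra, 4 - w_intra


def ciip_blend(pred_intra, pred_inter, w_intra, w_inter):
    """Per-sample weighted average with rounding offset and >>2 shift."""
    return [(w_intra * pi + w_inter * pe + 2) >> 2
            for pi, pe in zip(pred_intra, pred_inter)]
```

The paper's first method replaces the two-neighbor weight derivation with one driven by up to three neighbors; the second replaces the fixed planar intra prediction with MPM-derived modes.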