Accurately reconstructing the topology and texture details of three-dimensional (3D) objects from a single two-dimensional image is a significant challenge in computer vision. Existing methods have achieved varying degrees of success with different geometric representations, but all struggle to accurately reconstruct surfaces with complex topology and texture. This study therefore proposes an approach that combines the convolutional block attention module (CBAM), texture detail fusion, and multimodal fusion to address this challenge. To strengthen the model's focus on important image regions, we integrate CBAM with ResNet for feature extraction. Texture detail fusion captures variations on the object's surface, while multimodal fusion improves the accuracy of signed distance function prediction. We develop an implicit single-view 3D reconstruction network that recovers the topology and surface details of a 3D model from a single input image. Integrating global, local, and surface texture features improves shape representation and accurately captures surface textures. During reconstruction, we extract features encoding global information, local information, and texture variation from the input image. Global information approximates the object's overall shape, local information refines shape and surface texture details, and distinct loss terms constrain different aspects of the reconstruction; together, these allow our method to reconstruct accurate 3D models with detailed surface textures from a single image. Qualitative and quantitative analyses demonstrate the superiority of our model over state-of-the-art techniques on the ShapeNet dataset.
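The abstract mentions integrating CBAM with ResNet for feature extraction but does not detail the block itself. As a hedged sketch, the standard CBAM block applies channel attention followed by spatial attention to a feature map; the PyTorch implementation below follows that standard design (the `reduction` ratio and 7x7 kernel are the usual defaults, not values stated by the paper):

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Channel attention: shared MLP over global avg- and max-pooled features."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return torch.sigmoid(avg + mx)  # (B, C, 1, 1) channel weights


class SpatialAttention(nn.Module):
    """Spatial attention: conv over channel-wise avg and max maps."""

    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = torch.mean(x, dim=1, keepdim=True)
        mx, _ = torch.max(x, dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))


class CBAM(nn.Module):
    """CBAM block: channel attention, then spatial attention."""

    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.ca = ChannelAttention(channels, reduction)
        self.sa = SpatialAttention(kernel_size)

    def forward(self, x):
        x = x * self.ca(x)
        return x * self.sa(x)
```

In practice such a block is inserted after a ResNet stage so the attended feature map emphasizes image regions relevant to the object's geometry and texture.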
The significance of our work lies in its ability to enhance the quality of single-view implicit 3D reconstruction by effectively integrating these features, leading to a more robust and detailed reconstruction of 3D models from single images. The source code of this work is available online at https://github.com/YangPeppa/ITIR-Net.
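The abstract states that fused global, local, and texture features condition a signed distance function prediction. A minimal sketch of that idea, assuming (hypothetically, since the abstract gives no architecture) that the three feature vectors are concatenated per query point and fed with the point coordinates into an MLP that outputs a signed distance:

```python
import torch
import torch.nn as nn


class SDFHead(nn.Module):
    """Predicts a signed distance per 3D query point, conditioned on fused
    image features. Layer widths are illustrative, not from the paper."""

    def __init__(self, feat_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + feat_dim, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 1),  # scalar signed distance
        )

    def forward(self, points, feats):
        # points: (B, N, 3) query coordinates
        # feats:  (B, N, feat_dim) fused global/local/texture features,
        #         gathered per point (e.g., global features broadcast,
        #         local/texture features sampled at the point's projection)
        return self.net(torch.cat([points, feats], dim=-1)).squeeze(-1)


def fuse_features(global_f, local_f, texture_f):
    """Hypothetical fusion by concatenation along the feature dimension."""
    return torch.cat([global_f, local_f, texture_f], dim=-1)
```

The zero-level set of the predicted signed distance field can then be extracted with marching cubes to obtain the final mesh.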