KEYWORDS: Virtual reality, Structured light, 3D modeling, Point clouds, Data acquisition, Cameras, Infrared radiation, Calibration, Sensors, Data processing
In this paper, we propose a real-time, high-precision virtual meeting system based on the principles of infrared structured light 3D reconstruction, which enables immersive interaction for users within a virtual environment. The focus of the research is the application of structured light technology to building an efficient VR virtual meeting system. Starting from the background of VR technology in virtual meeting applications, the current challenges, and the potential advantages of structured light technology, we establish the theoretical foundation for the study. The architecture and implementation of the system are detailed, with particular emphasis on the use of structured light for spatial scanning and participant tracking. Methods for assessing system performance, including user experience and efficiency, are introduced. Performance test results, supported by quantitative analysis through charts and images, highlight the significant role of structured light technology in enhancing VR meeting experiences and system efficiency. The overall performance of the system is analyzed, covering the role of the technology, comparisons with existing solutions, potential applications, and limitations. The paper concludes by summarizing the main achievements of the VR virtual meeting system, emphasizing the importance of structured light technology in creating compelling VR meeting experiences, and proposing directions for future research.
Portable optical target measurement systems have widespread applications in fields such as mechanical manufacturing, aerospace, industrial inspection, and clinical medicine. This study addresses the marker point matching problem in binocular measurement systems. Through image preprocessing and mathematical algorithms, we achieve robust, unique matching of marker points. We first convert color images to grayscale and apply the Otsu algorithm to adaptively select a global threshold, successfully extracting the marker points. Using the nearest neighbor algorithm, we simplify 28 marker points to 7. We then employ the k-means clustering algorithm to perform a second fitting, obtaining marker points lying on the central line of the target. From these marker points, we compute key features describing the target pose, including the direction vector of the target's central line and its angle with respect to the image's x-axis. Finally, we design separate algorithm modules to achieve robust matching of marker points across 360 degrees of free poses.
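As an illustration of the extraction steps described above, the following is a minimal sketch using OpenCV, assuming bright circular markers on a dark background; the function names and parameters are illustrative and not taken from the paper's implementation.

```python
import cv2
import numpy as np

def extract_marker_centroids(image_bgr):
    """Binarize with Otsu's global threshold and return marker centroids (x, y)."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    n, _, _, centroids = cv2.connectedComponentsWithStats(binary)
    return centroids[1:]  # drop the background component

def central_line_angle_deg(points):
    """Angle between the best-fit line through the marker points and the x-axis."""
    line = cv2.fitLine(points.astype(np.float32), cv2.DIST_L2, 0, 0.01, 0.01)
    vx, vy = line[0].item(), line[1].item()   # unit direction vector of the line
    return np.degrees(np.arctan2(vy, vx))
```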
Infrared images depict the thermal radiation of objects and are not affected by conditions such as the natural environment and climate, while visible images have high spatial resolution, rich detail, and high contrast. Infrared and visible image fusion exploits the advantages of both optical bands to obtain a fused image with clear targets and rich background information. In this paper, we propose a novel deep learning based model with a new network structure and loss function. The network consists of an auto-encoder and deep residual shrinkage modules. We introduce multiple deep residual shrinkage blocks into the encoder to learn adaptive soft-threshold parameters for denoising both infrared and visible images. Without increasing the complexity of the model, feature enhancement and extraction are implemented within the network to maximize the retention of useful information, and an average fusion strategy is then used to obtain the fused features. Finally, the fused image is reconstructed by the decoder. The loss function combines pixel loss, structural similarity loss, and gradient loss, thus better preserving the texture details and edge information of the image. Experiments are performed on publicly available datasets. Qualitative results show that the fused images obtained by our method are clearer, more natural, and in line with human visual perception. Quantitative results show that the proposed model achieves optimal or sub-optimal values compared with state-of-the-art image fusion algorithms.
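To make the composite loss concrete, here is a minimal PyTorch sketch of a pixel + structural-similarity + gradient loss of the kind described; the reference choices, weights, and simplified mean-window SSIM are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def gradients(img):
    """Forward-difference image gradients in x and y."""
    return img[..., :, 1:] - img[..., :, :-1], img[..., 1:, :] - img[..., :-1, :]

def ssim_loss(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """1 - SSIM with a simple 3x3 mean window (no Gaussian weighting)."""
    mu_x = F.avg_pool2d(x, 3, 1, 1)
    mu_y = F.avg_pool2d(y, 3, 1, 1)
    var_x = F.avg_pool2d(x * x, 3, 1, 1) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, 3, 1, 1) - mu_y ** 2
    cov = F.avg_pool2d(x * y, 3, 1, 1) - mu_x * mu_y
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return 1 - ssim.clamp(0, 1).mean()

def fusion_loss(fused, ir, vis, w_pix=1.0, w_ssim=1.0, w_grad=10.0):
    ref = torch.maximum(ir, vis)          # assumed pixel-intensity reference
    pix = F.l1_loss(fused, ref)
    ssim = 0.5 * (ssim_loss(fused, ir) + ssim_loss(fused, vis))
    gfx, gfy = gradients(fused)
    gvx, gvy = gradients(vis)             # visible band carries most texture
    grad = F.l1_loss(gfx, gvx) + F.l1_loss(gfy, gvy)
    return w_pix * pix + w_ssim * ssim + w_grad * grad
```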
Intensity and phase information have been the most important similarity measures for the general stereo matching problem. Intensity carries most of the imaging information of the scene or object, while phase reflects the local structure of images and is more robust than grayscale values. Plenty of work has been done on intensity-based and phase-based stereo matching methods; however, neither works well enough when the input images are taken under varying illumination. A robust depth recovery method that properly exploits both the intensity and phase information of stereo images is proposed. Firstly, 2D signal analysis is conducted using the multiscale monogenic wavelet transform, from which local phase and intensity amplitude information are extracted at different scales. Secondly, disparity maps are estimated at each scale based on the intensity information. Thirdly, the optimal disparity is obtained by a weighted combination of the disparity maps across scales, with the weighting coefficients computed from the phase information. Extensive experimental evaluation demonstrates the benefits of the proposed method.
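For reference, a single-scale sketch of the monogenic signal construction via the frequency-domain Riesz transform is given below; the multiscale wavelet decomposition of the paper is omitted, and the DC-removal step is an assumption made so the phase reflects local structure.

```python
import numpy as np

def monogenic_phase_amplitude(img):
    """Local phase and amplitude of the monogenic signal of a 2D image."""
    img = img.astype(np.float64)
    img = img - img.mean()                 # remove DC so phase reflects structure
    rows, cols = img.shape
    u = np.fft.fftfreq(cols)[None, :]
    v = np.fft.fftfreq(rows)[:, None]
    radius = np.sqrt(u ** 2 + v ** 2)
    radius[0, 0] = 1.0                     # avoid division by zero at DC
    F = np.fft.fft2(img)
    r1 = np.real(np.fft.ifft2(F * (-1j * u / radius)))   # Riesz x-component
    r2 = np.real(np.fft.ifft2(F * (-1j * v / radius)))   # Riesz y-component
    amplitude = np.sqrt(img ** 2 + r1 ** 2 + r2 ** 2)
    phase = np.arctan2(np.sqrt(r1 ** 2 + r2 ** 2), img)  # local phase in [0, pi]
    return phase, amplitude
```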
In a structured light-based 3D scanning system, the complete 3D information of the measured object cannot be retrieved automatically in a single scan. Current 3D registration algorithms can be divided into auxiliary-object-based methods and feature-point-based methods. The former requires extra calibration objects or positioning platforms, which limits its application to free-form 3D scanning tasks. The latter can run automatically; however, most such methods recover the motion matrix from extracted 2D features, which has been shown to be inaccurate. This paper proposes an automatic and accurate full-view registration method for a 3D scanning system. Instead of estimating the coarse motion matrix from the 3D information of detected feature points, the 3D points reconstructed by the scanning system itself are utilized. Firstly, robust SIFT features are extracted from each image, and matching point pairs are obtained between two adjacent left images. Secondly, all 3D point clouds are reprojected onto the image plane of each left camera to obtain the corresponding 2D image points, and correct matches among the reprojected points are filtered out under the guidance of the extracted SIFT matches. Then, the covariance method is adopted to estimate the coarse registration matrix between adjacent positions, and this procedure is repeated for every pair of adjacent viewing positions. Lastly, a fast ICP algorithm performs fine registration of the multi-view point clouds. Experiments conducted on real data verify the effectiveness and accuracy of the proposed method.
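A covariance-based rigid motion estimate of this kind admits a closed-form SVD solution (the Kabsch method); the sketch below assumes the corresponding 3D points of two adjacent views are already paired, which is what the reprojection-guided filtering provides.

```python
import numpy as np

def coarse_registration(P, Q):
    """Rigid transform (R, t) minimizing ||R @ P_i + t - Q_i|| over matches.

    P, Q: (N, 3) arrays of corresponding 3D points from adjacent views."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)              # 3x3 covariance of centered points
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # fix reflection
    R = Vt.T @ D @ U.T
    t = cq - R @ cp
    return R, t
```

The resulting (R, t) serves as the coarse initialization that the fast ICP stage then refines.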
With the development of the marine economy, the demand for underwater 3D imaging technology has become more pressing. Due to absorption and scattering in water, the projection distance and imaging range are shortened, which directly affects the applicability and effectiveness of structured light techniques in underwater detection. In this paper, the imaging model of an underwater camera is studied, an underwater structured light imaging system is established, and a binary coding algorithm is investigated. The experimental results show that the proposed system achieves high-accuracy measurement and has great potential in underwater observation and engineering applications.
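The paper's exact binary coding scheme is not detailed here, so as a representative example, the following sketch generates classic binary-reflected Gray-code stripe patterns, a common choice for binary structured light coding.

```python
import numpy as np

def gray_code_patterns(width, height, n_bits):
    """n_bits vertical-stripe patterns encoding each column index in Gray code.

    Requires 2**n_bits >= width so that every column gets a unique code."""
    cols = np.arange(width)
    gray = cols ^ (cols >> 1)              # binary-reflected Gray code of column index
    return [np.tile(((gray >> (n_bits - 1 - b)) & 1).astype(np.uint8) * 255,
                    (height, 1))           # MSB-first stripe images
            for b in range(n_bits)]
```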
3D measurement of underwater targets can recover the 3D morphology of objects and scenes in water, with extensive application prospects in submarine map drawing, underwater resource exploration, and marine archaeology, among others. 3D reconstruction based on stereoscopic vision plays an increasingly important role in measurement due to its advantages of high automation, speed, accuracy, and non-contact operation. However, its application in underwater target detection is limited by the complex underwater environment and by the absorption and scattering of light in water, which seriously degrade the quality of image collection. In this paper, a 3D reconstruction method for underwater targets based on multi-view stereo vision is studied, and an underwater 3D profilometry system is set up. Multi-view image data are collected with a single camera and a rotating device. Firstly, the camera's back-projection model is used to calibrate the motion and parameters of the underwater vision system. Secondly, the underwater target is fixed on the rotating device, and a series of images is collected from different viewpoints. Then, feature detection and matching are carried out, and dense surface point clouds are generated through several expansion and filtering steps. Finally, from the dense point cloud, the 3D geometric mesh model of the target is obtained by Poisson reconstruction, and color and texture are fused into the mesh to obtain a high-fidelity model.
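For the final cloud-to-mesh step, here is a minimal sketch using Open3D's Poisson surface reconstruction; the file names and the octree depth are placeholders, not values from the paper.

```python
import open3d as o3d

# Dense point cloud produced by the multi-view stereo pipeline (placeholder path).
pcd = o3d.io.read_point_cloud("dense_cloud.ply")
pcd.estimate_normals()                             # Poisson needs oriented normals
pcd.orient_normals_consistent_tangent_plane(k=30)
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9)                                  # octree depth controls detail
o3d.io.write_triangle_mesh("target_mesh.ply", mesh)
```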
This paper proposes applying Bi-dimensional Empirical Mode Decomposition (BEMD) to the dense disparity estimation problem. BEMD is a fully data-driven method that requires no predetermined filter or wavelet functions; it is locally adaptive and has clear advantages in analyzing nonlinear and non-stationary signals. Firstly, the stereo images are each decomposed by the 2D sifting process of BEMD, yielding a series of Intrinsic Mode Functions (IMFs) and a residue, where the residue represents the DC component of the signal. Secondly, the residue is subtracted from the original image, so that the resulting two-dimensional signals can be regarded as free of disturbing components such as noise and illumination. Subsequently, to obtain robust local structure information, the Riesz transform is applied to obtain the corresponding 2D analytic signals of the images. Thirdly, the local phase of the analytic signals is extracted; the similarity of local phase, rather than local intensity, is taken as the basis of the matching cost, which reveals local structure more robustly. Finally, the dense disparity map is estimated with the winner-takes-all (WTA) strategy, computing the disparity of each pixel separately. Comparative experiments against intensity-based methods show that rather good results are achieved.
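A minimal sketch of WTA disparity selection over a phase-similarity cost volume follows; the cost construction (absolute wrapped phase difference, no aggregation window) is a simplifying assumption.

```python
import numpy as np

def wta_disparity(phase_left, phase_right, max_disp):
    """Pick, per pixel, the disparity with the smallest local-phase difference."""
    h, w = phase_left.shape
    cost = np.full((max_disp, h, w), np.inf)
    for d in range(max_disp):
        diff = phase_left[:, d:] - phase_right[:, :w - d or None]
        # wrap phase differences into [-pi, pi] before taking the magnitude
        cost[d, :, d:] = np.abs(np.arctan2(np.sin(diff), np.cos(diff)))
    return np.argmin(cost, axis=0)         # winner-takes-all over disparities
```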
In this paper, the application of polarimetric imaging to the detection and characterization of rust-preventive oil films is discussed. A three-channel polarimetric imaging system is introduced, which can obtain degree-of-linear-polarization images in a single shot. The experimental results show that the proposed three-channel polarimetric imaging system can identify the oil film on the steel strip quickly and effectively, making it a fast and reliable detection method.
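As a sketch of the degree-of-linear-polarization computation, the following assumes the three channels sit behind linear polarizers at 0, 45, and 90 degrees; the actual analyzer angles of the system may differ.

```python
import numpy as np

def dolp(i0, i45, i90):
    """Degree of linear polarization from three linear-polarizer intensities."""
    s0 = i0 + i90                 # Stokes S0: total intensity
    s1 = i0 - i90                 # Stokes S1
    s2 = 2.0 * i45 - s0           # Stokes S2
    return np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, 1e-6)
```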
Calibration is a critical step for a projector-camera-based structured light system (SLS). Conventional SLS calibration methods usually use the calibrated camera to calibrate the projector, and the calibration parameters are optimized by minimizing two-dimensional (2-D) reprojection errors. Here, a three-dimensional (3-D)-based method is proposed for the optimization of SLS calibration parameters. The system is first calibrated with traditional methods to obtain the primary calibration parameters. Then, a reference plane with precisely printed markers is used to optimize the primary calibration results. Three metric error criteria are introduced to evaluate the 3-D reconstruction accuracy of the reference plane. By treating all the system parameters as a global optimization problem and using the primary calibration parameters as initial values, a nonlinear multiobjective optimization problem can be established and solved. Compared with conventional calibration methods that apply 2-D reprojection errors to the camera and projector separately, the proposed procedure yields a globally optimal calibration result. Experimental results show that, with the optimized calibration parameters, the measurement accuracy and 3-D reconstruction quality of the system can be greatly improved.
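The paper does not spell out its three criteria here, so the sketch below shows two plausible 3-D metric errors of this kind on the reconstructed reference plane: plane flatness and marker-spacing error; such residuals could then be fed to a nonlinear least-squares solver such as scipy.optimize.least_squares with the primary parameters as the initial guess.

```python
import numpy as np

def plane_fit_rms(points):
    """RMS distance of reconstructed 3D points to their best-fit plane."""
    c = points.mean(axis=0)
    _, _, Vt = np.linalg.svd(points - c)
    normal = Vt[-1]                        # smallest singular vector = plane normal
    return np.sqrt(np.mean(((points - c) @ normal) ** 2))

def marker_spacing_error(markers_3d, nominal_spacing):
    """Mean absolute error of adjacent marker distances vs. the printed spacing."""
    d = np.linalg.norm(np.diff(markers_3d, axis=0), axis=1)
    return np.mean(np.abs(d - nominal_spacing))
```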
A binary shape-coded structured light method for single-shot three-dimensional reconstruction is presented. The projected pattern is composed of eight geometrical shapes with a coding window size of 2×2. The pattern element is designed as a rhombus with an embedded geometrical shape, and the pattern feature point is defined as the intersection of two adjacent rhombic shapes; a multi-template-based feature detector is presented for its robust detection and precise localization. Based on the extracted grid points, a topological structure is constructed to separate the pattern elements in the captured image. In the decoding stage, a training dataset is first established from samples collected from a variety of target surfaces, and a deep neural network is then applied to classify the pattern elements. Finally, an error correction algorithm based on epipolar and neighboring constraints refines the decoding results. The experimental results show that the proposed method not only attains high measurement precision but also shows strong robustness to surface color and texture.
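To illustrate the multi-template detection idea, here is a simplified OpenCV sketch that scores an image against several grid-point templates; the paper's detector and its subpixel refinement (e.g., via cv2.cornerSubPix) are more elaborate than this.

```python
import cv2
import numpy as np

def detect_grid_points(image_gray, templates, threshold=0.8):
    """Candidate grid-point locations where any template matches well.

    templates: list of small grayscale patches of the grid-point appearance."""
    points = []
    for tpl in templates:
        res = cv2.matchTemplate(image_gray, tpl, cv2.TM_CCOEFF_NORMED)
        ys, xs = np.where(res >= threshold)
        th, tw = tpl.shape
        points.extend(zip(xs + tw // 2, ys + th // 2))   # patch centers
    return points
```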
Existing binary defocusing techniques offer excellent measurement speed, but their measurement precision is limited. We propose a mixed binary defocusing method that combines the respective advantages of one-dimensional and two-dimensional modulation defocusing techniques, exploiting the frequency-dependent properties of the two kinds of methods to approximate sinusoidal fringe patterns. The optimized pulse width modulation technique is selected to produce high-frequency binary patterns, and the improved error diffusion dithering technique is selected to generate low-frequency patterns. The phase-shifting method is then adopted to obtain the wrapped phase from the defocused patterns, and the absolute phase is obtained from the wrapped phases at different frequencies with a multiple-wavelength phase unwrapping method. Using the root mean square error of the wrapped phase as the criterion, different defocusing methods are compared in simulation, and measured surfaces are compared on real objects. The results verify the frequency-dependent property of the two kinds of methods and show that the proposed method outperforms either binary defocusing technique used alone.
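For reference, the standard N-step phase-shifting recovery of the wrapped phase from the captured (defocused) fringe images looks as follows, assuming the usual fringe model I_n = A + B cos(phi + 2*pi*n/N).

```python
import numpy as np

def wrapped_phase(images):
    """Wrapped phase from N fringe images with phase shifts 2*pi*n/N (N >= 3)."""
    n = len(images)
    shifts = 2 * np.pi * np.arange(n) / n
    num = sum(img * np.sin(s) for img, s in zip(images, shifts))
    den = sum(img * np.cos(s) for img, s in zip(images, shifts))
    return -np.arctan2(num, den)           # wrapped into (-pi, pi]
```

The wrapped phases computed at the different fringe frequencies are then combined by the multiple-wavelength unwrapping step to recover the absolute phase.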
KEYWORDS: 3D modeling, 3D acquisition, 3D image processing, 3D image reconstruction, Light sources and illumination, Light emitting diodes, Cameras, Reflectivity, Calibration, LED lighting
A real-time method for three-dimensional (3-D) fingerprint acquisition is presented. The system is configured with only one camera and several white light-emitting diode lamps, and reconstruction is performed on the principle of photometric stereo. In the algorithm, a two-layer Hanrahan–Krueger model is proposed to represent the reflectance of the finger surface in place of the traditional Lambertian model. With the proposed lighting direction calibration and nonuniform lighting correction methods, the surface normal at each image point can be accurately estimated by solving a nonlinear optimization problem. Finally, a linear normal transformation is applied to enhance the 3-D models. Experiments are conducted on real fingerprints and palm prints, and the results are compared with traditional methods to show the feasibility and the improvement in reconstruction accuracy.
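For context, the classical Lambertian photometric-stereo baseline that the paper improves upon can be written as a per-pixel linear least-squares problem; the two-layer Hanrahan–Krueger model and the nonlinear solve of the paper are not reproduced here.

```python
import numpy as np

def photometric_stereo(images, light_dirs):
    """Lambertian normals by per-pixel least squares.

    images: (K, H, W) intensities under K lights; light_dirs: (K, 3) unit vectors."""
    K, H, W = images.shape
    I = images.reshape(K, -1)                           # K x (H*W)
    G, *_ = np.linalg.lstsq(light_dirs, I, rcond=None)  # G = albedo * normal, 3 x (H*W)
    albedo = np.linalg.norm(G, axis=0)
    normals = (G / np.maximum(albedo, 1e-8)).T.reshape(H, W, 3)
    return normals, albedo.reshape(H, W)
```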
Multiple-exposure-based methods have been an effective means for high dynamic range (HDR) imaging. Current methods depend heavily on tone mapping, and most are unable to accurately recover the local details and colors of the scene. In this work, we present a novel HDR method that uses multiple image cues in the image merging process. Firstly, all the images with various exposure times are divided into uniform sub-regions, and an exposure estimation technique is applied to judge the well-exposed one. Once the image blocks with the best exposure quality are selected, a blending function is proposed to remove the transition boundaries between these blocks. A fidelity metric index is introduced to assess the final fused image, and experimental results on public image libraries demonstrate its high performance.
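A minimal sketch of the block-wise exposure judgment follows: each uniform sub-region is scored by how close its pixels sit to mid-gray, a Mertens-style well-exposedness cue; the paper's exact metric is not given, so this scoring is an assumption.

```python
import numpy as np

def block_exposure_score(block, sigma=0.2):
    """Higher when pixel values (normalized to [0, 1]) cluster near mid-gray."""
    return np.exp(-((block - 0.5) ** 2) / (2 * sigma ** 2)).mean()

def pick_best_exposures(stack, block=32):
    """Per-block index of the best-exposed image.

    stack: (K, H, W) normalized exposures; H and W assumed divisible by block."""
    K, H, W = stack.shape
    best = np.zeros((H // block, W // block), dtype=int)
    for i in range(H // block):
        for j in range(W // block):
            tiles = stack[:, i*block:(i+1)*block, j*block:(j+1)*block]
            best[i, j] = int(np.argmax([block_exposure_score(t) for t in tiles]))
    return best
```

The hard per-block selection shown here is what the paper's blending function then smooths to remove the transition boundaries.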
A structured light system simplifies three-dimensional reconstruction by projecting a specially designed pattern onto the target object, thereby generating a distinct texture on it for imaging and further processing. The success of the system hinges upon what features are coded in the projected pattern, extracted in the captured image, and matched between the projector's display panel and the camera's image plane. The codes have to be largely preserved in the image data through illumination from the projector, reflection from the target object, and projective distortion in the imaging process, and the features need to be reliably extractable in the image domain. In this article, a two-dimensional pseudorandom pattern consisting of rhombic color elements is proposed, and the grid points between the pattern elements are chosen as the feature points. We describe how a type classification of the grid points, combined with the pseudorandomness of the projected pattern, equips each grid point with a unique label that is preserved in the captured image. We also present a grid point detector that extracts the grid points without segmenting the pattern elements and localizes them with subpixel accuracy. Extensive experiments illustrate that, with the proposed pattern feature definition and feature detector, more feature points can be reconstructed with higher accuracy in comparison with existing pseudorandomly encoded structured light systems.
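The unique-label property rests on a 2-D code matrix in which every coding window is distinct. The construction used in the paper is not given here; as an illustration, the following rejection-sampling sketch generates such a matrix for modest sizes, with the alphabet size and window size as illustrative parameters.

```python
import numpy as np

def pseudorandom_array(rows, cols, symbols=8, w=2, seed=0, max_tries=10000):
    """Random code matrix whose every w x w window is unique.

    Rejection sampling only works for modest sizes; larger patterns need a
    constructive method (e.g., perfect maps)."""
    rng = np.random.default_rng(seed)
    n_windows = (rows - w + 1) * (cols - w + 1)
    for _ in range(max_tries):
        arr = rng.integers(0, symbols, size=(rows, cols))
        windows = {tuple(arr[i:i+w, j:j+w].ravel())
                   for i in range(rows - w + 1)
                   for j in range(cols - w + 1)}
        if len(windows) == n_windows:      # all windows distinct
            return arr
    raise RuntimeError("no unique-window array found; adjust parameters")
```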
With the continuous effort of the electronics industry to miniaturize device size, the task of inspecting the various electrical parts becomes increasingly difficult. For instance, solder bumps grown on wafers for direct die-to-die bonding need to have their 3D shape inspected to assure electrical contact and to prevent damage to the processing equipment or to the dies themselves during bonding. Yet the inspection task is made difficult by the tiny size and the highly specular, textureless nature of the bump surfaces. In an earlier work we proposed a mechanism for reconstructing such highly specular micro-surfaces as wafer bumps; however, that mechanism recovers 3D positions only. In this paper we describe a new mechanism that also recovers surface orientations, which are just as important in describing a surface. The mechanism is based upon projecting light from a point or parallel light source onto the inspected surface through a specially designed binary grid. The grid consists of a number of black and transparent blocks, resembling a checkerboard. By shifting the grid in space a number of times in a direction not parallel to either boundary of the grid elements, and each time taking a separate image of the illuminated surface, we can determine the surface orientations of the inspected surface at the points that appear in the image data as grid corners. Experimental results on real objects illustrate the effectiveness of the proposed mechanism.