21 June 2022 Improved stereo matching framework with embedded multilevel attention
Bohan Li, Juan Du, James Okae
Abstract

The recent advent of deep convolutional neural networks (CNNs) in stereo matching has led to significant improvements. However, current CNN methods still struggle to incorporate hierarchical context information with global dependencies, and their feature representations lack the discriminative ability needed to resolve matching ambiguities in ill-conditioned regions. To address these problems, we propose an improved stereo matching framework that joins a stereo backbone network and an embedded, independent multilevel attention subnetwork in an end-to-end trainable pipeline. The stereo backbone network applies residual atrous spatial pyramid pooling integrated with channelwise attention to capture richer multiscale contextual information and selectively enhance discriminative features. The resulting unary features are then concatenated to construct the cost volume for disparity prediction. To further improve performance, the embedded multilevel attention subnetwork learns globally coherent contextual information to generate three attention streams, which are used, respectively, to boost the unary feature representations with spatial encoding, enhance the quality of the cost volume, and refine the disparity map. We show that appending the proposed multilevel attention subnetwork to the stereo backbone network produces significant improvements in matching accuracy. The experimental results on Scene Flow and KITTI 2012/2015 demonstrate that our method achieves competitive performance in stereo matching.
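As an illustration of the cost volume step mentioned in the abstract, the following is a minimal NumPy sketch of a concatenation-based cost volume. The abstract states only that unary features are concatenated. The tensor layout (2C, D, H, W), the zero filling of columns with no valid match, and the names `concat_cost_volume`, `left_feat`, `right_feat`, and `max_disp` are assumptions chosen for illustration, not details confirmed by the paper.

```python
import numpy as np


def concat_cost_volume(left_feat, right_feat, max_disp):
    """Build a concatenation-based cost volume (illustrative sketch).

    left_feat, right_feat: unary feature maps of shape (C, H, W).
    Returns an array of shape (2C, max_disp, H, W).
    The layout and zero filling are assumptions, not the paper's stated design.
    """
    c, h, w = left_feat.shape
    volume = np.zeros((2 * c, max_disp, h, w), dtype=left_feat.dtype)
    for d in range(max_disp):
        # Left pixel at column x is paired with right pixel at column x - d.
        # Columns x < d have no counterpart in the right image and stay zero.
        volume[:c, d, :, d:] = left_feat[:, :, d:]
        volume[c:, d, :, d:] = right_feat[:, :, : w - d]
    return volume


# Toy inputs: 4 feature channels, 6 x 8 feature maps, 3 disparity levels.
left = np.random.rand(4, 6, 8).astype(np.float32)
right = np.random.rand(4, 6, 8).astype(np.float32)
vol = concat_cost_volume(left, right, max_disp=3)
print(vol.shape)  # (8, 3, 6, 8)
```

In a full network, a volume of this kind would typically be passed to 3D convolutions that regress disparity. Under the abstract's description, one of the three attention streams would act on this volume, but how it does so is not specified here.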

© 2022 SPIE and IS&T 1017-9909/2022/$28.00
Bohan Li, Juan Du, and James Okae "Improved stereo matching framework with embedded multilevel attention," Journal of Electronic Imaging 31(3), 033037 (21 June 2022). https://doi.org/10.1117/1.JEI.31.3.033037
Received: 15 January 2022; Accepted: 2 June 2022; Published: 21 June 2022
KEYWORDS
Convolution

Feature extraction

Network architectures

Computer programming

Network on a chip

Content addressable memory

Visualization
