Multi-View Stereo Based on Attention and Epipolar Geometry
First published: 2024-11-14
Abstract: Multi-View Stereo (MVS) based on deep learning can be roughly divided into four stages: feature extraction, cost volume construction, cost volume regularization, and cost regression. In the feature extraction stage, a Feature Pyramid Network (FPN) is commonly used to extract features from the reference and source images; however, FPN struggles to capture global information within an image and relational information across images. In the cost volume construction stage, the variance or the inner product of feature vectors is typically used to fuse the feature volumes into a cost volume, with all participating feature vectors assigned equal weights by default, and non-Lambertian surfaces and weakly textured regions remain difficult to reconstruct. To address these issues, this paper proposes a Transformer-based multi-view stereo model, AE2MVSNet (Attention and Epipolar Geometry to MVSNet). In the feature extraction stage, self-attention and cross-attention mechanisms extract global information within images and relational information between images, respectively. In the cost volume construction stage, an attention mechanism is applied in conjunction with MVS principles to build the cost volume. Experimental results show that on the DTU and Tanks and Temples public datasets, AE2MVSNet outperforms existing MVS models on multiple metrics, including accuracy, completeness, and F-score.
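As context for the fusion step the abstract contrasts itself against, the conventional approach (as in MVSNet) fuses the warped feature volumes by per-element variance across views, so every view contributes with equal weight. Below is a minimal NumPy sketch of that baseline, not the authors' AE2MVSNet code; the shapes (channels C, depth hypotheses D, spatial H×W) are illustrative assumptions:

```python
import numpy as np

def variance_cost_volume(feature_volumes):
    """Fuse N warped feature volumes, each of shape (C, D, H, W), into one
    cost volume by per-element variance across views. This is the standard
    equal-weight fusion; attention-based variants instead learn per-view
    weights to downweight unreliable (e.g. occluded) views."""
    stack = np.stack(feature_volumes, axis=0)   # (N, C, D, H, W)
    mean = stack.mean(axis=0)                   # (C, D, H, W)
    return ((stack - mean) ** 2).mean(axis=0)   # (C, D, H, W)

# Usage: three views, 8 channels, 4 depth hypotheses, 5x5 spatial grid.
rng = np.random.default_rng(0)
vols = [rng.standard_normal((8, 4, 5, 5)) for _ in range(3)]
cost = variance_cost_volume(vols)
print(cost.shape)  # (8, 4, 5, 5)
```

At the correct depth hypothesis the warped features agree across views, so the variance (and hence the cost) is low; the equal weighting is exactly what breaks down on non-Lambertian surfaces, where some views disagree for photometric rather than geometric reasons.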
Keywords: multi-view stereo; attention mechanism; epipolar geometry; deep learning