Multi-View Stereo Based on Attention and Epipolar Geometry
First published: 2024-11-14
Abstract: Multi-View Stereo (MVS) based on deep learning can be roughly divided into four stages: feature extraction, cost volume construction, cost volume regularization, and cost regression. In the feature extraction stage, a Feature Pyramid Network (FPN) is commonly used to extract features from the reference and source images; however, FPN struggles to capture global information within an image and relational information across images. In the cost volume construction stage, the variance or the inner product of feature vectors is typically used to fuse the feature volumes into a cost volume, with all participating feature vectors assigned equal weights by default, and non-Lambertian surfaces and weakly textured regions remain difficult to reconstruct. To address these issues, this paper proposes a Transformer-based multi-view stereo model, AE2MVSNet (Attention and Epipolar Geometry to MVSNet). In the feature extraction stage, self-attention and cross-attention mechanisms extract global information within images and relational information between images, respectively. In the cost volume construction stage, an attention mechanism is applied in conjunction with MVS principles to build the cost volume. Experimental results show that on the DTU and Tanks and Temples public datasets, AE2MVSNet outperforms existing MVS models on multiple metrics, including accuracy, completeness, and F-score.
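As context for the fusion step the abstract contrasts itself against, the conventional approach (as in MVSNet) fuses the warped feature volumes by per-element variance across views, so every view contributes with equal weight. Below is a minimal NumPy sketch of that baseline, not the authors' AE2MVSNet code; the shapes (channels C, depth hypotheses D, spatial H×W) are illustrative assumptions:

```python
import numpy as np

def variance_cost_volume(feature_volumes):
    """Fuse N warped feature volumes, each of shape (C, D, H, W), into one
    cost volume by per-element variance across views. This is the standard
    equal-weight fusion; attention-based variants instead learn per-view
    weights to downweight unreliable (e.g. occluded) views."""
    stack = np.stack(feature_volumes, axis=0)   # (N, C, D, H, W)
    mean = stack.mean(axis=0)                   # (C, D, H, W)
    return ((stack - mean) ** 2).mean(axis=0)   # (C, D, H, W)

# Usage: three views, 8 channels, 4 depth hypotheses, 5x5 spatial grid.
rng = np.random.default_rng(0)
vols = [rng.standard_normal((8, 4, 5, 5)) for _ in range(3)]
cost = variance_cost_volume(vols)
print(cost.shape)  # (8, 4, 5, 5)
```

At the correct depth hypothesis the warped features agree across views, so the variance (and hence the cost) is low; the equal weighting is exactly what breaks down on non-Lambertian surfaces, where some views disagree for photometric rather than geometric reasons.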
Keywords: multi-view stereo; attention mechanism; epipolar geometry; deep learning