Multimodal target fusion
First published: 2024-03-29
Abstract: Traditional object detection algorithms and conventional convolutional-neural-network detection frameworks rely on images from a single modality, so they struggle in complex environments and under poor lighting, and their detection accuracy is low. To address this, a Transformer-based multimodal object detection network is proposed that fuses texture-rich visible-light images with thermal infrared images; the two complementary modalities are used to reconstruct the target image and improve detection accuracy. To fully extract the features of targets in each modality, a variant feature extraction module built on the Transformer architecture is designed, and the YOLOv5 network performs object detection on the fusion results. The module adaptively extracts complementary feature information from the two modalities and feeds it into the feature fusion and reconstruction network to produce the fused output. Experiments show that the proposed fusion network improves the quality of the fused images and thereby improves detection accuracy.
Keywords: Transformer; multimodal object detection; feature extraction network; infrared and visible-light images; adaptive feature extraction
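The abstract describes a Transformer-style module that adaptively extracts complementary features from the two modalities before fusion and reconstruction, but the paper's exact architecture is not reproduced here. The following is only a minimal sketch of one common way such cross-modal fusion is built: a pair of cross-attention layers in PyTorch in which each modality attends to the other. All names (CrossModalFusion, vis_feat, ir_feat) and dimensions are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: cross-attention fusion of visible and infrared
# feature tokens. This is NOT the authors' module; names and sizes are assumed.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Fuse visible and infrared feature tokens with mutual cross-attention."""
    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        # Each modality attends to the other, so complementary details
        # (visible texture, infrared thermal contrast) can be exchanged.
        self.vis_to_ir = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ir_to_vis = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_vis = nn.LayerNorm(dim)
        self.norm_ir = nn.LayerNorm(dim)
        self.proj = nn.Linear(2 * dim, dim)  # merge the two attended streams

    def forward(self, vis_tokens: torch.Tensor, ir_tokens: torch.Tensor) -> torch.Tensor:
        # vis_tokens, ir_tokens: (batch, num_tokens, dim)
        vis_att, _ = self.vis_to_ir(query=vis_tokens, key=ir_tokens, value=ir_tokens)
        ir_att, _ = self.ir_to_vis(query=ir_tokens, key=vis_tokens, value=vis_tokens)
        vis_out = self.norm_vis(vis_tokens + vis_att)  # residual + norm
        ir_out = self.norm_ir(ir_tokens + ir_att)
        return self.proj(torch.cat([vis_out, ir_out], dim=-1))  # fused tokens

# Toy usage: 196 tokens (e.g. a 14x14 feature grid) per modality.
fusion = CrossModalFusion(dim=256, num_heads=8)
vis_feat = torch.randn(1, 196, 256)  # tokens from the visible-light branch
ir_feat = torch.randn(1, 196, 256)   # tokens from the thermal-infrared branch
fused = fusion(vis_feat, ir_feat)    # (1, 196, 256), passed on to reconstruction
```

In the pipeline described by the abstract, the fused features would be decoded into a fused image that is then fed to YOLOv5 for detection; for a quick experiment, a pretrained YOLOv5 model can be loaded with `torch.hub.load('ultralytics/yolov5', 'yolov5s')` and run on the reconstructed image.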