Multimodal target fusion
First published: 2024-03-29
Abstract: Traditional object detection algorithms and conventional convolutional-neural-network detection frameworks rely on images from a single modality, so they struggle in complex environments and under poor lighting, and their detection accuracy is low. To address this, a Transformer-based multimodal object detection network is proposed that fuses texture-rich visible-light images with thermal infrared images; the two complementary modalities are used to reconstruct the target image and improve detection accuracy. To fully extract the features of targets in each modality, a variant feature extraction module built on the Transformer architecture is designed, and the YOLOv5 network performs object detection on the fusion results. The module adaptively extracts complementary feature information from the two modalities and feeds it into the feature fusion and reconstruction network to produce the fused output. Experiments show that the proposed fusion network improves the quality of the fused images and thereby improves detection accuracy.
Keywords: Transformer; multimodal object detection; feature extraction network; infrared and visible-light images; adaptive feature extraction
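The abstract describes a Transformer-style module that adaptively extracts complementary features from the two modalities before fusion and reconstruction, but the paper's exact architecture is not reproduced here. The following is only a minimal sketch of one common way such cross-modal fusion is built: a pair of cross-attention layers in PyTorch in which each modality attends to the other. All names (CrossModalFusion, vis_feat, ir_feat) and dimensions are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: cross-attention fusion of visible and infrared
# feature tokens. This is NOT the authors' module; names and sizes are assumed.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Fuse visible and infrared feature tokens with mutual cross-attention."""
    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        # Each modality attends to the other, so complementary details
        # (visible texture, infrared thermal contrast) can be exchanged.
        self.vis_to_ir = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ir_to_vis = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_vis = nn.LayerNorm(dim)
        self.norm_ir = nn.LayerNorm(dim)
        self.proj = nn.Linear(2 * dim, dim)  # merge the two attended streams

    def forward(self, vis_tokens: torch.Tensor, ir_tokens: torch.Tensor) -> torch.Tensor:
        # vis_tokens, ir_tokens: (batch, num_tokens, dim)
        vis_att, _ = self.vis_to_ir(query=vis_tokens, key=ir_tokens, value=ir_tokens)
        ir_att, _ = self.ir_to_vis(query=ir_tokens, key=vis_tokens, value=vis_tokens)
        vis_out = self.norm_vis(vis_tokens + vis_att)  # residual + norm
        ir_out = self.norm_ir(ir_tokens + ir_att)
        return self.proj(torch.cat([vis_out, ir_out], dim=-1))  # fused tokens

# Toy usage: 196 tokens (e.g. a 14x14 feature grid) per modality.
fusion = CrossModalFusion(dim=256, num_heads=8)
vis_feat = torch.randn(1, 196, 256)  # tokens from the visible-light branch
ir_feat = torch.randn(1, 196, 256)   # tokens from the thermal-infrared branch
fused = fusion(vis_feat, ir_feat)    # (1, 196, 256), passed on to reconstruction
```

In the pipeline described by the abstract, the fused features would be decoded into a fused image that is then fed to YOLOv5 for detection; for a quick experiment, a pretrained YOLOv5 model can be loaded with `torch.hub.load('ultralytics/yolov5', 'yolov5s')` and run on the reconstructed image.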