Lightweight Video Frame Interpolation Based on Bidirectional Attention
First published: 2023-06-21
Abstract: Video frame interpolation is an important topic in computer vision. It raises a video's frame rate by synthesizing non-existent intermediate frames between two consecutive frames, improving the video's temporal continuity and smoothness. Existing methods interpolate poorly in difficult cases such as large-displacement motion and occlusion, where the synthesized intermediate frames often suffer from artifacts and motion blur. Moreover, most interpolation methods struggle to balance model performance against complexity, trading large parameter counts and heavy computation for higher objective evaluation metrics. To address these problems, this paper proposes a lightweight video frame interpolation model built around a bidirectional attention module. The module effectively combines channel attention with spatial attention, helping the model capture the dynamic regions of the input frames and strengthening the feature representations related to motion. A lightweight Ghost module is also introduced into the model; it synthesizes image features at a lower computational cost, substantially reducing the network's memory and compute consumption without noticeably degrading the model's original performance. Comparative experiments on the Middlebury dataset show that the proposed model improves the peak signal-to-noise ratio of the interpolated frames by 0.06 dB while using only about one tenth of the original parameters, fully validating the effectiveness of the method.
Keywords: Video Frame Interpolation; Deep Learning; Attention Mechanism; Model Compression
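The two ideas the abstract relies on — an attention module that gates features along both the channel and the spatial dimension, and a Ghost module that replaces part of an expensive convolution with cheap per-channel transforms — can be illustrated with a minimal NumPy sketch. This is not the paper's architecture: the weights below are random placeholders for learned parameters, and `np.tanh` stands in for the cheap depthwise convolution of GhostNet-style modules.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, reduction=4):
    # x: (C, H, W). Squeeze the spatial dims, pass the channel vector
    # through a small bottleneck MLP, and emit one gate per channel.
    c = x.shape[0]
    squeezed = x.mean(axis=(1, 2))                 # (C,)
    rng = np.random.default_rng(0)                 # placeholder for learned weights
    w1 = rng.standard_normal((c // reduction, c)) * 0.1
    w2 = rng.standard_normal((c, c // reduction)) * 0.1
    gate = sigmoid(w2 @ np.maximum(w1 @ squeezed, 0.0))  # (C,)
    return x * gate[:, None, None]

def spatial_attention(x):
    # x: (C, H, W). Pool across channels (avg + max) and gate every
    # spatial position; a learned conv would replace the plain sum here.
    avg = x.mean(axis=0)                           # (H, W)
    mx = x.max(axis=0)                             # (H, W)
    gate = sigmoid(avg + mx)                       # (H, W)
    return x * gate[None, :, :]

def ghost_module(x, ratio=2):
    # Ghost idea: compute only C/ratio "primary" channels with the
    # expensive op, then derive the remaining "ghost" channels with a
    # cheap per-channel transform and concatenate.
    c = x.shape[0]
    primary = x[: c // ratio]                      # stands in for a full conv
    cheap = np.tanh(primary)                       # stands in for a depthwise conv
    return np.concatenate([primary, cheap], axis=0)
```

Applying `channel_attention` and `spatial_attention` in sequence reweights features first per channel, then per pixel, which is one plausible reading of "combining channel attention and spatial attention"; the Ghost module keeps the output channel count while halving (at `ratio=2`) the channels that need the expensive computation.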