Journal of Communication University of China (Science and Technology)

AttentionRanker: A Self-Cross Attention Mechanism Based on Ranking Optimization

Submitted:

2023/8/20

DOI:

Keywords:

image matching; attention mechanism; sparse algorithm

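Funding:

Radio, Television and Online Audiovisual Medium- and Long-Term Science and Technology Plan project "Integrated Application of Immersive, Interactive, Full-Space Free-Viewpoint Live Broadcasting Technology for Smart TV Terminals" (2022AF0300)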

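Name	Affiliation
Zhao Yanming (赵艳明)	School of Information and Communication Engineering, Communication University of China
Lin Meixiu (林美秀)	School of Information and Communication Engineering, Communication University of China
Zeng Shuyao (曾姝瑶)	School of Information and Communication Engineering, Communication University of China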

Abstract:

Image matching is key to accurate camera pose estimation. Attention-based deep learning methods have advanced image matching considerably in recent years, but reducing the high computational complexity of Transformer-style matching networks remains a major challenge. To improve the efficiency of the matching network, this paper proposes a self-cross attention mechanism based on ranking optimization. The position-encoded one-dimensional input feature map is reshaped, and a spatial-attention-like mechanism selects the Top-m most active pixels to sparsify the attention map, which reduces the time complexity of dot-product attention from quadratic to near-linear. Experimental results show that the method shortens forward-inference time and improves pose estimation accuracy to some extent.
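
The Top-m idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring rule or say how unselected pixels are handled, so everything below is an assumption made for illustration: the function names (`activity_scores`, `top_m_indices`, `sparse_attention`) are hypothetical, activity is scored as the channel-wise mean plus max of each pixel, and inactive pixels simply receive the mean of all values. It is not the authors' implementation.

```python
import math


def activity_scores(features):
    """Score each pixel as channel-wise mean + max.

    Assumption: this rule is borrowed from common spatial-attention designs;
    the abstract does not state how pixel activity is measured.
    """
    return [sum(f) / len(f) + max(f) for f in features]


def top_m_indices(scores, m):
    """Return the indices of the m highest-scoring pixels, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:m]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_attention(queries, keys, values, m):
    """Dot-product attention evaluated only for the Top-m active queries.

    Assumption: inactive queries fall back to the mean of all values.
    Only m rows of the attention map are computed, so the attention cost is
    O(m * N * d) instead of O(N^2 * d); ranking adds an O(N log N) sort.
    """
    n, d = len(queries), len(queries[0])
    dv = len(values[0])
    mean_value = [sum(v[c] for v in values) / len(values) for c in range(dv)]
    output = [list(mean_value) for _ in range(n)]
    for i in top_m_indices(activity_scores(queries), m):
        # Scaled dot product of the active query with every key.
        logits = [sum(q * k for q, k in zip(queries[i], key)) / math.sqrt(d)
                  for key in keys]
        weights = softmax(logits)
        output[i] = [sum(w * v[c] for w, v in zip(weights, values))
                     for c in range(dv)]
    return output
```

Passing queries, keys and values from the same image corresponds to self-attention, while taking keys and values from the other image of the pair corresponds to cross-attention. With m fixed or growing slowly with the number of pixels N, the cost grows close to linearly in N, which is the behavior the abstract describes.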

References: