
Fine-Grained Image Classification Algorithm Based on Multi-Scale Feature Fusion and a Re-Attention Mechanism

Posted by site editor, Free考研考试 / 2022-01-16

Authors: He Kai, Feng Xu, Gao Shengnan, Ma Xitao
Unit: School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
Abstract: Fine-grained image classification aims to distinguish precisely among the subclasses of a single category. Because subclasses share similar features while poses vary and backgrounds interfere, the task has long been both a research focus and a challenge in computer vision and pattern recognition, and it has significant research value. The key is to accurately extract the discriminative regions of an image, and existing neural-network-based algorithms still fall short in fine feature extraction. To address this, this paper proposes a fine-grained image classification algorithm based on a multi-scale re-attention mechanism. Since high-level features carry rich semantic information and low-level features carry rich texture information, the attention mechanism is embedded at different scales to obtain richer feature information. In addition, channel attention and then spatial attention are applied in sequence to the input feature map, a process that can be regarded as re-attention over the feature matrix. Finally, the attention result is combined with the original input features in a residual manner, and the attention results from the feature maps of different scales are concatenated and fed into the fully connected layer, so that salient features are extracted more accurately. In experiments on the public fine-grained datasets CUB-200-2011, FGVC Aircraft, and Stanford Cars, the classification accuracy reaches 86.16%, 92.26%, and 93.40%, respectively, which is 1.66, 1.46, and 1.10 percentage points higher than using the ResNet50 structure alone. These results are clearly higher than those of existing classical algorithms and also exceed human performance, which verifies the effectiveness of the proposed algorithm.
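The data flow described in the abstract (channel attention, then spatial attention, a residual connection, and concatenation across scales) can be illustrated with a small dependency-free sketch. This is not the authors' implementation: the abstract does not specify the internals of the attention modules, so the gates below are plain sigmoids of average-pooled values with no learned weights, global average pooling before concatenation is an assumption, the fully connected classifier is omitted, and all function names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_attention(x):
    """Scale each channel of x ([C][H][W] nested lists) by a gate in (0, 1).

    Assumption: the gate is the sigmoid of the channel's global average;
    a real module would use learned weights.
    """
    out = []
    for channel in x:
        mean = sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        gate = sigmoid(mean)
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(x):
    """Scale each spatial position by a gate in (0, 1).

    Assumption: the gate is the sigmoid of the mean over channels.
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    gate = [[sigmoid(sum(x[c][i][j] for c in range(C)) / C) for j in range(W)]
            for i in range(H)]
    return [[[x[c][i][j] * gate[i][j] for j in range(W)] for i in range(H)]
            for c in range(C)]


def re_attention(x):
    """Channel attention followed by spatial attention, plus a residual add."""
    attended = spatial_attention(channel_attention(x))
    return [[[a + b for a, b in zip(row_a, row_x)]
             for row_a, row_x in zip(ch_a, ch_x)]
            for ch_a, ch_x in zip(attended, x)]


def global_avg_pool(x):
    """Reduce a [C][H][W] feature map to a length-C vector."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]


def multi_scale_descriptor(feature_maps):
    """Apply re-attention at every scale and concatenate the pooled results."""
    descriptor = []
    for fm in feature_maps:
        descriptor.extend(global_avg_pool(re_attention(fm)))
    return descriptor


# Toy feature maps standing in for a low-level and a high-level backbone stage.
low = [[[0.1 * (c + i + j) for j in range(4)] for i in range(4)] for c in range(2)]
high = [[[0.5 * (c - i + j) for j in range(2)] for i in range(2)] for c in range(3)]
descriptor = multi_scale_descriptor([low, high])
print(len(descriptor))  # 2 + 3 channels -> 5
```

In a full model, the concatenated descriptor would be passed to the fully connected layer for classification, and the feature maps would come from different stages of a backbone such as ResNet50.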
Keywords: fine-grained image classification; multi-scale feature fusion; re-attention mechanism; ResNet50

Full-text PDF download: http://xbzrb.tju.edu.cn/#/digest?ArticleID=6532