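Bo ZHAO1,
Chao HUANG1,
Youqi YAN1
1. School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China
2. Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing 100083, China
3. Beijing Engineering Research Center of Industrial Spectrum Imaging, Beijing 100083, China
Funding: National Key R&D Program of China, Key Special Project (2017YFB1400101-01); Fundamental Research Funds for the Central Universities, University of Science and Technology Beijing (FRF-BD-19-002A)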
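Author biographies: Fenhua WANG: female, born in 1971, associate professor and master's supervisor; research interests: pattern recognition and intelligent information processing
Bo ZHAO: male, born in 1994, master's student; research interest: computer vision
Chao HUANG: male, born in 1993, master's student; research interest: computer vision
Youqi YAN: male, born in 1997, master's student; research interest: computer vision
Corresponding author: Fenhua WANG, wangfenhua@ustb.edu.cn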
CLC number: TN911.73; TP391
Publication history
Received: 2019-12-13
Revised: 2020-06-17
Available online: 2020-07-20
Published: 2020-12-08
Person Re-identification Based on Multi-scale Network Attention Fusion
Fenhua WANG1, 2, 3,
Bo ZHAO1,
Chao HUANG1,
Youqi YAN1
1. School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China
2. Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing 100083, China
3. Beijing Engineering Research Center of Industrial Spectrum Imaging, Beijing 100083, China
Funds: National Key R&D Program of China, Key Special Project (2017YFB1400101-01); Fundamental Research Funds for the Central Universities, University of Science and Technology Beijing (FRF-BD-19-002A)
Abstract: Person re-identification depends critically on extracting discriminative pedestrian features, and convolutional neural networks have strong feature extraction and representation capabilities. Because different features can be observed at different scales, this paper proposes a person re-identification method based on the fusion of multi-scale and attention networks (Multi-Scale Attention Network, MSAN). The method samples features at different depths of the network and fuses the sampled features to predict pedestrian identity. Feature maps at different depths have different representational power, which allows the network to learn finer-grained pedestrian features. In addition, attention modules are embedded in the residual network so that the network focuses on key information, which strengthens its feature learning ability. The proposed method reaches rank-1 accuracies of 95.3%, 89.8% and 82.2% on the Market1501, DukeMTMC-reID and MSMT17_V1 datasets, respectively. Experiments show that the method makes full use of information from different network depths and of the attended key information, giving the model strong discriminative ability, and that its average accuracy is better than that of most state-of-the-art algorithms.
Key words: Person re-identification; Multi-scale; Attention; Residual network; Metric learning
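The abstract describes taking features at several network depths, applying attention, and fusing the results into one descriptor. The following is a minimal pure-Python sketch of that general idea only. The function names, the parameter-free sigmoid gate standing in for a learned attention module, and the toy feature maps are all assumptions made for illustration; the abstract does not specify the actual MSAN layers.

```python
import math

def global_avg_pool(fmap):
    """Collapse a feature map shaped [C][H][W] into a C-dimensional vector."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]

def channel_attention(fmap):
    """Reweight each channel by a sigmoid gate of its own mean activation.

    Assumption: a parameter-free stand-in for a learned attention module.
    """
    gates = [1.0 / (1.0 + math.exp(-m)) for m in global_avg_pool(fmap)]
    return [[[v * g for v in row] for row in ch] for ch, g in zip(fmap, gates)]

def fuse_multi_scale(fmaps):
    """Apply attention at each depth, pool, and concatenate the descriptors."""
    fused = []
    for fmap in fmaps:
        fused.extend(global_avg_pool(channel_attention(fmap)))
    return fused

# Hypothetical toy feature maps standing in for a shallow and a deep stage.
shallow = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]  # 2 channels, 2x2
deep = [[[2.0]], [[-2.0]], [[0.0]]]                               # 3 channels, 1x1

descriptor = fuse_multi_scale([shallow, deep])
print(len(descriptor))  # one entry per channel across both depths
```

Concatenation is used here as the simplest fusion rule; the fused descriptor would then feed an identity classifier or a metric-learning loss, neither of which is sketched.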
PDF full-text download:
https://jeit.ac.cn/article/exportPdf?id=4e051153-5070-4f99-9fe1-d52c406a243c