
Action Recognition with Cross-Layer Fusion and Multi-Model Voting


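Huilan LUO,
Fei LU,
Yuan YAN
School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China
Funding: National Natural Science Foundation of China (61462035, 61862031), Young Scientist Training Project of Jiangxi Province (20153BCB23010), Natural Science Foundation of Jiangxi Province (20171BAB202014)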

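Details
About the authors: Huilan LUO: female, born in 1974, Ph.D., professor; research interests include machine learning and pattern recognition
Fei LU: male, born in 1994, master's student; research interests include action recognition in video and semantic image segmentation
Yuan YAN: male, born in 1991, master's student; research interests include action recognition in video
Corresponding author: Huilan LUO luohuilan@sina.com
Chinese Library Classification number: TP391.4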

Publication history

Received: 2018-04-24
Revised: 2018-11-02
Published online: 2018-11-12
Issue date: 2019-03-01

Action Recognition Based on Multi-model Voting with Cross Layer Fusion

Huilan LUO,
Fei LU,
Yuan YAN
School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China
Funds: National Natural Science Foundation of China (61462035, 61862031), Young Scientist Training Project of Jiangxi Province (20153BCB23010), Natural Science Foundation of Jiangxi Province (20171BAB202014)


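Abstract
Abstract: To address the loss of action features as they propagate through a convolutional neural network, as well as overfitting of the network model, this paper proposes an action recognition method based on a cross-layer fusion model and voting among multiple models. In the preprocessing stage, rank pooling is used to aggregate the motion information in a video and generate an approximate dynamic image. A structure that horizontally flips the feature information is placed before the fully connected layers, forming the non-fusion model. On top of the non-fusion model, a structure that fuses the output features of the second layer with those of the fifth layer is added, giving the cross-layer fusion model. During training, the two basic models (non-fusion and cross-layer fusion) are each trained with three data splits and two frame orders for generating the approximate dynamic images, yielding multiple different classifiers. At test time, each classifier makes a prediction, and the results are combined by voting to give the final classification. On the UCF101 dataset, the proposed non-fusion and cross-layer fusion models achieve a considerably higher recognition rate than the dynamic image network model; the multi-model voting method effectively alleviates overfitting, increases the robustness of the algorithm, and achieves better average performance.
Key words: Action recognition/
Cross layer fusion/
Multi-model voting/
Approximate dynamic image/
Horizontal flip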
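
The preprocessing step named in the abstract, rank pooling into an approximate dynamic image, can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the sketch assumes the commonly published approximate rank pooling weights, alpha_t = 2(T - t + 1) - (T + 1)(H_T - H_{t-1}), where H_t is the t-th harmonic number. The function names, the flattened-frame representation, and the `reverse` flag (standing in for the two frame orders) are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Frame = List[float]  # one flattened grayscale frame (hypothetical representation)


def rank_pooling_weights(num_frames: int) -> List[float]:
    """Assumed weights: alpha_t = 2(T - t + 1) - (T + 1)(H_T - H_{t-1}), t = 1..T."""
    if num_frames < 1:
        raise ValueError("need at least one frame")
    # harmonic[t] holds H_t = 1 + 1/2 + ... + 1/t, with H_0 = 0.
    harmonic = [0.0]
    for t in range(1, num_frames + 1):
        harmonic.append(harmonic[-1] + 1.0 / t)
    total = harmonic[num_frames]
    return [
        2.0 * (num_frames - t + 1) - (num_frames + 1) * (total - harmonic[t - 1])
        for t in range(1, num_frames + 1)
    ]


def approximate_dynamic_image(frames: Sequence[Frame], reverse: bool = False) -> Frame:
    """Collapse a clip into a single image as a weighted sum of its frames."""
    # reverse=True feeds the frames in reverse temporal order.
    ordered = list(reversed(frames)) if reverse else list(frames)
    weights = rank_pooling_weights(len(ordered))
    image = [0.0] * len(ordered[0])
    for weight, frame in zip(weights, ordered):
        for i, pixel in enumerate(frame):
            image[i] += weight * pixel
    return image


if __name__ == "__main__":
    clip = [[0.0, 1.0], [0.0, 2.0], [0.0, 4.0]]  # three tiny two-pixel frames
    print(approximate_dynamic_image(clip))
    print(approximate_dynamic_image(clip, reverse=True))
```

Under these assumed weights, later frames receive larger coefficients and the coefficients sum to zero, so static pixels cancel out and only pixels that change over time contribute to the image.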
Abstract: To address the loss of motion features as they propagate through deep convolutional neural networks, as well as overfitting of the network model, an action recognition method based on a cross-layer fusion model and multi-model voting is proposed. In the preprocessing stage, the motion information in a video is aggregated by rank pooling to form approximate dynamic images. Two basic models are presented. The first, which has two horizontal-flip layers, is called the "non-fusion model"; adding a structure that fuses the second layer with the fifth layer yields the second, called the "cross-layer fusion model". Both basic models are trained on three different data partitions. The forward and reverse frame sequences of each video are used to generate two approximate dynamic images, so training the two models on the different sets of approximate dynamic images produces many different classifiers. In testing, the final classification results are obtained by averaging the results of all these classifiers. Compared with the dynamic image network model, the non-fusion model and the cross-layer fusion model greatly improve the recognition rate on the UCF101 dataset. The multi-model voting method effectively alleviates overfitting of the model, increases the robustness of the algorithm, and achieves better average performance.
Key words: Action recognition/
Cross layer fusion/
Multi-model voting/
Approximate dynamic image/
Horizontal flip
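
The multi-model combination step can be sketched as follows. The abstract describes the method as multi-model voting and also says the final result comes from averaging the classifiers' results, so the sketch shows both readings: a hard majority vote and a soft vote over averaged scores. Two models, three data partitions, and two frame orders would give up to 2 × 3 × 2 = 12 classifiers if every combination is trained; the abstract does not state the exact count. The function names, the score format, and the tie-breaking rule are assumptions for illustration only.

```python
from collections import Counter
from typing import Sequence

Scores = Sequence[float]  # one classifier's per-class scores for a single video


def top_class(scores: Scores) -> int:
    """Index of the highest-scoring class."""
    return max(range(len(scores)), key=lambda c: scores[c])


def majority_vote(all_scores: Sequence[Scores]) -> int:
    """Hard voting: every classifier casts one vote for its top class."""
    votes = Counter(top_class(scores) for scores in all_scores)
    # Ties are broken by the lowest class index (an arbitrary, assumed rule).
    best = max(votes.values())
    return min(c for c, n in votes.items() if n == best)


def average_vote(all_scores: Sequence[Scores]) -> int:
    """Soft voting: average the per-class scores, then take the top class."""
    num_classifiers = len(all_scores)
    mean = [sum(column) / num_classifiers for column in zip(*all_scores)]
    return top_class(mean)


if __name__ == "__main__":
    # Three hypothetical classifiers scoring one video over two classes.
    predictions = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]
    print(majority_vote(predictions), average_vote(predictions))
```

The two rules can disagree when one classifier is very confident and the others are only marginally in favour of a different class, which is why the choice between them matters for the ensemble.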



Full-text PDF download link:

https://jeit.ac.cn/article/exportPdf?id=8429e853-af21-42c9-9eea-b088f7dd3430