
Image Semantic Segmentation Based on Region and Deep Residual Network


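Huilan Luo1,
Fei Lu1,
Fansheng Kong2
1. School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China
2. School of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Funds: National Natural Science Foundation of China (61862031, 61462035), Natural Science Foundation of Jiangxi Province (20171BAB202014)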

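Details
Author biographies: Huilan Luo: female, born in 1974, Ph.D., professor; research interests include machine learning and pattern recognition
Fei Lu: male, born in 1994, M.S.; research interest is image semantic segmentation
Fansheng Kong: male, born in 1946, doctoral supervisor, professor; research interests include artificial intelligence and knowledge discovery
Corresponding author: Huilan Luo, luohuilan@sina.com
CLC number: TP391.41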

Publication history

Received: 2019-01-18
Revised: 2019-04-05
Published online: 2019-04-22
Issue date: 2019-11-01

Image Semantic Segmentation Based on Region and Deep Residual Network

Huilan LUO1,
Fei LU1,
Fansheng KONG2
1. School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China
2. School of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Funds: National Natural Science Foundation of China (61862031, 61462035), Natural Science Foundation of Jiangxi Province (20171BAB202014)


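Abstract (translated from the Chinese): This paper proposes a semantic segmentation model that combines regions with a deep residual network. Region-based semantic segmentation methods extract overlapping regions at multiple scales, which lets them recognize objects of various scales and obtain fine object segmentation boundaries. Methods based on fully convolutional networks use a Convolutional Neural Network (CNN) to learn features automatically and can be trained end to end for pixel-wise classification, but they usually produce coarse segmentation boundaries. This paper combines the advantages of the two approaches. First, a region generation network generates candidate regions in the image. The image is then passed through a deep residual network with dilated convolution to extract a feature map. The candidate regions and the feature map are combined to obtain region features, which are mapped onto every pixel in each region. Finally, a global average pooling layer is used for pixel-wise classification. The paper also uses multi-model fusion: the same network model is trained with different inputs to obtain multiple models, whose features are then fused at the classification layer to give the final segmentation result. Experiments on the SIFT FLOW and PASCAL Context datasets show that the proposed method achieves a relatively high mean accuracy.
Key words: Semantic segmentation / Region / Deep residual network / Ensemble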
Abstract: An image semantic segmentation model based on regions and a deep residual network is proposed. Region-based methods extract overlapping regions at multiple scales, which allows them to identify objects of different scales and obtain fine segmentation boundaries. Fully convolutional methods use a Convolutional Neural Network (CNN) to learn features automatically and can be trained end to end for pixel-wise classification, but they typically produce coarse segmentation boundaries. The proposed model combines the advantages of both. First, candidate regions are generated by a region generation network, and the image is fed through a deep residual network with dilated convolution to obtain a feature map. The candidate regions and the feature map are then combined to obtain region features, which are mapped to each pixel in the regions. Finally, a global average pooling layer is used for pixel-wise classification. Multiple models are obtained by training with candidate-region inputs of different sizes; at test time, the final segmentation is obtained by fusing the classification results of these models. Experimental results on the SIFT FLOW and PASCAL Context datasets show that the proposed method achieves higher average accuracy than some state-of-the-art algorithms.
Key words: Semantic segmentation / Region / Deep residual network / Ensemble
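
The abstract gives the pipeline only in outline, so the following is a rough sketch of two of its steps rather than the authors' implementation: mapping region features back onto pixels, and fusing the outputs of several models. Everything specific here is an assumption made for illustration: the box format `(y0, x0, y1, x1)`, the linear classifier `class_weights`, mean pooling inside each box, averaging over the regions that cover a pixel (one possible reading of the global average pooling step), and score averaging for fusion. The function names `region_to_pixel_scores` and `fuse_models` are hypothetical.

```python
def region_to_pixel_scores(feature_map, regions, class_weights):
    """Turn region-level features into per-pixel class scores.

    feature_map   : C x H x W nested lists (stand-in for a dilated ResNet output).
    regions       : non-empty half-open boxes (y0, x0, y1, x1); hypothetical format.
    class_weights : K x C weights of an assumed linear classifier.
    Returns K x H x W nested lists of scores.
    """
    num_channels = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    num_classes = len(class_weights)
    score_sum = [[[0.0] * width for _ in range(height)] for _ in range(num_classes)]
    cover_count = [[0] * width for _ in range(height)]

    for y0, x0, y1, x1 in regions:
        area = (y1 - y0) * (x1 - x0)
        # Region feature: mean of the feature vectors inside the box.
        region_feature = [
            sum(feature_map[c][y][x] for y in range(y0, y1) for x in range(x0, x1)) / area
            for c in range(num_channels)
        ]
        region_score = [
            sum(w * f for w, f in zip(class_weights[k], region_feature))
            for k in range(num_classes)
        ]
        # Map the region's score onto every pixel the region covers.
        for y in range(y0, y1):
            for x in range(x0, x1):
                cover_count[y][x] += 1
                for k in range(num_classes):
                    score_sum[k][y][x] += region_score[k]

    # Average over the regions covering each pixel; uncovered pixels keep score 0.
    return [
        [[score_sum[k][y][x] / max(cover_count[y][x], 1) for x in range(width)]
         for y in range(height)]
        for k in range(num_classes)
    ]


def fuse_models(score_maps):
    """Average the score maps of several models, then pick the best class per pixel."""
    num_models = len(score_maps)
    num_classes = len(score_maps[0])
    height, width = len(score_maps[0][0]), len(score_maps[0][0][0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            fused = [sum(m[k][y][x] for m in score_maps) / num_models
                     for k in range(num_classes)]
            row.append(max(range(num_classes), key=fused.__getitem__))
        labels.append(row)
    return labels


if __name__ == "__main__":
    # Toy input: channel 0 fires on the left half, channel 1 on the right half.
    features = [[[1.0, 1.0, 0.0, 0.0] for _ in range(4)],
                [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]]
    boxes = [(0, 0, 4, 2), (0, 2, 4, 4), (0, 1, 4, 3)]  # third box overlaps both
    scores = region_to_pixel_scores(features, boxes, [[1.0, 0.0], [0.0, 1.0]])
    for row in fuse_models([scores, scores]):
        print(row)
```

Because the assumed classifier is linear, averaging region scores per pixel gives the same result as averaging region features first and classifying afterwards; a non-linear classification head would not have that property.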



Full-text PDF download:

https://jeit.ac.cn/article/exportPdf?id=0d39eb41-f057-4605-bab2-2283ee6fa1f7