
Road Scene Depth Estimation Based on a Pyramid Pooling Network


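周武杰 (Wujie ZHOU)1, 2,
潘婷 (Ting PAN)1,
顾鹏笠 (Pengli GU)1,
翟治年 (Zhinian ZHAI)1
1. School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023
2. College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027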
Funding: National Natural Science Foundation of China (61502429); Zhejiang Provincial Natural Science Foundation (LY18F020012)

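Details
About the authors: Wujie ZHOU: male, born in 1983, associate professor, Ph.D.; research interests: computer vision and pattern recognition, deep learning
Ting PAN: female, born in 1994, master's degree; research interests: computer vision and pattern recognition
Pengli GU: male, born in 1989, master's degree; research interests: computer vision and pattern recognition
Zhinian ZHAI: male, born in 1977, lecturer, Ph.D.; research interest: deep learning
Corresponding author: Wujie ZHOU, wujiezhou@163.com
CLC number: TP391.4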

Publication history

Received: 2018-10-12
Revised: 2019-05-21
Published online: 2019-05-28
Issue date: 2019-10-01

Depth Estimation of Monocular Road Images Based on Pyramid Scene Analysis Network

Wujie ZHOU1, 2,
Ting PAN1,
Pengli GU1,
Zhinian ZHAI1
1. School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China
2. College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China
Funds: National Natural Science Foundation of China (61502429); Zhejiang Provincial Natural Science Foundation (LY18F020012)


Abstract
To address the insufficient prediction accuracy of depth estimation from monocular images, a road scene depth estimation method based on a pyramid pooling network is proposed. First, a combination of four residual network blocks extracts features from the road scene image, and upsampling then gradually restores the feature maps to the original image size; adding multiple residual blocks increases the depth of the network model. Considering the diversity of information at different scales during upsampling, the feature maps of each size produced during feature extraction are fused with the feature maps of the same size produced during upsampling, which improves the accuracy of depth estimation. In addition, a pyramid pooling network block performs scene parsing on the high-level features extracted by the four residual network blocks; its output feature maps are restored to the original image size and fed into the prediction layer together with the output of the upsampling module. Experiments on the KITTI dataset show that the proposed method outperforms existing depth estimation methods.
Key words: Monocular vision / Depth estimation / Neural network / Pyramid pooling network
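
The pyramid pooling step named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract gives no layer details, so the bin sizes `(1, 2, 3, 6)` are an assumption borrowed from common pyramid pooling designs, nearest-neighbour upsampling stands in for whatever interpolation the paper uses, and the convolutions a real network would apply to each branch are omitted. The function names `adaptive_avg_pool`, `upsample_nearest`, and `pyramid_pool` are hypothetical.

```python
from typing import List, Sequence

Grid = List[List[float]]  # one feature-map channel, indexed as grid[row][col]


def adaptive_avg_pool(fmap: Grid, bins: int) -> Grid:
    """Average-pool a 2D map down to a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    pooled: Grid = []
    for i in range(bins):
        r0 = (i * h) // bins              # floor of the region start
        r1 = -(-((i + 1) * h) // bins)    # ceiling of the region end
        row = []
        for j in range(bins):
            c0 = (j * w) // bins
            c1 = -(-((j + 1) * w) // bins)
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(fmap: Grid, out_h: int, out_w: int) -> Grid:
    """Nearest-neighbour upsampling back to out_h x out_w."""
    h, w = len(fmap), len(fmap[0])
    return [
        [fmap[(r * h) // out_h][(c * w) // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def pyramid_pool(fmap: Grid, bin_sizes: Sequence[int] = (1, 2, 3, 6)) -> List[Grid]:
    """Return the input plus one context map per pyramid level.

    Each level summarises the scene at a different scale; stacking the
    returned maps as channels corresponds to the concatenation step.
    The default bin sizes are an assumption, not taken from the paper.
    """
    h, w = len(fmap), len(fmap[0])
    branches = [fmap]
    for bins in bin_sizes:
        coarse = adaptive_avg_pool(fmap, bins)
        branches.append(upsample_nearest(coarse, h, w))
    return branches


if __name__ == "__main__":
    demo = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
    levels = pyramid_pool(demo)
    print(len(levels), "maps of size", len(levels[0]), "x", len(levels[0][0]))
```

The coarsest level (one bin) carries a single global average, while finer levels keep progressively more spatial layout, which is why the pooled maps are combined with the original features rather than replacing them.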



Full-text PDF download:

https://jeit.ac.cn/article/exportPdf?id=4263691f-69f3-476f-ba67-b890c49f57ab