
An Improved Convolutional Neural Network Method for Indoor Depth Estimation

Posted by site editor, Free考研考试 / 2022-01-16

Authors: Liang Yu (梁煜), Zhang Jinming (张金铭), Zhang Wei (张为)
Unit: School of Microelectronics, Tianjin University, Tianjin 300072, China
Abstract: Indoor depth estimation from a single image suffers from a lack of salient local or global features. To address this, an encoder-decoder structure is proposed that is built from multiple networks: a fully convolutional network (FCN) combined with a channel attention network (SENet) and with a residual network (ResNet). The network adopts an end-to-end learning framework. First, the fully convolutional squeeze-and-excitation module (FCSE_block), which combines the FCN with SENet, serves as the encoder. It obtains a global receptive field from channel information to improve the accuracy of the feature maps, and replaces fully connected layers with convolutional layers where appropriate to reduce the number of network parameters. Then an up-sampling module that combines the FCN with ResNet serves as the decoder. ResNet's skip connections allow the decoder to be deepened, which improves the accuracy of the depth map; combining the convolutional network with ResNet keeps the framework end-to-end and reduces the running time of the network. Finally, the L1 loss function is used to optimize the model. On the public NYU Depth v2 dataset, experimental results show that, compared with other existing monocular depth estimation methods, the proposed model not only simplifies the tedious process of refining a coarse depth map but also predicts more accurate depth maps. The threshold accuracy improves by at least 0.5%, and the average running time of the network is 21 ms, which lays a foundation for real-time performance and gives the method both theoretical research value and practical application value.
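The abstract names three building blocks without defining them: channel attention with convolutional instead of fully connected layers, the L1 loss, and threshold accuracy. The pure-Python sketch below illustrates each on toy data. It is not the authors' implementation. The function names, the weight matrices, and the toy values are hypothetical, and the threshold-accuracy definition (share of pixels whose ratio max(pred/truth, truth/pred) falls below 1.25) is the one commonly used for NYU Depth v2 evaluations, assumed here because the abstract does not spell it out.

```python
import math


def se_channel_weights(feature_map, w_reduce, w_expand):
    """Squeeze-and-excitation gate: one weight in (0, 1) per channel.

    feature_map: list of C channels, each a list of rows of floats.
    w_reduce:    (C/r) x C matrix, acting as a 1x1 convolution on the pooled vector.
    w_expand:    C x (C/r) matrix, acting as a second 1x1 convolution.
    Both weight matrices are hypothetical stand-ins for learned parameters.
    """
    # Squeeze: global average pooling summarises each whole channel in one number.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
              for ch in feature_map]
    # Excitation: 1x1 convolution + ReLU, then 1x1 convolution + sigmoid.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, pooled)))
              for row in w_reduce]
    return [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
            for row in w_expand]


def se_rescale(feature_map, weights):
    """Multiply every value in channel c by weights[c]."""
    return [[[v * s for v in row] for row in ch]
            for ch, s in zip(feature_map, weights)]


def l1_loss(pred, target):
    """Mean absolute error between predicted and ground-truth depths."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equally long")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def threshold_accuracy(pred, target, threshold=1.25):
    """Fraction of pixels with max(pred/target, target/pred) < threshold."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equally long")
    hits = sum(1 for p, t in zip(pred, target) if max(p / t, t / p) < threshold)
    return hits / len(pred)


# Toy feature map: 2 channels of size 2x2 (made-up values).
features = [[[1.0, 1.0], [1.0, 1.0]],
            [[2.0, 2.0], [2.0, 2.0]]]
gates = se_channel_weights(features, w_reduce=[[0.5, 0.5]], w_expand=[[0.0], [2.0]])
rescaled = se_rescale(features, gates)

# Toy depths in metres (made-up values, not results from the paper).
predicted = [1.0, 2.0, 3.0, 4.0]
ground_truth = [1.1, 2.0, 2.0, 4.5]
print("channel gates:", gates)
print("L1 loss:", l1_loss(predicted, ground_truth))
print("threshold accuracy:", threshold_accuracy(predicted, ground_truth))
```

Applied to a vector that has already been pooled to one value per channel, a 1x1 convolution is the same linear map as a fully connected layer, which is why the gate can be written with convolutions alone. How the published FCSE_block arranges its layers is not stated in the abstract, so the two-layer gate above should be read as an assumption.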
Keywords: computer vision; convolutional neural network; indoor depth estimation; monocular image; deep learning

Full-text PDF download: http://xbzrb.tju.edu.cn/#/digest?ArticleID=6500