
A New Model for Scene Text Recognition

Posted by the site editor, Free考研考试, 2021-12-21

A New Model for Scene Text Recognition
Submitted: 2018-03-07
DOI:10.15918/j.tbit1001-0645.2019.03.008
Keywords: sequence text recognition; long short-term memory (LSTM); deep residual network; attention model
Funding: National Natural Science Foundation of China (61370195, U1536121)
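Authors (name, affiliation, e-mail):
Wang Maosen (王茂森), Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876
Jiang Xiaosen (蒋小森), Video Algorithm Group, Youku Information Technology (Beijing) Co., Ltd., Beijing 100080, senlinuc@qq.com
Niu Shaozhang (牛少彰), Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, szniu@bupt.edu.cn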
Abstract:
An LRAM (LSTM with ResNet and attention model) model, based on a residual network and an attention mechanism, is proposed for reading text in natural images. A residual module (ResNet) is introduced to accelerate the convergence of the network and to reduce the difficulty of training it. An attention mechanism (AM) is introduced to distribute weights over the different sequences that contribute to the current text recognition step, which improves recognition accuracy. Experiments on the Synth90K, Street View Text, and ICDAR datasets show that LRAM outperforms the existing models it is compared against.
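The abstract names two building blocks, a residual shortcut and an attention weighting over sequence features, without giving their formulation. The following is a minimal sketch of both ideas, assuming dot-product scoring with softmax normalization for the attention step and a plain identity shortcut for the residual step. These are common textbook choices, not necessarily the ones used in LRAM; the function names and toy data are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(query, features):
    """Weight each sequence feature by its relevance to the query.

    Assumption: relevance is a dot product; the abstract does not
    specify the scoring function.
    """
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    # The context vector is the weighted sum of the sequence features.
    context = [
        sum(w * feat[i] for w, feat in zip(weights, features))
        for i in range(dim)
    ]
    return weights, context


def residual_block(x, transform):
    """Return x + transform(x), the identity shortcut of a residual module."""
    fx = transform(x)
    return [a + b for a, b in zip(x, fx)]


# Hypothetical toy data: three time steps of 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
weights, context = attention_context(query, features)

# Hypothetical transform standing in for the convolutional layers.
shortcut = residual_block([1.0, 2.0], lambda v: [0.5 * a for a in v])
```

In this toy case the third time step matches the query best, so it receives the largest weight. The shortcut passes the input through unchanged alongside the transformed signal, which is the property usually credited with making deep networks easier to train.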