Bi-directional Feature Fusion for Fast and Accurate Text Detection of Arbitrary Shapes

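Liang BIAN1,
Yadong QU2,
Yu ZHOU2
1. School of Electronic and Information Engineering, Beihang University, Beijing 100191
2. School of Information Science and Technology, University of Science and Technology of China, Hefei 230026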

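Details
About the authors: Liang BIAN: male, born in 1982, Ph.D. candidate; research interest: image acquisition and processing
Yadong QU: male, born in 1998, master's student; research interests: scene text image synthesis, detection, and recognition
Yu ZHOU: male, born in 1992, Ph.D. candidate; research interests: scene text image synthesis, detection, and recognition
Corresponding author: Liang BIAN, askquestionbl@163.com
CLC number: TN911.73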

Publication history

Received: 2020-10-16
Revised: 2021-01-29
Published online: 2021-02-05
Issue date: 2021-04-20

Bi-directional Feature Fusion for Fast and Accurate Text Detection of Arbitrary Shapes

Liang BIAN1,
Yadong QU2,
Yu ZHOU2
1. School of Aeronautic Science and Engineering, Beihang University, Beijing 100191, China
2. School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China


Abstract: Existing segmentation-based scene text detection methods still struggle to separate adjacent text regions, and the multi-step post-processing applied to the segmentation map lowers detection efficiency. To address both problems, this paper proposes a novel scene text detection model based on a fully convolutional network. First, a feature extractor produces multi-scale feature maps from the input image. Second, a bi-directional feature fusion module fuses the semantic information of two parallel branches and promotes their joint optimization. The model then predicts a shrunk text region map and a full text region map in parallel to separate adjacent texts: the former keeps different text instances distinguishable, while the latter effectively guides network optimization. Finally, to speed up detection, a fast and effective post-processing algorithm generates the text bounding boxes. Experiments show that the proposed method achieves the best results on the relevant datasets, improves the F-measure by up to 1.0% over the previous best method, and runs at near real-time speed, which demonstrates its effectiveness and efficiency.
Key words: Scene text detection / Bi-directional feature fusion / Multi-scale feature / Post-processing complexity / Arbitrary-shaped texts
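
The abstract does not spell out the post-processing algorithm, so the following is only an illustrative sketch of the general idea behind predicting a shrunk map and a full map together. It assumes both maps have already been thresholded into binary masks, and it uses a generic seed-and-grow labelling (multi-source breadth-first expansion), which may differ from the authors' procedure. All function names (`label_kernels`, `expand_into_full`, `bounding_boxes`) and the toy masks are hypothetical.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (an assumption)


def label_kernels(mask):
    """Label 4-connected components of a binary mask; 0 means background."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def expand_into_full(labels, full):
    """Grow each seed label outwards, only over pixels set in the full mask."""
    h, w = len(full), len(full[0])
    out = [row[:] for row in labels]
    queue = deque((r, c) for r in range(h) for c in range(w) if out[r][c])
    while queue:
        y, x = queue.popleft()
        for dy, dx in NEIGHBOURS:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and full[ny][nx] and out[ny][nx] == 0:
                out[ny][nx] = out[y][x]  # first label to arrive claims the pixel
                queue.append((ny, nx))
    return out


def bounding_boxes(labels):
    """Return {label: (top, left, bottom, right)} with inclusive coordinates."""
    boxes = {}
    for r, row in enumerate(labels):
        for c, k in enumerate(row):
            if k:
                t, l, b, rt = boxes.get(k, (r, c, r, c))
                boxes[k] = (min(t, r), min(l, c), max(b, r), max(rt, c))
    return boxes


# Toy example: two adjacent words merge into one blob in the full mask,
# while the shrunk mask keeps one separate seed per word.
FULL = [[0] * 10, [0] + [1] * 8 + [0], [0] + [1] * 8 + [0], [0] * 10]
SHRUNK = [[0] * 10, [0, 0, 1, 0, 0, 0, 0, 1, 0, 0], [0] * 10, [0] * 10]

seeds, n_instances = label_kernels(SHRUNK)
instances = expand_into_full(seeds, FULL)
print(n_instances, bounding_boxes(instances))
# 2 {1: (1, 1, 2, 4), 2: (1, 5, 2, 8)}
```

In this toy case, labelling the full mask directly yields a single component, whereas seeding from the shrunk mask recovers two instances with full-size boxes. This is the role the abstract assigns to the shrunk map (instance separation), while the full map bounds how far each instance may grow.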



Full-text PDF download:

https://jeit.ac.cn/article/exportPdf?id=1f8d4ce9-0a94-43f0-aec7-d69f608b3da5