A Multi-domain Text Classification Method Based on Recurrent Convolution Multi-task Learning
Jinbao XIE1, Jiahui LI2, Shouqiang KANG2, Qingyan WANG2, Yujing WANG2
1. School of Robotics, Guangdong Polytechnic of Science and Technology, Zhuhai 519090, China
2. School of Electrical and Electronic Engineering, Harbin University of Science and Technology, Harbin 150000, China
Funds: The Collaborative Intelligent Robot Production-Education Integration Innovative Application Platform Based on the Industrial Internet (2020CJPT004); The Natural Science Foundation of Heilongjiang Province (LH2019E058); The Open Fund of the Hubei Key Laboratory of Intelligent Robot (Wuhan Institute of Technology) (HBIR202004); The Fundamental Research Funds for Universities of Heilongjiang Province (LGYC2018JC027)
Article Information
About the authors: Jinbao XIE: male, born in 1980, Associate Professor; research interests: natural language processing, artificial intelligence
Jiahui LI: male, born in 1995, M.S. candidate; research interests: natural language processing, artificial intelligence
Shouqiang KANG: male, born in 1980, Professor; research interests: intelligent diagnosis, artificial intelligence
Qingyan WANG: male, born in 1984, Associate Professor; research interests: intelligent diagnosis, artificial intelligence, intelligent image processing
Yujing WANG: female, born in 1983, Associate Professor; research interests: intelligent diagnosis, artificial intelligence
Corresponding author: Jiahui LI, maillijiahui@163.com
CLC number: TP391.1
Publication history
Received: 2020-10-09
Revised: 2021-02-03
Available online: 2021-03-01
Published: 2021-08-10
Abstract: In text classification tasks, texts from different domains are often expressed in similar ways and are therefore correlated, a property that can be exploited to alleviate the shortage of labeled training data. Joint training with multi-task learning makes it possible to use texts from different domains together, improving both the training accuracy and the training speed of the model. A Recurrent Convolution Multi-Task Learning (MTL-RC) model is proposed for multi-domain text classification. It models the texts of multiple tasks jointly, drawing on the respective strengths of multi-task learning, Recurrent Neural Networks (RNN), and Convolutional Neural Networks (CNN) to capture the correlations among multi-domain texts, the long-term dependencies within a text, and local text features. Extensive experiments on multi-domain text classification datasets show that the proposed MTL-RC model achieves an average classification accuracy of 90.1% across domains, 6.5% higher than its single-task counterpart, the Recurrent Convolution Single-Task Learning (STL-LC) model, and 5.4%, 4%, and 2.8% higher than the mainstream multi-task learning models Fully Shared Multi-Task Learning (FS-MTL), Adversarial Multi-Task Learning (ASP-MTL), and Indirect Communication Multi-Task Learning (IC-MTL), respectively.
Key words:Multi-domain text classification/
Multi-task learning/
Recurrent Neural Network(RNN)/
Convolutional Neural Network(CNN)
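Illustrative note: the abstract describes a shared recurrent-convolutional trunk (an RNN for long-term dependencies, a CNN for local features) with jointly trained, task-specific outputs for each domain. The PyTorch sketch below shows one minimal way such a model could be arranged; the layer sizes, pooling scheme, per-task heads, and training loop are assumptions for illustration only, not the exact MTL-RC architecture or optimization setup reported in the paper.

# Minimal sketch of a recurrent-convolutional multi-task text classifier:
# a shared BiLSTM + 1-D CNN trunk with one softmax head per domain/task.
# All hyperparameters here are illustrative assumptions, not the paper's values.
import torch
import torch.nn as nn

class RecurrentConvMultiTask(nn.Module):
    def __init__(self, vocab_size, num_tasks, emb_dim=128, hidden=128,
                 kernel_size=3, num_filters=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Shared BiLSTM captures long-term dependencies across all domains.
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        # Shared 1-D convolution extracts local n-gram features from the RNN states.
        self.conv = nn.Conv1d(2 * hidden, num_filters, kernel_size, padding=kernel_size // 2)
        # One classification head per task keeps the decision layer domain-specific.
        self.heads = nn.ModuleList(
            [nn.Linear(num_filters, num_classes) for _ in range(num_tasks)]
        )

    def forward(self, token_ids, task_id):
        x = self.embed(token_ids)              # (batch, seq, emb)
        x, _ = self.rnn(x)                     # (batch, seq, 2*hidden)
        x = self.conv(x.transpose(1, 2))       # (batch, filters, seq)
        x = torch.relu(x).max(dim=2).values    # global max pooling over time
        return self.heads[task_id](x)          # task-specific logits

# Toy usage: alternate mini-batches drawn from different domains (tasks).
model = RecurrentConvMultiTask(vocab_size=5000, num_tasks=3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for task_id in range(3):
    tokens = torch.randint(1, 5000, (8, 40))   # fake batch: 8 texts, 40 tokens each
    labels = torch.randint(0, 2, (8,))
    loss = loss_fn(model(tokens, task_id), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Alternating mini-batches across tasks, as in the toy loop above, is one common way to update shared parameters in multi-task learning; the paper's actual training schedule may differ.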
PDF full-text download:
https://jeit.ac.cn/article/exportPdf?id=e1e47a1f-1541-43ea-835a-c0c109c74e34