
A new method for extracting domain terminology

Site editor, Harbin Institute of Technology / 2019-10-23


PEI Bing-zhen (1,2), CHEN Xiao-rong (2), HU Yi (1), LU Ru-zhan (1)



Author Name    Affiliation

PEI Bing-zhen    Dept. of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai 200030, China, peibzgz@163.com
                 College of Computer Science and Technology, Guizhou University, Guiyang 550025, China

CHEN Xiao-rong    College of Computer Science and Technology, Guizhou University, Guiyang 550025, China

HU Yi    Dept. of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai 200030, China, peibzgz@163.com

LU Ru-zhan    Dept. of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai 200030, China, peibzgz@163.com



Abstract:

This article proposes a new, general, and highly efficient algorithm for extracting domain terminology. The algorithm is domain-independent and combines statistics-oriented and rule-oriented methods in multiple layers of filters. Exploiting the features of domain terms and characteristics unique to Chinese, it first generates multi-word unit (MWU) candidates and then filters those candidates with multiple strategies. Test results show that the algorithm is feasible and effective.
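The generate-then-filter pipeline described in the abstract can be sketched roughly as follows. The abstract does not name the individual filters, so everything concrete here is an assumption made for illustration: the function names, the thresholds, the PMI-style association score standing in for the statistical layer, and the stop-word boundary check standing in for the rule layer. The input is assumed to be already segmented into tokens, which for Chinese text would require a separate word-segmentation step.

```python
from collections import Counter
from math import log


def generate_candidates(tokens, max_len=3):
    """Count every contiguous n-gram of 2..max_len tokens as an MWU candidate."""
    counts = Counter()
    for n in range(2, max_len + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def frequency_filter(cands, min_freq=2):
    """Layer 1 (assumed): drop candidates seen fewer than min_freq times."""
    return {c: f for c, f in cands.items() if f >= min_freq}


def association_filter(cands, tokens, min_score=0.0):
    """Layer 2 (assumed, statistical): keep candidates whose PMI-style score,
    log(observed probability / product of unigram probabilities), is high enough."""
    unigram = Counter(tokens)
    total = len(tokens)
    kept = {}
    for cand, freq in cands.items():
        expected = 1.0
        for tok in cand:
            expected *= unigram[tok] / total
        if log((freq / total) / expected) >= min_score:
            kept[cand] = freq
    return kept


def boundary_rule_filter(cands, stop_words):
    """Layer 3 (assumed, rule-based): reject candidates that start or end
    with a stop word."""
    return {
        c: f for c, f in cands.items()
        if c[0] not in stop_words and c[-1] not in stop_words
    }


def extract_terms(tokens, stop_words, max_len=3, min_freq=2, min_score=0.0):
    """Generate MWU candidates, then pass them through each filter layer."""
    cands = generate_candidates(tokens, max_len)
    cands = frequency_filter(cands, min_freq)
    cands = association_filter(cands, tokens, min_score)
    cands = boundary_rule_filter(cands, stop_words)
    # Most frequent first; ties broken alphabetically for a stable order.
    return sorted(cands, key=lambda c: (-cands[c], c))


tokens = "the neural network trains the neural network on data".split()
terms = extract_terms(tokens, stop_words={"the", "on"})
```

Running cheap filters first and stricter ones later keeps the candidate set small at each stage, which is one plausible reason for a layered design; the actual filters and their order in the paper may differ.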

Key words: domain terminology; multi-word unit (MWU); automatic extraction; filter

DOI:10.11916/j.issn.1005-9113.2009.02.029

CLC Number: TP391


