Publication in refereed journal
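Researchers at The Chinese University of Hong Kong (current staff)
Professor 程伯中 (Ching P.C.), Department of Electronic Engineering
Professor 蒙美玲 (Meng H.), Department of Systems Engineering and Engineering Management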
Full text
Digital Object Identifier (DOI): http://aims.cuhk.edu.hk/converis/portal/Publication/4
Other information
Abstract: Cross-language spoken document retrieval (CL-SDR) is the technology that facilitates automatic retrieval of relevant information from a collection of spoken documents in a language that is different from that used in the queries. Information sources that are in different languages can then be retrieved automatically with CL-SDR, and the number of searchable information sources will increase significantly. The HMM-based retrieval model is a probabilistic formulation for the retrieval problem. Extensions to this retrieval model can be made by taking advantage of its probabilistic nature. Specifically, we have incorporated the translation component to make it possible to perform cross-language information retrieval (CLIR). In addition, this HMM-based CLIR retrieval model is also extended for retrieval at subword scales. In this work the extended HMM-based retrieval model has been applied to an English-Mandarin CL-SDR task, which is to search the Mandarin spoken document collection with English queries at word and subword scales. Retrieval results obtained from these indexing scales are then fused for multi-scale CL-SDR. Experimental results demonstrate that improvement in CL-SDR retrieval performance can be achieved by fusion of word and subword scales.
Authors: Lo W.-K., Meng H., Ching P.C.
Journal: ACM Transactions on Asian Language Information Processing
Publication year: 2003
Month: 3
Day: 1
Volume: 2
Issue: 1
Publisher: Association for Computing Machinery, Inc.
Place of publication: United States
Pages: 1-26
ISSN: 1530-0226
Language: British English
Keywords: Cross-language information retrieval, Multi-scale data fusion, Spoken document retrieval