删除或更新信息,请邮件至freekaoyan#163.com(#换成@)

4R-TPUT:结构化对等网络中的高效top-k查询算法

清华大学 辅仁网/2017-07-07

4R-TPUT: 结构化对等网络中的高效top-k查询算法
方启明1,2,杨广文1()
2. 杭州电子科技大学 计算机学院, 杭州 310018
4R-TPUT: An efficient top-k query algorithm for structured peer-to-peer systems
Qiming FANG1,2,Guangwen YANG1()
1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
2. School of Computer, Hangzhou Dianzi University, Hangzhou 310018, China

摘要:
HTML
输出: BibTeX | EndNote (RIS)
摘要top-k查询要求查找出最符合需求的前k个结果,是对等网络中的重要数据处理技术。该文研究了结构化对等网络中数据在各节点上垂直划分的精确top-k查询处理,在3通信回合的三阶段阈值(TPUT)算法基础上提出了4回合阈值算法4R-TPUT。它由下界估计、剪枝和结果查找3个阶段组成,通过在TPUT的下界估计阶段增加一个通信回合来获取更多的数据信息以得到更准确的top-k下界估计和剪枝阈值,从而减少查询处理过程中的数据访问和传输量。实验表明: 4R-TPUT相比于TPUT较大幅度降低了数据传输量,减小了查询响应时间,是一种更高效的top-k查询算法。

关键词 对等网络,top-k查询,TPUT,4R-TPUT
Abstract:The Top-k query returns to users the k best match results and is an important data processing technique in peer-to-peer systems. This paper focuses on accurate top-k query processing in structured peer-to-peer systems in which the data is vertically partitioned among peers. The three communication round-trip algorithm TPUT is expanded into a 4R-TPUT threshold algorithm involving 4 round-trip communications. The algorithm has lower bound estimation, pruning and results lookup phases. TPUT introduces an additional round-trip communication in the first phase to get more information on the data to obtain a better top-k lower bound estimation and pruning threshold which can reduce data accesses and transmissions in the query processing. Tests show that 4R-TPUT greatly reduces data transmissions compared with TPUT and, thus, requires less query response time, so 4R-TPUT is a more efficient top-k query algorithm.

Key wordspeer-to-peer systemstop-k queryTPUT4R-TPUT
收稿日期: 2013-01-10 出版日期: 2015-04-17
基金资助:国家 “八六三” 高技术项目 (2010AA012400);国家自然科学基金面上项目 (61272539);浙江省自然科学基金项目 (LQ14F020013);浙江省重点科技创新团队项目(2009R50046)
引用本文:
方启明,杨广文. 4R-TPUT: 结构化对等网络中的高效top-k查询算法[J]. 清华大学学报(自然科学版), 2014, 54(4): 480-484.
Qiming FANG,Guangwen YANG. 4R-TPUT: An efficient top-k query algorithm for structured peer-to-peer systems. Journal of Tsinghua University(Science and Technology), 2014, 54(4): 480-484.
链接本文:
http://jst.tsinghuajournals.com/CN/ http://jst.tsinghuajournals.com/CN/Y2014/V54/I4/480


图表:
top-k算法 对等网络 数据划分 查询结果
DTA[2] 结构化 垂直 精确
TPUT[3] 结构化 垂直 精确
DHTop[4] 结构化 垂直 精确
KLEE[5] 结构化 垂直 近似
TJA[6] 非结构化 垂直 精确
BRANCA[7] 非结构化 水平 精确
Hose's[8] 非结构化 水平 近似
FD[9] 非结构化 水平 近似
HPJT[10] 非结构化 垂直 精确
ASAP[11] 非结构化 水平 精确
SPEERTO[12] 超级节点 水平 精确
Balke's[13] 超级节点 水平 精确
HT-p2p plus[14] 超级节点 垂直 精确


现有对等网络中的top-k查询算法及其分类
4R-TPUT算法
4R-TPUT对TPUT算法的改进
参数 取值范围 默认值
节点数N 16~64 16
数据对象个数n 1 000~1 000000 100 000
查询属性个数m 2~10 4
数据分布 Uniform、 Gauss Uniform
k 1~100 10


实验中的参数设置
4R-TPUT和TPUT在k变化时的性能对比(N=16, n=100 000, m=4)
4R-TPUT和TPUT在n变化时的性能对比(数据集为Uniform, N=16, k=10, m=4)
4R-TPUT和TPUT在m变化时的性能对比(数据集为Uniform, N=16, k=10, n=100 000)
4R-TPUT在N变化时的性能(数据集为Uniform, n=100 000, m=4)


参考文献:
[1] SUNYongjiao, YUAN Ye, WANG Guoren. Top-k query processing over uncertain data in distributed environments [J]. J World Wide Web, 2012, 15(4): 429-446.
[2] ZHANG Jiangong, Suel T. Efficient query evaluation on large textual collections in a peer-to-peer environment [C]// Proc 5th IEEE Int Conf Peer-to-Peer Computing. Konstanz, Germany: IEEE Computer Society, 2005: 225-233.
[3] CAO Pei, WANG Zhe. Efficient top-k query calculation in distributed networks [C]// Proc 23rd Annual ACM Symp Principles of Distributed Computing. Newfoundland, Canada: ACM, 2004: 206-215.
[4] Akbarinia R, Pacitti E, Valduriez P. Processing top-k queries in distributed hash tables [C]// Proc 13th European Int Conf Parallel Processing. Rennes, France: Springer, 2007: 489-502.
[5] Michel S,Triantafillou P, Weikum G. KLEE: A framework for distributed top-k query algorithms [C]// Proc Int Conf Very Large Data Bases. Trondheim, Norway: ACM, 2005: 637-648.
[6] Zeinalipour-Yazti D, Vagena Z, Kalogeraki V, et al.Finding the k highest-ranked answers in a distributed network[J]. Computer Networks, 2009, 53(9): 1431-1449.
[7] ZHAO Keping, TAO Yufei, ZHOU Shuigeng. Efficient top-k processing in large-scaled distributed environments[J]. Data and Knowledge Engineering, 2007, 63(2): 315-335.
[8] Hose K,Karnstedt M, Sattler K U, et al.Processing top-n queries in P2P-based web integration systems with probabilistic guarantees [C]// Proc 8th Int Workshop Web and Databases. Baltimore, USA: ACM, 2005: 109-114.
[9] Akbarinia R, Pacitti E, Valduriez P. Reducing network traffic in unstructured P2P systems using top-k queries[J]. Distributed and Parallel Databases, 2006, 19(2-3): 67-86.
[10] GUAN Zhitao, YAN Guangwei, HUANG Heqing. A novel top-k query scheme in unstructured p2p networks [C]// Proc 9th IEEE Int Conf Computer and Information Technology. Xiamen, China: IEEE Computer Society, 2009: 16-21.
[11] Dedzoe W K, Lamarre P, Akbarinia R, et al.ASAP top-k query processing in unstructured p2p systems [C]// Proc 10th Int Conf Peer-to-Peer Computing. Delft, Netherlands: IEEE, 2010: 1-10.
[12] Vlachou A, Doulkeridis C, Nørvåg K, et al.On efficient top-k query processing in highly distributed environments [C]// Proc 2008 ACM SIGMOD Int Conf Management of Data. Vancouver, Canada: ACM, 2008: 753-764.
[13] Balke W T, Nejdl W, Siberski W, et al.Progressive distributed top-k retrieval in peer-to-peer networks [C]// Proc 21st Int Conf Data Engineering. Tokyo, Japan: IEEE Computer Society, 2005: 174-185.
[14] Chrysakis I, Chalkidis C, Plexousakis D. Evaluation of top-k queries in peer-to-peer networks using threshold algorithms [C]// Proc 19th ACM Int Conf Information and Knowledge Management. Toronto, Canada: ACM, 2010: 1305-1308.
[15] YU Hailing, LI Huagang, WU Ping, et al.Efficient processing of distributed top-k queries [C]// Proc 16th Int Conf Database and Expert Systems Applications. Copenhagen, Denmark: Springer, 2005: 65-74.
[16] Chen B, Liang W, Yu J X. Energy-efficient top-k query evaluation and maintenance in wireless sensor networks [Z/OL].[2013-08-07]. http://link.springer.com/article/10.1007/s11276-013-0625-6.


相关文章:
[1]韩心慧,肖祥全,张建宇,刘丙双,张缘. 基于社交关系的DHT网络Sybil攻击防御[J]. 清华大学学报(自然科学版), 2014, 54(1): 1-7.

相关话题/数据 网络 实验 通信 计算机