Efficient transitory co-scheduling for MP virtual machines

Lei ZHANG, Zhijiao ZHANG, Yu CHEN

Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China


Abstract: Multiprocessor (MP) virtual machines (VMs) are increasingly used in cloud environments, driven by advances in multicore hardware and growing demand for computing power. However, MP VMs suffer from lock holder preemption (LHP), a scalability problem that significantly degrades overall system performance. This paper describes the design and implementation of an efficient transitory co-scheduling algorithm, built on the Linux Completely Fair Scheduler (CFS), that effectively bypasses the guest spin lock loop to achieve better system performance. Tests show that the method significantly improves system performance, reaching up to 3.41 times the overall performance of the original Linux 2.6.38 kernel in the SysBench.OLTP 4-VM case, while also improving system latency and having little to no effect on scheduling fairness.

Key words: virtual machine; scheduling; multicore; lock holder preemption
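
The abstract names the LHP problem but does not spell out the mechanism. The toy model below is only an illustration of why a preempted lock holder is expensive and why bringing it back early helps. It is not the paper's algorithm: the function `simulate`, the `spin_limit` parameter, and all tick counts are hypothetical, chosen to keep the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class Result:
    finish_tick: int  # tick at which the holder releases the lock
    spin_ticks: int   # total ticks the waiting vCPUs spent spinning


def simulate(critical_ticks, preempt_at, preempt_len, waiters, spin_limit=None):
    """Tick-by-tick toy model of one spin-lock critical section.

    The holder vCPU needs `critical_ticks` ticks of CPU time to release
    the lock. The host preempts it once, after `preempt_at` ticks of
    progress, for `preempt_len` ticks. `waiters` sibling vCPUs spin on
    the lock for the whole duration.

    If `spin_limit` is set (an assumption of this sketch), the host puts
    the holder back on a CPU once the waiters have spun for that many
    ticks while the holder was offline.
    """
    progress = 0          # ticks of useful work done by the holder
    offline_left = 0      # remaining ticks of the holder's preemption
    offline_spin = 0      # ticks waiters have spun while holder was offline
    preempted = False
    tick = 0
    while progress < critical_ticks:
        if not preempted and progress == preempt_at:
            offline_left, preempted = preempt_len, True
        boosted = spin_limit is not None and offline_spin >= spin_limit
        if offline_left > 0 and not boosted:
            # Holder is descheduled: nobody makes progress, waiters spin.
            offline_left -= 1
            offline_spin += 1
        else:
            # Holder is running (possibly after an early reschedule).
            offline_left = 0
            progress += 1
        tick += 1
    return Result(finish_tick=tick, spin_ticks=waiters * tick)


baseline = simulate(critical_ticks=2, preempt_at=1, preempt_len=30, waiters=3)
boosted = simulate(critical_ticks=2, preempt_at=1, preempt_len=30, waiters=3,
                   spin_limit=4)
print(baseline)  # Result(finish_tick=32, spin_ticks=96)
print(boosted)   # Result(finish_tick=6, spin_ticks=18)
```

In this model a 2-tick critical section takes 32 ticks when the holder sits out a full 30-tick preemption, and 6 ticks when the holder is rescheduled after 4 ticks of observed spinning. The numbers say nothing about the measured results reported in the abstract.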

Received: 2013-01-11 Published: 2015-04-17

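Funding: National Natural Science Foundation of China, General Program (61170050); National Science and Technology Major Project on Core Electronic Devices, High-end Generic Chips and Basic Software (2012ZX01039004)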

Cite this article:

Lei ZHANG, Zhijiao ZHANG, Yu CHEN. Efficient transitory co-scheduling for MP virtual machines[J]. Journal of Tsinghua University (Science and Technology), 2014, 54(4): 495-501.

Link to this article:

http://jst.tsinghuajournals.com/CN/ or http://jst.tsinghuajournals.com/CN/Y2014/V54/I4/495

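Figures and tables:

Lock holder preemption explained

Basic transitory co-scheduling

Selective transitory co-scheduling

Delayed transitory co-scheduling

Fairness test

Performance test

System latency test

Scalability test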

