
Faculty Profile, School of Engineering, The Hong Kong University of Science and Technology: Shaohuai SHI

Site editor, Free考研考试 / 2022-01-30

Shaohuai SHI
施少懷
PhD in Computer Science
Hong Kong Baptist University, 2020

Research Assistant Professor
Department of Computer Science and Engineering



(852) 2358 8329
shaohuais@ust.hk
Room 2538

Google Scholar
Wr4B6fQAAAAJ

ORCID
0000-0002-1418-5160

ResearcherID
O-5650-2018

Scopus ID
57195360813








Research Interest
Distributed systems
Distributed computing
General purpose processing on graphics hardware
Machine learning



Publications





2021 7

A Quantitative Survey of Communication Optimizations in Distributed Deep Learning
IEEE Network, v. 35, (3), May-June 2021, p. 230-237
Shi, Shaohuai; Tang, Zhenheng; Chu, Xiaowen; Liu, Chengjian; Wang, Wei; Li, Bo Article
MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning
IEEE Transactions on Parallel and Distributed Systems, v. 32, (8), August 2021, article number 9328614, p. 1903-1917
Shi, Shaohuai; Chu, Xiaowen; Li, Bo Article
Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks
Proceedings - International Conference on Distributed Computing Systems, v. 2021-July, July 2021, p. 550-560
Shi, Shaohuai; Zhang, Lin; Li, Bo Conference paper
Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans
Proceedings of the AAAI Conference on Artificial Intelligence, v. 35, (6), 2021, article number 16614, p. 4821-4829
He, Xin; Wang, Shihao; Chu, Xiaowen; Shi, Shaohuai; Tang, Jiangping; Liu, Xin; Yan, Chenggang; Zhang, Jiyong; Ding, Guiguang Conference paper
Efficient Sparse-Dense Matrix-Matrix Multiplication on GPUs Using the Customized Sparse Storage Format
Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS, v. 2020-December, December 2020, article number 9359142, p. 19-26
Shi, Shaohuai; Wang, Qiang; Chu, Xiaowen Conference paper
Exploiting Simultaneous Communications To Accelerate Data Parallel Distributed Deep Learning
Proceedings - IEEE INFOCOM, v. 2021-May, 10 May 2021, article number 9488803
Shi, Shaohuai; Chu, Xiaowen; Li, Bo Conference paper
Towards Scalable Distributed Training of Deep Learning on Public Cloud Clusters
The 4th Machine Learning and Systems Conference, MLSys 2021, Virtual, 5-9 April 2021
Shi, Shaohuai; Zhou, Xianhao; Song, Shutao; Wang, Xingyao; Zhu, Zilin; Huang, Xue; Jiang, Xinan; Zhou, Feihu; Guo, Zhenyu; Xie, Liqiang; Lan, Rui; Ouyang, Xianbin; Zhang, Yan; Wei, Jieqian; Gong, Jing; Lin, Weiliang; Gao, Ping; Meng, Peng; Xu, Xiaomin; Guo, Chenyang; Yang, Bo; Chen, Zhibo; Wu, Yongjian; Chu, Xiaowen Conference paper

2020 5

Benchmarking the Performance and Energy Efficiency of AI Accelerators for AI Training
Proceedings: 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2020 / IEEE. Piscataway, NJ : IEEE, 2020, p. 744-751, Article number 9139681
Wang, Yuxin; Wang, Qiang; Shi, Shaohuai; He, Xin; Tang, Zhenheng; Zhao, Kaiyong; Chu, Xiaowen Conference paper
Communication-Efficient Decentralized Learning with Sparsification and Adaptive Peer Selection
Proceedings - International Conference on Distributed Computing Systems, v. 2020-November, November 2020, article number 9355592, p. 1207-1208, Code 167382
Tang, Zhenheng; Shi, Shaohuai; Chu, Xiaowen Conference paper
Communication-Efficient Distributed Deep Learning with Merged Gradient Sparsification on GPUs
Proceedings - IEEE INFOCOM, v. 2020-July, July 2020, article number 9155269, p. 406-415
Shi, Shaohuai; Wang, Qiang; Chu, Xiaowen; Li, Bo; Qin, Yang; Liu, Ruihao; Zhao, Xinxiao Conference paper
FADNet: A Fast and Accurate Network for Disparity Estimation
International Conference on Robotics and Automation (ICRA 2020), Paris, France (Virtual Conference), 31 May - 31 August 2020
Wang, Qiang; Shi, Shaohuai; Zheng, Shizhen; Zhao, Kaiyong; Chu, Xiaowen Conference paper
Layer-wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees
24th European Conference on Artificial Intelligence (ECAI 2020), Santiago de Compostela, 29 August - 5 September 2020
Shi, Shaohuai; Tang, Zhenheng; Wang, Qiang; Zhao, Kaiyong; Chu, Xiaowen Conference paper

2019 4

A convergence analysis of distributed SGD with communication-efficient gradient sparsification
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence / Edited by Sarit Kraus. Vienna, Austria : International Joint Conferences on Artificial Intelligence, 2019, p. 3411-3417
Shi, Shaohuai; Zhao, Kaiyong; Wang, Qiang; Tang, Zhenheng; Chu, Xiaowen Conference paper
A distributed synchronous SGD algorithm with global top-k Sparsification for low bandwidth networks
Proceedings: 2019 39th IEEE International Conference on Distributed Computing Systems, ICDCS 2019 / IEEE. Piscataway, NJ : IEEE, 2019, p. 2238-2247, Article number 8884924
Shi, Shaohuai; Wang, Qiang; Zhao, Kaiyong; Tang, Zhenheng; Wang, Yuxin; Huang, Xiang; Chu, Xiaowen Conference paper
Computer-Aided Clinical Skin Disease Diagnosis Using CNN and Object Detection Models
Proceedings: 2019 IEEE International Conference on Big Data / IEEE. Piscataway, NJ : IEEE, 2019, p. 4839-4844, Article number 9006528
He, Xin; Wang, Shihao; Shi, Shaohuai; Tang, Zhenheng; Wang, Yuxin; Zhao, Zhihao; Dai, Jing; Ni, Ronghao; Zhang, Xiaofeng; Liu, Xiaoming; Wu, Zhili; Yu, Wu; Chu, Xiaowen Conference paper
MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms
Proceedings - IEEE INFOCOM, v. 2019-April, April 2019, article number 8737367, p. 172-180
Shi, Shaohuai; Chu, Xiaowen; Li, Bo Conference paper

2018 4

A DAG Model of Synchronous Stochastic Gradient Descent in Distributed Deep Learning
Proceedings: 2018 IEEE 24th International Conference on Parallel and Distributed Systems, ICPADS 2018 / IEEE. Piscataway, NJ : IEEE, 2018, p. 425-432, Article number 8644932
Shi, Shaohuai; Wang, Qiang; Chu, Xiaowen; Li, Bo Conference paper
Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes
32nd Annual Conference on Neural Information Processing Systems (NIPS 2018), Montréal, Canada, 3-8 December 2018
Jia, Xianyan; Song, Shutao; He, Wei; Wang, Yangzihao; Rong, Haidong; Zhou, Feihu; Xie, Liqiang; Guo, Zhenyu; Yang, Yuanzhou; Yu, Liwei; Chen, Tiegang; Hu, Guangxiao; Shi, Shaohuai; Chu, Xiaowen Conference paper
Performance modeling and evaluation of distributed deep learning frameworks on GPUs
Proceedings: IEEE 16th International Conference on Dependable, Autonomic and Secure Computing, IEEE 16th International Conference on Pervasive Intelligence and Computing, IEEE 4th International Conference on Big Data Intelligence and Computing and IEEE 3rd Cyber Science and Technology Congress, DASC-PICom-DataCom-CyberSciTec 2018 / IEEE. Piscataway, NJ : IEEE, 2018, p. 943-948, Article number 8512002
Shi, Shaohuai; Wang, Qiang; Chu, Xiaowen Conference paper
Supervised learning based algorithm selection for deep neural networks
Proceedings: 2017 IEEE 23rd International Conference on Parallel and Distributed Systems, ICPADS 2017 / IEEE. Piscataway, NJ : IEEE, 2017, p. 344-351
Shi, Shaohuai; Xu, Pengfei; Chu, Xiaowen Conference paper

2017 2

Benchmarking state-of-the-art deep learning software tools
Proceedings: 2016 7th International Conference on Cloud Computing and Big Data, CCBD 2016 / IEEE. Piscataway, NJ : IEEE, 2016, p. 99-104, Article number 7979887
Shi, Shaohuai; Wang, Qiang; Xu, Pengfei; Chu, Xiaowen Conference paper
Performance Evaluation of Deep Learning Tools in Docker Containers
Proceedings: 2017 3rd International Conference on Big Data Computing and Communications, BigCom 2017 / IEEE. Piscataway, NJ : IEEE, 2017, p. 395-403, Article number 8113094
Xu, Pengfei; Shi, Shaohuai; Chu, Xiaowen Conference paper

2011 1

Mixed Precision Method for GPU-based FFT
Proceedings: The 14th IEEE International Conference on Computational Science and Engineering, CSE 2011 & The 11th International Symposium on Pervasive Systems, Algorithms, and Networks, I-SPAN 2011 & The 10th IEEE International Conference on Ubiquitous Computing and Communications, IUCC 2011 / IEEE. Piscataway, NJ : IEEE, 2011, p. 580-586, Article number 6062934
Qi, Shuhan; Wang, Xuan; Shi, Shaohuai Conference paper

2010 1

The GPU-based string matching system in advanced AC algorithm
Proceedings: 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, 10th IEEE Int. Conf. Scalable Computing and Communications, ScalCom-2010 / IEEE. Piscataway, NJ : IEEE, 2010, p. 1158-1163, Article number 5577901
Peng, Jiangfeng; Chen, Hu; Shi, Shaohuai Conference paper











Teaching Assignment
2020-21 Spring
COMP4901Q High Performance Computing









Projects
From January 2020 to December 2022


