Cloud data centers have become one of the most important IT infrastructures. Building high-performance data centers at low cost requires the collective effort of the entire global community. As a platform that brings together some of the most important and forward-looking work in the area for productive discussion, the Sixth Workshop on Hot Topics on Data Centers (HotDC 2021) will be held in Beijing, China on December 2nd, 2021.
HotDC 2021 consists of invitation-only presentations from top academic and industrial groups around the world. The topics cover a wide range of data-center issues, including state-of-the-art technologies for server architecture, storage systems, data-center networks, resource management, and more. In addition, HotDC 2021 includes a student poster session presenting recent research from the data-center teams at the Institute of Computing Technology, Chinese Academy of Sciences. The HotDC workshop aims to provide a forum for cutting-edge data-center research, where researchers and engineers can exchange ideas and engage in discussions with colleagues around the world. Welcome to HotDC 2021!
|09:00 - 09:10||Opening Remarks|
Yungang Bao, Institute of Computing Technology, Chinese Academy of Sciences
|09:10 - 09:50||Persistent Memory System: A Computable View|
Yu Hua, Huazhong University of Science and Technology
|09:50 - 10:30||Exploring Data Deduplication Techniques in Storage Systems|
|10:30 - 10:40||Break|
|10:40 - 11:20||Cloud Systems to the Next Level: From High Availability to Observability|
Ryan Huang, Johns Hopkins University
|11:20 - 12:00||Rethinking the Designs of Scalable Storage Class Memory|
Jie Zhang, Peking University
|12:00 - 13:30||Lunch|
|13:30 - 14:10||Research Innovations and Outlook in the DCN Field|
|14:10 - 14:50||An Aggregation Transport Protocol for Multi-Tenant Machine Learning Training|
|14:50 - 15:00||Break|
|15:00 - 15:40||Towards High-throughput Computing in Serverless|
Laiping Zhao, Tianjin University
|15:40 - 16:20||Selective Replication in Memory-Side GPU Caches|
Xia Zhao, Academy of Military Science
|16:20 - 17:00||Boosting Data Centers Performance with the Entangling Instruction Prefetcher|
Alberto Ros, University of Murcia
17:00 - 19:00, Lecture hall on the fourth floor, ICT
Topic: Persistent Memory System: A Computable View
Bio: Dr. Yu Hua is a professor at the School of Computer Science and Technology, Huazhong University of Science and Technology. He was a postdoctoral research associate at McGill University in 2009 and a postdoctoral research fellow at the University of Nebraska-Lincoln in 2010-2011. He obtained his B.E. and Ph.D. degrees in 2001 and 2005, respectively. His research interests include cloud storage systems, file systems, and non-volatile memory architectures. His papers have been published in major conferences, including OSDI, FAST, MICRO, USENIX ATC, SC, and HPCA. He has served as PC (vice) chair of ACM APSys 2019 and ICDCS 2021, and as a PC member for OSDI, FAST, ASPLOS, USENIX ATC, EuroSys, and SC. He is a distinguished member of CCF and a senior member of ACM and IEEE, and has been selected as a Distinguished Speaker of both ACM and CCF.
Abstract: Persistent memory (PM) provides large capacity, near-zero standby power, and high performance for real-world applications. PM-enabled systems are becoming increasingly important for bridging the gap between applications and devices. In this talk, I will present our recent work that exploits the near-data property of computable PM to deliver high performance, handle crash consistency efficiently, and support concurrent operations.
Bio: Associate professor and doctoral supervisor at the School of Computer Science, Harbin Institute of Technology (Shenzhen). His main research interests are data storage systems and deduplication/compression. He has published more than 60 papers in conferences and journals such as FAST, USENIX ATC, IEEE TC, and PIEEE, and holds 25 domestic and international patents. His work has been recognized with the First Prize of the Ministry of Education Natural Science Award, the First Prize of the Hubei Province Science and Technology Progress Award, and the Outstanding Doctoral Dissertation Award of the Chinese Institute of Electronics. His research results have been adopted by several well-known open-source projects, including Ceph and rdedup.
Topic: Cloud Systems to the Next Level: From High Availability to Observability
Bio: Dr. Ryan (Peng) Huang is an Assistant Professor in the Department of Computer Science at Johns Hopkins University. He leads the Ordered Systems Lab at JHU, which conducts research broadly in distributed systems, operating systems, cloud computing, and mobile systems. His work received multiple best paper awards in top systems conferences. He is a recipient of the NSF CAREER Award.
Abstract: Classic techniques such as state-machine replication have made it feasible to construct fault-tolerant distributed systems at extremely large scales. However, they usually make simple assumptions about the failure model, which do not reflect the complex issues such as gray faults that cloud systems today frequently experience. These complex faults present significant challenges in building highly-available cloud systems. In this talk, I will discuss this problem, and make a case for observability as a critical system design metric. I will describe our recent work to enhance observability to effectively detect, localize, predict, and mitigate complex faults in large systems. I will conclude by outlining some open challenges.
Topic: Rethinking the Designs of Scalable Storage Class Memory
Bio: Dr. Jie Zhang is currently a tenure-track assistant professor at Peking University, China. Before that, he worked as a postdoctoral researcher at KAIST, South Korea. His research interests include storage systems, emerging non-volatile memory, and heterogeneous computing. So far, he has published over 40 papers, including 10 CCF-A conference papers as first author. His research has been listed among the "KAIST Breakthroughs" for KAIST's 50th Innoversary. For more details, please visit his personal website: https://jiezhang-camel.github.io/.
Abstract: The computing power of supercomputers is increasing exponentially by employing more computing nodes. However, the scalability of memory capacity has fallen behind this growth in computing power. Meanwhile, memory and storage systems have experienced significant technology shifts, which have motivated researchers to rethink and redesign existing system organizations and hardware architectures. This talk shares our experience in building a scalable storage class memory for existing computing systems. Our solutions address the challenge of heavy software-stack intervention and eliminate the overheads incurred by physical boundaries.
Bio: He works at the Boole Lab of Huawei's Data Communication Product Line, conducting research on cutting-edge data-center technologies. He received his Ph.D. from the UCL Optical Networks Group in February 2020, where his research focused on optical DCN and disaggregated DCN; his results have been published in top conferences and journals in the optics field.
Topic: Towards High-throughput Computing in Serverless
Bio: Dr. Laiping Zhao is an associate professor at the College of Intelligence and Computing, Tianjin University. He received his B.S. and M.S. degrees from Dalian University of Technology, China, in 2007 and 2009, and his Ph.D. degree from the Department of Informatics, Kyushu University, Japan, in 2012. His research interests include cloud computing and operating systems, with over 40 publications in venues such as SC, EuroSys, HPDC, ICDCS, ICPP, TPDS, TCC, TSC, and JSA. His research is supported by funding from the National Key Research and Development Program, NSFC, the Tianjin Municipal S&T Bureau, Huawei, Meituan, and others.
Abstract: Serverless computing has grown rapidly in recent years due to its low cost and management-free operation. Many applications are now deployed on commercial serverless platforms. We characterize the features of serverless computing and find that it tends to degrade resource efficiency severely. We explore improving resource efficiency in serverless computing through fine-grained resource allocation and proactive scheduling.
Topic: Selective Replication in Memory-Side GPU Caches
Bio: Xia Zhao received the PhD degree in computer science and engineering from Ghent University in 2019. He currently is an Assistant Researcher at the Academy of Military Science, China. His research interests include GPU architecture in general, and multi-program execution, cache hierarchy optimization and Network-on Chip (NoC) design more in particular. He has served as a member of the External Review Committee of the leading computer architecture conferences ISCA and MICRO.
Abstract: Data-intensive applications put immense strain on the memory systems of Graphics Processing Units (GPUs). To cater to this need, GPU memory systems distribute requests across independent units to provide high bandwidth by servicing requests (mostly) in parallel. We find that this strategy breaks down for shared data structures because the shared Last-Level Cache (LLC) organization used by contemporary GPUs stores shared data in a single LLC slice. Shared data requests are hence serialized, resulting in data-intensive applications not being provided with the bandwidth they require. A private LLC organization can provide high bandwidth, but it is often undesirable since it significantly reduces the effective LLC capacity.
Topic: Boosting Data Centers Performance with the Entangling Instruction Prefetcher
Bio: Alberto Ros is a full professor in the Computer Engineering Department at the University of Murcia, Spain. Funded by the Spanish government for his doctoral studies, he received his PhD in computer science from the University of Murcia in 2009. He held postdoctoral positions at the Universitat Politècnica de València and Uppsala University. He received a European Research Council Consolidator Grant in 2018 to improve the performance of multicore architectures. Working on cache coherence, memory hierarchy design, memory consistency, and processor microarchitecture, he has co-authored more than 80 peer-reviewed articles. He has been inducted into the ISCA Hall of Fame and is an IEEE Senior Member.
Abstract: As software-as-a-service and cloud computing become increasingly popular, server and cloud applications exhibit notoriously large instruction footprints that do not fit in the first-level instruction cache (L1I), leading to high L1I miss rates and therefore stalls. This causes significant performance degradation, in addition to wasteful energy expenditure and under-utilization of resources. Instruction prefetching thus emerges as a fundamental technique for designing high-performance data-center computers.
Yungang Bao, Institute of Computing Technology, Chinese Academy of Sciences
Mi Zhang, Institute of Computing Technology, Chinese Academy of Sciences
Ke Liu, Institute of Computing Technology, Chinese Academy of Sciences
Sa Wang, Institute of Computing Technology, Chinese Academy of Sciences
Dejun Jiang, Institute of Computing Technology, Chinese Academy of Sciences
Wanling Gao, Institute of Computing Technology, Chinese Academy of Sciences
Ke Zhang, Institute of Computing Technology, Chinese Academy of Sciences
Biwei Xie, Institute of Computing Technology, Chinese Academy of Sciences
Wenya Hu, Institute of Computing Technology, Chinese Academy of Sciences
Zhiwei Lai, Institute of Computing Technology, Chinese Academy of Sciences
Zirui Wang, Institute of Computing Technology, Chinese Academy of Sciences
Zhimeng Li, Institute of Computing Technology, Chinese Academy of Sciences
Mi Zhang (email@example.com)
Zirui Wang (firstname.lastname@example.org)