HotDC 2016 -- Advanced Computer Systems Research Center

Introduction

Cloud data centers have become the most important IT infrastructure that people use every day. Building efficient future data centers will require the collective effort of the entire global community. To provide a platform that brings together the most important and forward-looking work in the area for stimulating and productive discussion, the first Workshop on Hot Topics on Data Centers (HotDC 2016) will be held in Beijing on September 26-27, 2016. HotDC 2016 consists of by-invitation-only presentations from top academic and industrial groups around the world. The topics cover a wide range of data-center issues, including state-of-the-art technologies for server architecture, storage systems, data-center networks, and resource management. In addition, HotDC 2016 includes a special session of talks presenting recent research from the data-center team at the Institute of Computing Technology, Chinese Academy of Sciences. The HotDC workshop aims to provide a forum for cutting-edge data center research, where researchers and engineers can exchange ideas and engage in discussions with colleagues from around the world. Welcome to HotDC 2016!

Organizing Committee

General Chairs

Lixin Zhang, Institute of Computing Technology, Chinese Academy of Sciences
Yungang Bao, Institute of Computing Technology, Chinese Academy of Sciences

Program Chair

Dejun Jiang, Institute of Computing Technology, Chinese Academy of Sciences

Workshop Schedule

Conference Venue: Kunlun Room, Vision Hotel, Beijing
Dates: 26 Sept – 27 Sept 2016

26 Sept 2016 Monday
08:45 – 09:00	Opening remarks
09:00 – 09:50	Keynote 1: Microsoft’s Datacenters	Speaker: Leendert van Doorn, Microsoft
09:50 – 10:40	Keynote 2: The Future of Cloud and Cloud Data Centers	Speaker: John Carter, IBM
10:40 – 11:00	Coffee break
11:00 – 11:50	Keynote 3: The Next Era of Computing	Speaker: Jian Ouyang, Baidu
12:00 – 14:00	Lunch
14:00 – 14:50	Keynote 4: The Road to HPC	Speaker: Geraint North, ARM
14:50 – 16:00	Panel
16:00 – 16:20	Coffee break
16:20 – 18:00	Talks by ICT
18:30 – 20:00	Reception
27 Sept 2016 Tuesday
09:00 – 09:50	Keynote 5: Large-scale Cluster Management at Google with Borg	Speaker: Xiao Zhang, Google
09:50 – 10:40	Keynote 6: How to Enable Flash Storage in Data Center?	Speaker: Zhongjie Wu, Memblaze
10:40 – 10:50	Coffee break
10:50 – 11:40	Keynote 7: FireBox: Designing the Warehouse-Scale Computer of 2020	Speaker: Martin Maas, Berkeley
11:40 – 12:30	Keynote 8: CloudEth: Born for the cloud, Grown in the cloud	Speaker: Chunzhi He, Huawei
12:30 – 14:00	Closing remarks + Lunch

Keynote Speech

8:40-8:50

Opening remarks

8:50-9:50

Keynote 1: Microsoft’s Datacenters

Speaker: Leendert van Doorn

Abstract: In this talk I’ll discuss the scale of Microsoft’s datacenters, some of their key workload characteristics and design constraints, and how those shape our hardware designs. A key element of this is how to control server cost while taking advantage of upcoming silicon disruptions. I’ll specifically talk about a set of new silicon technologies on the 3-6 year horizon that can help our datacenter designs but also pose new challenges that haven’t been considered before.

Bio: Leendert is a Distinguished Engineer in Microsoft’s Cloud Azure organization, where he is responsible for figuring out what systems Microsoft designs and develops for its future datacenters (3 years and beyond). Before joining Microsoft, he was a Corporate Fellow/CVP at Advanced Micro Devices (AMD), where he was responsible for working on long-term roadmaps with all of AMD’s software partners and was the technical lead for AMD’s China strategy. Before AMD he was a senior manager at IBM’s T.J. Watson Research Center, where he ran system security and virtualization teams. Leendert has a Ph.D. from the Vrije Universiteit in Amsterdam, The Netherlands.

9:50-10:40

Keynote 2: The Future of Cloud and Cloud Data Centers

Speaker: John Carter

Abstract: In this talk, we will explore trends, technologies, and use cases that are transforming the way cloud systems are being designed and deployed. Mobile, big data, and analytics have been driving much of the recent explosive growth in cloud and have significantly influenced how cloud platforms and data centers are architected. Looking ahead, cognitive/AI, IoT, and the migration of traditional enterprise workloads into the cloud will lead to even more dramatic changes in how future cloud platforms and data centers are designed. The talk will then explore recent work on software-defined data centers, cognitive cloud, and the role of accelerators in data-intensive cloud workloads. Finally, the talk will conclude with lessons learned while building IBM’s next-generation cloud platform, including specific guidance to researchers regarding what problem areas are (and are not) likely to have the most impact going forward.

Bio: Dr. John Carter is a Lead Architect of IBM’s CloudLab, where he helps lead the team designing IBM’s next-generation cloud infrastructure platform, including cloud orchestration, data center engineering, server/network design, and system integration. Prior to joining IBM’s CloudLab, Dr. Carter spent twenty years leading research at IBM and the University of Utah on a broad range of systems topics, including large-scale distributed systems, software-defined networks and storage, energy and thermal management of servers and data centers, advanced memory system design, and mobile/cloud back ends. Dr. Carter received his PhD in Computer Science from Rice University in 1993, and is a Senior Member of the ACM and IEEE.

11:00-11:50

Keynote 3: The Next Era of Computing

Speaker: Jian Ouyang

Abstract: As Moore’s law comes to an end, we believe that architecture and hardware will be driven by emerging applications. We frame this new methodology as “software-defined architecture and hardware”, and we believe it is the appropriate way to open the next era of computing. In this talk, I will take the Baidu AI accelerator as an example to show how we embrace and create the next era of computing.

Bio: Jian Ouyang is a principal architect at Baidu. He leads a team exploring new architectures and systems for data centers and autonomous cars. He has published papers at ASPLOS 2014, Hot Chips 2014, Hot Chips 2016, and other conferences.

14:00-14:50

Keynote 4: The Road to HPC

Speaker: Geraint North

Abstract: This talk will describe the ARM technology and business model, and the capabilities and benefits that ARM’s technology can bring to the server and HPC markets. It will cover specific microprocessor technology that ARM has developed for the HPC market, and the rich software ecosystem that is built around it.

Bio: Geraint North is ARM’s Distinguished Engineer for Server and HPC Tools. He joined ARM in 2014 as a founding member of ARM’s Manchester Design Centre, which performs HPC-specific activity on compilers, libraries and performance tools, including support for ARM’s recently announced HPC-focused Scalable Vector Extensions (SVE). Previously a Master Inventor at IBM, Geraint was the storage architect for IBM’s Cloud Systems Software group, designing products built on OpenStack. He has also worked on a wide range of products and research areas across enterprise storage, POWER systems hardware, AIX and BlueGene. Prior to IBM, Geraint was a Principal Engineer at Transitive Corporation (a spin-out from Manchester University), developing dynamic binary translation technology for Apple, Silicon Graphics, IBM and others. Geraint holds over twenty patents in the fields of dynamic binary translation, enterprise storage and microprocessor high-availability.

27 Sept 2016 Tuesday

9:00-9:50

Keynote 5: Large-scale cluster management at Google with Borg

Speaker: Xiao Zhang

Abstract: Google’s Borg system is a cluster manager that runs hundreds of thousands of jobs, from many thousands of different applications, across a number of clusters each with up to tens of thousands of machines. It achieves high utilization by combining admission control, efficient task-packing, over-commitment, and machine sharing with process-level performance isolation. Borg simplifies life for its users by offering a declarative job specification language, name service integration, real-time job monitoring, and tools to analyze and simulate system behavior. In this talk, we present a summary of the Borg system architecture and features, important design decisions, and a quantitative analysis of some of its policy decisions.

Bio: Xiao Zhang works on cluster resource management at Google as a staff software engineer. He received his PhD from the University of Rochester and his BS from the University of Science and Technology of China, both in Computer Science. His research focuses on the interaction between operating systems and computer architecture.

9:50-10:40

Keynote 6: How to Enable Flash Storage in Data Center?

Speaker: Zhongjie Wu

Abstract: More and more data centers are starting to use SSDs to deal with storage challenges. Compared to traditional storage media, SSDs have different characteristics, and many aspects of system design need to change to resolve the new issues SSDs raise and to leverage their advantages. In this talk, the internal architecture of high-performance SSDs and their major technologies will be discussed. We will also talk about which aspects of storage system design need to change and which new approaches have been invented to build flash storage systems, such as lock-free algorithms to improve multi-core efficiency, NVMe over Fabrics to enhance exporting storage over the network, and new data protection methods for NVMe SSDs.

Bio: Zhongjie Wu is a senior director at Memblaze, responsible for both flash storage system and firmware R&D. The FlashRAID and BlazeArray systems were recently developed, and related talks were delivered at FMS 2015 and 2016. Before Memblaze, he worked for EMC DataDomain, where he was responsible for backup storage R&D. Before that, he worked at the Institute of Computing Technology, Chinese Academy of Sciences, where he was in charge of the storage virtualization team. To date, he has applied for more than 20 patents and published more than 10 papers.

10:50-11:40

Keynote 7: FireBox: Designing the Warehouse-Scale Computer of 2020

Speaker: Martin Maas

Abstract: FireBox is an ongoing project at UC Berkeley that proposes a new system architecture for third-generation Warehouse-Scale Computers (WSCs). We envision future WSCs to be composed of multiple “FireBoxes”, a basic building block containing a thousand compute sockets and 100 Petabytes of disaggregated non-volatile memory connected via a low-latency, high-bandwidth optical switch. FireBoxes are connected to each other, peripherals and the outside world through a WSC-level network, to form a 1MW WSC with a million cores and an exabyte of non-volatile storage. Within a FireBox, each compute socket contains a multi-core System-on-a-Chip (SoC) connected to high-bandwidth on-package DRAM. Fast SoC network interfaces reduce the software overhead of communicating between application services, and high-radix network backplane switches connected by Terabit/sec optical fibers reduce the network’s contribution to tail latency. In this talk, I will discuss the hardware trends that motivate the FireBox vision, and the research problems we are investigating in its context. I will then highlight work from Berkeley towards simulating and prototyping FireBox SoCs based on the free and open RISC-V ISA. Finally, I will present an example from my own FireBox research, which is investigating hardware support for garbage-collected programming languages in the context of FireBox’s custom SoCs.

Bio: Martin Maas is a final-year PhD student in the Computer Science department at UC Berkeley, working with Krste Asanović and John Kubiatowicz. His main research interests are in managed language runtime systems, computer architecture and operating systems. He is currently working on hardware and software support for managed languages in data centers, and has previously worked on co-scheduling of parallel runtime systems and architectural support for security.

11:40-12:30

Keynote 8: CloudEth: Born for the cloud, Grown in the cloud

Speaker: Chunzhi He

Abstract: Cost, performance, and scalability are three key elements of cloud data center network design. This talk proposes CloudEth to seek a new balance among the three, and presents some test results of CloudEth.

Bio: Chunzhi He received his B.Eng. and Master’s degrees in communication engineering from the University of Electronic Science and Technology of China in 2006 and 2009, respectively. He received his Ph.D. degree from the Department of Electrical and Electronic Engineering at the University of Hong Kong in 2014. He joined Huawei Technologies Co., Ltd. in 2014, where he is currently an R&D engineer working on high-performance packet switching networks.

Talks by ICT

Talk 1: Scaling Out and Up Datacenter Servers with FPGAs

Speaker: Dr. Ke Zhang

Abstract: The need to perform data analytics on exploding data volumes, coupled with the rapidly changing workloads in cloud computing, places great pressure on datacenter servers. To mitigate this problem and improve hardware resource utilization across servers within a computing rack, emerging rack-scale computing servers relax the boundaries between discrete machines. Examples include Intel’s Rack Scale Architecture, Hewlett Packard’s Moonshot, and UC Berkeley’s FireBox, in which the compute, network, and storage resources of servers can be efficiently modularized and remotely accessed within the same rack. In this talk, I will introduce an in-house FPGA-based research platform named Titian2, a scalable and fully programmable system-level emulator for datacenter servers. In addition, two approaches will be presented, DEOI (Direct Extension of On-chip Interconnects) and Co-DIMM (Common Dual In-line Memory Module Channels), which support efficient data sharing among server nodes and increase the scalability of datacenter servers from two perspectives: scaling out horizontally and scaling up vertically. We have verified these two potential solutions and obtained performance improvements with benchmarks and applications (e.g., an in-memory database) running on Titian2. We will also give a live demo showing efficient data movement on the Titian2 hardware platform using the DEOI and Co-DIMM mechanisms.

Talk 2: Venice: A Cost Effective Data Center Server Architecture

Speaker: Dr. Rui Hou

Abstract: Consolidated server racks are quickly becoming the backbone of IT infrastructure for science, engineering, and business alike. These servers are still largely built and organized as they were when they were distributed, individual entities. Given that many fields increasingly rely on analytics of huge datasets, it makes sense to support flexible resource utilization across servers to improve cost-effectiveness and performance. In this talk, I will introduce our work named Venice, a family of data-center server architectures that builds a strong communication substrate as a first-class resource for server chips. Venice provides a diverse set of resource-joining mechanisms that enable user programs to efficiently leverage non-local resources. To better understand the implications of design decisions about system support for resource sharing, we have constructed a hardware prototype that allows us to more accurately measure end-to-end performance of at-scale applications and to explore tradeoffs among performance, power, and resource-sharing transparency. We present results from our initial studies analyzing these tradeoffs when sharing memory, accelerators, or NICs. We find that it is particularly important to reduce or hide latency, that data-sharing access patterns should match the features of the communication channels employed, and that inter-channel collaboration can be exploited for better performance.

Talk 3: A New Perspective on Software-Defined Architecture: The Computer as a Network

Speaker: Zihao Yu

Abstract: Traditional computer architecture primarily leverages abstracted interfaces such as the instruction set architecture (ISA) and the virtual memory mechanism to convey an application’s information to the hardware. However, as pointed out in the community white paper “21st Century Computer Architecture”, such conventional interfaces are insufficient to convey higher-level application requirements to the hardware, such as quality-of-service (QoS) and security, which are extremely important to data centers in the cloud era. We propose a new computer architecture, PARD (Programmable Architecture for Resourcing-on-Demand), that provides a new programming interface to enable more software-defined functionalities. PARD is inspired by the perspective that a computer is inherently a network in which hardware components communicate via packets (e.g., over the NoC or PCIe). Thus we can apply networking technologies, e.g., software-defined networking (SDN), to this intra-computer network. In this talk, I will present an FPGA-based PARD prototype to show how to reconstruct a computer as an SDN-like network, which enables new functionalities such as fully hardware-supported virtualization and programmable application-specific QoS. Additionally, I will present ongoing work on a QoS-aware data center software stack, including hypervisors, OS kernels, and cluster management systems.

Talk 4: BigDataBench: an Open Source Big Data Benchmark Suite

Speaker: Wanling Gao

Abstract: The boom in big data has sparked a tremendous outpouring of interest in storing and processing these data, and has consequently spawned a variety of big data systems. Big data benchmarking is particularly important and provides applicable yardsticks for evaluating these systems. However, the complexity, diversity, and rapid evolution of big data systems raise great challenges for big data benchmarking. This talk presents an open source big data benchmark suite, BigDataBench, which adopts an iterative and incremental methodology, covers five representative application domains, and contains diverse data models and workload types.

Talk 5: Twin-Load: Bridging the Gap between Conventional Direct-attached and Novel Buffer-on-Board Memory Systems

Speaker: Tianyue Lu

Abstract: Conventional systems with direct-attached DRAM struggle to meet growing memory capacity demands. Recent buffer-on-board (BOB) designs move some memory controller functionality to a separate buffer chip, which lets them support larger capacities (by adding more DRAM or denser, non-volatile components). Most processors exclusively implement either the direct-attached or the BOB approach. Combining both technologies within one processor has obvious benefits, but current memory-interface requirements complicate this straightforward solution. We propose Twin-Load technology to enable one processor to support both direct-attached and BOB memory. We build an asynchronous protocol over the existing, synchronous interface by splitting each memory read into twinned loads. The first acts as a prefetch to the buffer chip, and the second asynchronously fetches the data.

Talk 6: Hybrid Memory Management for Future Data-Center Servers

Speaker: Wei Wei

Abstract: DRAM-only memory systems suffer from high energy consumption due to refresh operations and power leakage, leading to higher memory system energy in data-center servers. Compared to DRAM, non-volatile memory technologies such as PCM, STT-RAM, and ReRAM require no refresh and promise better scalability. Hybrid memory systems comprising DRAM and non-volatile memories (NVMs) have thus been proposed. In this talk, I will first present our software/hardware cooperative hybrid memory management system, which effectively improves energy efficiency. I will also present our hybrid-memory-aware CPU cache replacement policy, which is, as far as we know, the first cache replacement mechanism designed for hybrid memory.