Yungang Bao
       Professor @ ICT, CAS                                                                                         (Chinese version)
       Director @ ACS , ICT, CAS

Yungang is a professor of the State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS). He is the director of the Research Center for Advanced Computer System (ACS).

Yungang received his BS degree in computer science from Nanjing University in 2003, and his PhD degree in computer engineering from ICT, CAS in 2008, supervised by Prof. Jianping Fan and Prof. Mingyu Chen. During 2010-2012, he was a postdoctoral researcher in the Department of Computer Science, Princeton University, working with Prof. Kai Li on the Princeton Application Repository for Shared-Memory Computers (PARSEC) project. He was the winner of the CCF-Intel Young Faculty Researcher Program for 2013. He received the Outstanding Award of the Youth Innovation Promotion Association, Chinese Academy of Sciences in 2017 and won China's National Lofty Honor for Youth under 40 in 2019.


    No. 6, Kexueyuan South Road, Zhongguancun,
    Beijing, P.R.China 100190
    Phone: (86)10-62601034
    Email:  baoyg at ict dot ac dot cn

  • Computer Architecture

  • Operating System

  • System Performance Modeling and Evaluation

  • Cloud Computing

  • CV

  • LvNA: Labeled von Neumann Architecture
               Step-1: View a computer as a network

               Step-2: Reconstruct a computer as a software-defined network (SDN)

    Based on the insight of Computer as a Network, this project investigates architectural support for managing and controlling shared hardware resources, which pose significant challenges for warehouse-scale datacenters and real-time systems. LvNA enables a new hardware/software interface by introducing a hardware labeling mechanism that conveys software's semantic information, such as QoS and security requirements, to the underlying hardware. LvNA can correlate hardware labels with various entities (e.g., virtual machines, processes and threads), propagate labels along the whole datapath and program differentiated-service rules based on labels. We have implemented a RISC-V based FPGA prototype (a.k.a. Labeled RISC-V) that has already been open-sourced:

    6/2019: I was invited to give a talk on open-source chip ecosystem at SIGARCH Visioning Workshop co-located with ISCA 2019. [slides]
    4/2019: A paper on controlling QoS of SMT processors got accepted to ICS. [pdf]
    4/2019: A paper on analysis of Alibaba's datacenter traces got accepted to IWQoS. [pdf]
    3/2019: A demo video of the open-sourced Labeled RISC-V Cluster.
    10/2018: A tutorial will be held at MICRO'2018 [Tutorial Page]
    9/2018: I was invited to give a keynote presentation at ARM Research Summit 2018: The Case for Labeled Computer Architecture.
    7/2018: The FlameCluster prototype works -- a cluster of eight Labeled RISC-V nodes that can run Redis, Xapian etc. FlameCluster can enforce end-to-end performance isolation through labeled HW/SW co-design including labeled SoC, labeled container, labeled Ceph and labeled TCP/IP stack.
    6/2018: A tutorial was held at ISCA'2018 [Tutorial Page]
    11/2017: Project progress was presented at the 7th RISC-V workshop [slides] [video]
    11/2017: A collaborative work with Princeton University was published at ICCD. [pdf]
    10/2017: A paper on LvNA was presented at the CARRV workshop co-located with MICRO. [pdf]
    5/2017: Labeled RISC-V was presented at the 6th RISC-V workshop. [slides] [video]
    5/2017: An open-source Labeled RISC-V RTL was released:
    2/2017: A position paper on Labeled von Neumann Architecture was published. [pdf]
    7/2016: Received five-year funding of ~$4M from the Ministry of Science and Technology (MOST).
    6/2016: The open-sourced FPGA-based PARD prototype was released at ISCA 2016: CaaN Tutorial
    10/2015: Invited to participate in the Dagstuhl Seminar on "Rack-Scale Computing": Dagstuhl Seminar
    3/2015: Presentation at ASPLOS'15: Slides (12MB)
    1/2015: GEM5-based full-system PARD simulator was open sourced: Github
    11/2014: Our first paper on PARD was accepted by ASPLOS'15 (ranked #8 of ~280 submissions). [pdf]
    10/2012: Yungang returned to ICT from Princeton University and started a new project: Programmable Architecture for Resourcing-on-Demand (PARD).
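
The labeling idea described for LvNA above (attach a label to an entity, carry it with each request, and let shared components apply a software-programmed rule per label) can be illustrated with a toy model. This is only a minimal sketch: `Rule`, `ControlPlane`, `issue`, `serve` and the `llc_ways` / `mem_bw_percent` fields are hypothetical names chosen for illustration and are not taken from the Labeled RISC-V code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical differentiated-service rule for one label."""
    llc_ways: int        # assumed knob: cache ways reserved for the label
    mem_bw_percent: int  # assumed knob: share of memory bandwidth


@dataclass(frozen=True)
class Request:
    """A memory request that carries its label along the datapath."""
    label: int
    addr: int


class ControlPlane:
    """Hypothetical per-component rule table programmed by software."""

    def __init__(self, default: Rule) -> None:
        self.default = default
        self.rules: dict[int, Rule] = {}

    def program(self, label: int, rule: Rule) -> None:
        # Software installs or updates the rule for a label.
        self.rules[label] = rule

    def lookup(self, label: int) -> Rule:
        # Unlabeled or unknown traffic falls back to the default rule.
        return self.rules.get(label, self.default)


def issue(entity_labels: dict[str, int], entity: str, addr: int) -> Request:
    """Correlate an entity (VM, process, thread) with its label."""
    return Request(label=entity_labels[entity], addr=addr)


def serve(control_plane: ControlPlane, req: Request) -> Rule:
    """A shared component picks the rule from the label on the request."""
    return control_plane.lookup(req.label)


# Usage: a latency-critical VM gets a larger share than a batch VM.
entity_labels = {"vm-redis": 1, "vm-batch": 2}
cache_cp = ControlPlane(default=Rule(llc_ways=2, mem_bw_percent=10))
cache_cp.program(1, Rule(llc_ways=12, mem_bw_percent=70))
rule_for_redis = serve(cache_cp, issue(entity_labels, "vm-redis", 0x1000))
```

In this sketch the rule table sits beside each shared component, so changing a policy means reprogramming a table entry rather than changing the component itself.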

  • PARSEC 3.0: The Princeton Application Repository for Shared-Memory Computers

    PARSEC 3.0 makes three major changes: 1) It adds network benchmarks as well as a user-level parallel TCP/IP stack. 2) It provides SPLASH-2 and the inputs-enlarged SPLASH-2x. 3) It redesigns the framework to support external suites.

    12/2016: A summary paper was published in ACM SIGARCH Computer Architecture News. [pdf]
    2/2015: New version updated.
    9/2012: PARSEC 3.0 Beta version released. [doc]
    6/2011: SPLASH-2x released with several input datasets at different scales.
    6/2011: A Tutorial on PARSEC 3.0 was held at ISCA'11. [slides]
    10/2010: Yungang started this project as a postdoc working with Prof. Kai Li.

  • HMTT: Hybrid Memory Trace Toolkit


    Photo: HMTTv4 on a server collecting traces of DDR4 address and data

    HMTT was my Ph.D. project, supervised by Prof. Mingyu Chen, who leads a lab in ACS and has continued to build more advanced versions of HMTT over the past decade. Currently, there are two versions: HMTTv3 supports DDR3-800/1600 and HMTTv4 supports DDR4. HMTT can provide off-chip memory traces of many real-world applications, e.g., SPECCPU, SPECjbb, TPC-H/TPC-C on Oracle, and SPECWeb on Apache, with abundant important information, such as timestamp, pid, cpu-request/io-request, r/w, virt_addr and phys_addr.
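
To make the listed trace fields concrete, here is a minimal sketch of how one trace record might be represented and analyzed in software. The whitespace-separated text layout, the `TraceRecord` class and the helper names are assumptions made purely for illustration; they do not describe HMTT's actual trace format or tools.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceRecord:
    """One memory access, using the fields named in the text above."""
    timestamp: int
    pid: int
    source: str      # "cpu" or "io" (cpu-request / io-request)
    is_write: bool   # r/w flag
    virt_addr: int
    phys_addr: int


def parse_line(line: str) -> TraceRecord:
    """Parse a hypothetical text line: 'ts pid cpu|io r|w virt phys'."""
    ts, pid, source, rw, virt, phys = line.split()
    return TraceRecord(
        timestamp=int(ts),
        pid=int(pid),
        source=source,
        is_write=(rw == "w"),
        virt_addr=int(virt, 16),
        phys_addr=int(phys, 16),
    )


def writes_per_pid(records: list[TraceRecord]) -> Counter:
    """Count write accesses per process, a typical first-pass analysis."""
    return Counter(r.pid for r in records if r.is_write)


# Usage with made-up sample lines (not real trace data).
sample = [
    "100 42 cpu w 0x7f00 0x1a000",
    "105 42 cpu r 0x7f40 0x1a040",
    "110 7 io w 0x0 0x2b000",
]
records = [parse_line(line) for line in sample]
write_counts = writes_per_pid(records)
```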

    3/2019: A demo video of HMTT.
    2/2018: HMTTv4, which can collect both address and data traces on the DDR4 bus, was released. Contact Prof. Mingyu Chen for more details.
    4/2016: HMTTv3 was updated to support DDR3-1333.
    10/2015: HMTTv3 was deployed in Huawei Research Lab. [working photo]
    4/2014: A paper on characterizing VMs' memory accesses was published at VEE. [pdf]
    2/2014: A summary paper was published in TACO. [pdf]
    2/2013: HMTT Tutorial @ HPCA 2013 [slides]
    9/2012: A lightweight lock profiling tool w/ HMTT was published at PACT. [pdf]
    4/2012: HMTT was able to distinguish objects' memory accesses, published at ISPASS. [pdf]
    2/2010: HMTT collected DMA traces for I/O access optimization, published at HPCA. [pdf]
    6/2008: The first paper on HMTT was published at SIGMETRICS. [pdf]

(Full List)
  • Xin Jin, Yaoyang Zhou, Bowen Huang, Zihao Yu, Xusheng Zhan, Huizhe Wang, Sa Wang, Ningmei Yu, Ninghui Sun, Yungang Bao, QoSMT: Supporting Precise Performance Control for Simultaneous Multithreading Architecture. To appear in ACM International Conference on Supercomputing (ICS), 2019. [pdf]

  • Jing Guo, Zihao Chang, Sa Wang, Haiyang Ding, Yihui Feng, Liang Mao, Yungang Bao, Who Limits the Resource Efficiency of My Datacenter: An Analysis of Alibaba Datacenter Traces. To appear in IEEE/ACM International Symposium on Quality of Service (IWQoS), 2019. [pdf]

  • Wenlong Ma, Yuqing Zhu, Cheng Li, Mengying Guo, Yungang Bao, BiloKey: A Scalable Bi-Index Locality-Aware In-Memory Key-Value Store. To appear in IEEE Transactions on Parallel and Distributed Systems (TPDS), 2019.

  • Ke Zhang, Yisong Chang, Mingyu Chen, Yungang Bao, Zhiwei Xu, Computer Organization and Design Course with FPGA Cloud. In the SIGCSE Technical Symposium (SIGCSE), 2019. [pdf]

  • Yiwen Shao, Sa Wang, Yungang Bao, CryptZip: Squeezing out the Redundancy in Homomorphically Encrypted Backup Data. 9th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys), 2018. [pdf]

  • Qun Huang, Patrick P. C. Lee, Yungang Bao, SketchLearn: Relieving User Burdens in Approximate Measurement with Automated Statistical Inference. Annual Conference of the ACM Special Interest Group on Data Communication (SIGCOMM), 2018. [pdf]

  • Zihao Yu, Bowen Huang, Jiuyue Ma, Ninghui Sun, Yungang Bao, Labeled RISC-V: A New Perspective on Software-Defined Architecture. First Workshop on Computer Architecture Research with RISC-V (CARRV 2017) Co-located with MICRO, 2017. [pdf]

  • Yuqing Zhu, Jianxun Liu, Mengying Guo, Yungang Bao, Wenlong Ma, Zhuoyue Liu, Kunpeng Song, Yingchun Yang, BestConfig: tapping the performance potential of systems via automatic configuration tuning. Symposium on Cloud Computing (SoCC), 2017. [pdf]

  • Tianwei Zhang, Yuan Xu, Yungang Bao, Ruby B. Lee, CloudShelter: Protecting Virtual Machines' Memory Resource Availability in Clouds, IEEE International Conference on Computer Design (ICCD), 2017. [pdf]

  • Shiqi Lian, Yinhe Han, Ying Wang, Yungang Bao, Hang Xiao, Xiaowei Li, Ninghui Sun, Dadu: Accelerating Inverse Kinematics for High-DOF Robots, Proceedings of the 54th Annual Design Automation Conference (DAC) 2017. [pdf]

  • Yungang Bao, Sa Wang, Labeled von Neumann Architecture for Software-Defined Cloud. Journal of Computer Science and Technology (JCST), 32(2): 219-223, 2017. (A position paper) [pdf]

  • Xusheng Zhan, Yungang Bao, Christian Bienia, Kai Li, PARSEC3.0: A Multicore Benchmark Suite with Network Stacks and SPLASH-2X, SIGARCH Computer Architecture News (CAN) 44(5): 1-16, 2016 [pdf]

  • Jiuyue Ma, Xiufeng Sui, Ninghui Sun, Yupeng Li, Zhihao Yu, Bowen Huang, Tiani Xu, Zhicheng Yao, Yun Chen, Haibin Wang, Lixing Zhang, Yungang Bao, Supporting Differentiated Services in Computers via Programmable Architecture for Resourcing-on-Demand (PARD), in the 20th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2015. [pdf] [slides]

  • Zehan Cui, Sally A. McKee, Zhongbin Zha, Yungang Bao, Mingyu Chen, DTail: A Flexible Approach to DRAM Refresh Management, to appear in ACM International Conference on Supercomputing (ICS), 2014. [pdf]

  • Lei Liu, Yong Li, Zehan Cui, Yungang Bao, Mingyu Chen, Chengyong Wu, Going Vertical in Memory Management: Handling Multiplicity by Multi-policy, to appear in the 41st International Symposium on Computer Architecture (ISCA), 2014. [pdf]

  • Rui Ren, Jiuyue Ma, Xiufeng Sui, and Yungang Bao, D^2P: A Distributed Deadline Propagation Approach to Tolerate Long-Tail Latency in Datacenters, in the 5th ACM Asia-Pacific Workshop on Systems (APSys), 2014. [pdf]

  • Tianshi Chen, Qi Guo, Olivier Temam, Yue Wu, Yungang Bao, Zhiwei Xu, and Yunji Chen, Statistical Performance Comparisons of Computers, to appear in IEEE Transactions on Computers (IEEE TC), 2014. [pdf]

  • Zehan Cui, Licheng Chen, Yungang Bao, Mingyu Chen, A Swap-based Cache Set Index Scheme to Leverage both Superpage and Page Coloring Optimizations, to appear in the Design Automation Conference (DAC), 2014. [pdf]

  • Licheng Chen, Zhipeng Wei, Zehan Cui, Mingyu Chen, Haiyang Pan, Yungang Bao, CMD: Classification-based Memory Deduplication through Page Access Characteristics, in the 10th ACM SIGOPS/SIGPLAN International Conference on Virtual Execution Environments (VEE), 2014. [pdf]

  • Lei Liu, Zehan Cui, Yong Li, Yungang Bao, Mingyu Chen, Chengyong Wu, BPM/BPM+: Software-based Memory Partitioning Mechanisms for Eliminating DRAM Bank-/Channel-level Interferences in Multicore Systems, to appear in the ACM Transactions on Architecture and Code Optimization (TACO), 2014. [pdf]

  • Yongbing Huang, Licheng Chen, Zehan Cui, Yuan Ruan, Yungang Bao, Mingyu Chen, Ninghui Sun, HMTT: A Hybrid Hardware/Software Tracing System for Bridging the DRAM Access Trace's Semantic Gap, to appear in the ACM Transactions on Architecture and Code Optimization (TACO), 2014. [pdf]

  • Licheng Chen, Yanan Wang, Zehan Cui, Yongbing Huang, Yungang Bao, Mingyu Chen, Scattered Superpage: A Case for Bridging the Gap between Superpage and Page Coloring, Proceedings of the 31st IEEE International Conference on Computer Design (ICCD), Asheville, NC, 2013. [pdf]

  • Lei Liu, Zehan Cui, Mingjie Xing, Yungang Bao, Mingyu Chen, Chengyong Wu, A Software Memory Partition Approach for Eliminating Bank-level Interference in Multicore Systems, International Conference on Parallel Architectures and Compilation Techniques (PACT), 2012. [pdf]

  • Yongbing Huang, Zehan Cui, Licheng Chen, Wenli Zhang, Yungang Bao, Mingyu Chen, HaLock: Hardware-Assisted Lock Contention Detection in Multithreaded Applications, International Conference on Parallel Architectures and Compilation Techniques (PACT), 2012. [pdf]

  • Licheng Chen, Zehan Cui, Yongbing Huang, Yungang Bao, Guangming Tan, Mingyu Chen, A Lightweight Hybrid Hardware/Software Approach for Object-Relative Memory Profiling, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), New Brunswick, NJ, April 1-3, 2012. [pdf], [ppt]

  • Erlin Yao, Yungang Bao, Mingyu Chen, What Hill-Marty model learn from and break through Amdahl's law?, Information Processing Letters (IPL), 2011. [pdf]

  • Guangming Tan, Linchuan Li, Sean Triechle, Everett Phillips, Yungang Bao, Ninghui Sun, Fast Implementation of DGEMM on Fermi GPU, ACM/IEEE Supercomputing (SC), 2011. [pdf]

  • Dan Tang, Yungang Bao, Weiwu Hu, Mingyu Chen, DMA Cache: Using On-Chip Storage to Architecturally Separate I/O Data from CPU Data for Improving I/O Performance, the 16th IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2010. [pdf],[ppt]

  • Erlin Yao, Yungang Bao, Guangming Tan, Mingyu Chen, Extending Amdahl's Law in the Multicore Era, ACM SIGMETRICS Performance Evaluation Review (PER), Volume 37, Issue 2, September 2009. [pdf]

  • Yungang Bao, Mingyu Chen, Yuan Ruan, Li Liu, Jianping Fan, Qingbo Yuan, Bo Song, Jianwei Xu, HMTT: A Platform Independent Full-System Memory Trace Monitoring System, International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS), Annapolis, Maryland, USA, June 2-6, 2008. [pdf]


  • Building systems that will fail.

  • "It almost goes without saying that ambitious systems never quite work as expected. Things usually go wrong, sometimes in dramatic ways."

    -- Fernando J. Corbató   

  • Building systems that really work.

  • "We built an initial prototype, putting in the first 90% of the effort required to create a real system and ... to make INGRES really work."

    -- Michael Stonebraker   

  • My cup of tea: Berkeley Hardware Prototypes

My Princess of Violin
  • Lori was practicing Dance of the Little Swans with her teacher Ms. Yang.

  • Lori won the Gold Prize in the age-7 group of the Beijing Violin Performance Competition.

  • Lori won the First Prize in the 7-9 age group of the 2018 Hong Kong International Violin Competition (Beijing Area).

Last updated: July 7, 2019.