11/30/2011
University of Illinois computer science researchers have won the Best Paper Award at the Parallel Architectures and Compilation Techniques (PACT 2011) conference for their paper, “DeNovo: Rethinking the Memory Hierarchy for Disciplined Parallelism.” The research team, led by Illinois computer science professor Sarita Adve, is working on the DeNovo project, which takes a new approach to building multicore hardware. DeNovo exploits emerging software trends in disciplined parallel programming to make hardware simpler, faster, and more energy-efficient, all at the same time.
Most multicore programs use a shared-memory programming model. Shared-memory programs have many advantages, but they are known to be difficult to write, debug, and maintain. At the same time, shared-memory hardware is complex and inefficient, leading to unnecessary energy consumption and performance bottlenecks. After decades of trying, researchers have found it difficult even to develop satisfactory shared-memory semantics for common languages such as C++ and Java. A recent article, co-authored by Adve, calls on the research community to rethink how we design both parallel languages and parallel hardware.
At the root of these problems is what the Illinois team refers to as “wild shared memory” behavior. Shared-memory programs tend to exhibit unstructured parallelism with implicit communication and side effects, leading to hard-to-debug data races and ubiquitous non-determinism. The Illinois team believes general-purpose languages must enforce more discipline and eliminate such wild behavior by design if parallel computing is to become tractable for mainstream programmers. Such a discipline would enforce more structured parallelism and make the side effects of a parallel task more explicit. Many software researchers today are working on such an approach, including pioneering work by an Illinois team on the Deterministic Parallel Java (DPJ) language, led by Vikram Adve.
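To make the idea of explicit side effects concrete, here is a minimal sketch in Python. It is not DPJ syntax and not part of DeNovo; the function names (`check_disjoint`, `scale_region`, `parallel_scale`) are hypothetical and chosen only for illustration. Each parallel task declares the region of the array it writes, and the program refuses to run tasks whose declared regions overlap. The sketch performs this check at run time, which is an assumption made for brevity rather than a description of how any particular language enforces the discipline.

```python
from concurrent.futures import ThreadPoolExecutor


def check_disjoint(regions):
    """Reject task sets whose declared write regions overlap.

    Each region is a half-open (start, end) index range. After sorting
    by start, any overlap shows up between two adjacent regions.
    """
    ordered = sorted(regions)
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start < prev_end:
            raise ValueError("declared write regions overlap")


def scale_region(data, start, end, factor):
    """A task whose only side effect is writing data[start:end]."""
    for i in range(start, end):
        data[i] *= factor


def parallel_scale(data, regions, factor):
    """Run one task per declared region (illustrative helper)."""
    check_disjoint(regions)  # side effects are explicit and checked up front
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(scale_region, data, start, end, factor)
            for start, end in regions
        ]
        for future in futures:
            future.result()  # re-raise any exception from a task
    return data


if __name__ == "__main__":
    values = list(range(8))
    print(parallel_scale(values, [(0, 4), (4, 8)], 2))
```

Because no two tasks write the same element and no task reads an element that another task writes, the result does not depend on how the threads are scheduled. Passing overlapping regions such as `[(0, 2), (1, 3)]` raises `ValueError` before any task starts.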
The DeNovo team, working closely with the DPJ team, has shown that the same disciplined parallel programming features that simplify software can also enable hardware that scales better in performance, energy, and complexity. As their first step, they have developed a cache coherence protocol and consistency model that takes an order of magnitude less time to verify and runs some applications in less than half the time, with less than half the network traffic and cache misses, compared with the state of the art. The simplicity and the low network and cache traffic mean that the performance gains come with significant power and energy benefits. It is rare in computer architecture for a hardware design to reduce complexity while improving performance and power all at once.
According to Sarita Adve, “this paper is a first step towards an ambitious vision. While it presents significant new technical contributions that we hope will eventually be adopted, it also opens up many new questions. We hope that the largest impact of this paper will be in inspiring a broad research agenda anchored in a more disciplined approach to parallel systems. The paper motivates hardware research driven by disciplined programming models and also seeks to inspire architects to extend their influence on the development of such models. We believe this is an opportune time for such a co-designed evolution of the current hardware-software ecosystem.”
The paper was authored by a team of Illinois computer science graduate students and faculty, including Byn Choi, Rakesh Komuravelli, Hyojin Sung, Robert Smolinski, Nima Honarmand, Sarita Adve, and Vikram Adve, in collaboration with Nicholas Carter and Ching-Tsun Chou from Intel. The work was supported in part by Microsoft and Intel through the Universal Parallel Computing Research Center and the Illinois-Intel Parallelism Center, and by the National Science Foundation.