Bulk Multicore Architecture Relieves Programmer Burden in Parallel Architectures

12/18/2009

With his Bulk Multicore Architecture, Torrellas proposes using hardware architectures to make parallel programming easier.


In an article in the December 2009 issue of Communications of the ACM, University of Illinois computer science professor Josep Torrellas demonstrates that easing a programmer’s burden in parallel computing does not have to compromise system performance or increase the complexity of hardware implementation.

Josep Torrellas

 

In the article, Torrellas details his Bulk Multicore Architecture and calls for a change to the way in which multicore architectures are designed. 

“While the computer science and engineering community has frequently focused on advancing the technology for parallel processing, this time around the stakes are truly high,” says Torrellas.  “There is no other obvious route to higher computing performance than through parallelism.”

Torrellas calls for breakthroughs in all layers of the computing stack, including languages, programming models, compilation and runtime software, programming and debugging tools, and hardware architectures.

Torrellas takes aim at the hardware architecture challenges with his Bulk Multicore Architecture, a novel general-purpose multicore architecture designed specifically to address the complexity of parallel programming.  He proposes using the hardware architecture to relieve programmers (and runtime systems) of the burden of managing data sharing in parallel environments, and to provide new hardware-supported mechanisms that minimize programming errors.

The system eliminates one of the traditional tenets of processor architecture: the need to commit instructions in order so that the architectural state of the processor is available after each instruction.

In the Bulk Multicore Architecture, the default execution mode of a processor is to commit chunks of instructions at a time.  Torrellas explains, “Such a chunked mode of execution and commit is a hardware-only mechanism, invisible to the software running on the processor.  Moreover, its purpose is not to parallelize a thread, but to improve programmability and performance.”  Because the mechanism is invisible to the software, it places no restrictions on the programmer’s choice of programming model, language, or runtime system.
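
As a rough illustration of what committing a chunk at a time means, the following Python sketch models a chunk as a group of memory operations whose writes stay buffered until the whole chunk commits. It is a software toy, not the hardware mechanism described in the article: the names `Chunk` and `ToyMemory` are hypothetical, and the rule that a commit squashes other in-flight chunks that touched the same addresses is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical model of a chunk: its updates become visible together."""
    reads: set = field(default_factory=set)      # addresses the chunk has read
    writes: dict = field(default_factory=dict)   # address -> buffered value


class ToyMemory:
    """Hypothetical shared memory that only exposes committed chunks."""

    def __init__(self):
        self.committed = {}  # globally visible state

    def load(self, chunk, addr):
        chunk.reads.add(addr)
        # A chunk sees its own buffered writes first, then committed state.
        if addr in chunk.writes:
            return chunk.writes[addr]
        return self.committed.get(addr, 0)

    def store(self, chunk, addr, value):
        # Buffered: invisible to every other chunk until commit.
        chunk.writes[addr] = value

    def commit(self, chunk, in_flight):
        """Publish all of the chunk's writes at once.

        Returns the other in-flight chunks that read or wrote an address
        this chunk wrote (assumed here to be squashed and re-executed).
        """
        self.committed.update(chunk.writes)
        written = set(chunk.writes)
        return [c for c in in_flight
                if c is not chunk and written & (c.reads | set(c.writes))]


mem = ToyMemory()
a, b = Chunk(), Chunk()
mem.store(a, "x", 1)
mem.store(a, "y", 2)
seen_by_b = mem.load(b, "x")       # b cannot observe a's uncommitted write
squashed = mem.commit(a, [a, b])   # both of a's writes appear together
```

In this toy, no other chunk can observe `x` updated without `y`, which is the kind of all-or-nothing visibility that chunked commit provides; the software running inside each chunk needs no changes to get it.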

Importantly, Torrellas demonstrates that these programmability advantages do not come at the expense of performance.  He also explains that Bulk Multicore reduces not only the complexity of parallel programming but also hardware complexity in multiprocessor environments.  In fact, the system requires simpler processor hardware than current machines.

The idea of making parallel computing simple is at the core of the Illinois Universal Parallel Computing Research Center’s research agenda.  UPCRC Illinois is a joint research effort of the Illinois department of computer science and the Coordinated Science Laboratory, with funding from corporate partners Microsoft and Intel.

Torrellas and his team plan to expand their work on Bulk Multicore in several ways.  The team will be examining the scalability of the chunk commit model, as well as how the model can enable efficient support for new program-development and debugging tools, aggressive autotuners and compilers, and even novel programming models.

Read the complete Communications of the ACM article at http://mags.acm.org/communications/200912/ (page 58, “The Bulk Multicore Architecture for Improved Programmability”).
 



This story was published December 18, 2009.