Workshop on Parallel Programming for Resilience and Energy Efficiency
(to be held as part of Principles and Practice of Parallel Programming,
March 12-16, 2016)
Nowadays, the number of components in High Performance Computing (HPC)
systems increases at the pace
dictated by Moore's Law, but the mean time between failures (MTBF) for the
complete system is significantly shrinking.
For example, when accounting for the instruction & data caches and register
files, the mean time
between soft errors for the Sequoia supercomputer at Lawrence Livermore
National Laboratory is estimated to be 1.5 days.
As HPC systems move into the Exascale era, the number of system components
will increase by up to
three orders of magnitude, and MTBF will further deteriorate, thus
making resilience a fundamental
challenge. This scenario renders current system solutions to resilience,
such as coordinated checkpointing, infeasible,
and motivates the use of algorithmic, programming model, or runtime system
approaches to improve
the resilience of parallel applications at scale.
While a resilience crisis is looming in the HPC domain, the end of
Dennard scaling (i.e., the ability to shrink the feature size of integrated
circuits while maintaining a constant power density) has made energy
consumption a primary design concern, on par with performance,
for which holistic solutions are currently necessary, from the hardware to
the application software.
The Green500 ranking, based on the LINPACK benchmark, shows remarkable
improvements in the MFLOPS/W (millions of floating-point
arithmetic operations per Joule)
of recent HPC facilities. However, with the cost of 1 MW being close to
$1 million, any improvement in this
metric will surely have an enormous positive impact on the deployment of
future Exascale systems. Despite a flurry of research
in recent years on techniques that improve the energy-efficiency of HPC
systems via software intervention, energy remains transparent to existing
parallel programming models used in production settings.
The quest for higher energy-efficiency in future HPC systems is inherently
connected to the quest for enhanced resilience for two reasons:
First, resilience techniques have a non-trivial energy cost. Second,
ongoing efforts to further improve the energy-efficiency of hardware
at the device level (such as operating hardware below its nominal margins
or replacing DDR technology with non-volatile memory technologies)
may compromise hardware reliability.
The purpose of this workshop is to explore the space of techniques for
improving the resilience and energy-efficiency (REE) of parallel programs
at the algorithmic and language levels. We are particularly interested in papers
that present cross-cutting techniques that trade energy-efficiency with
resilience.
We solicit original papers that include but are not limited to the
following topics:
* Programming languages, interfaces, and general software techniques for
REE.
* Scheduling and mapping for REE.
* Run-times for REE.
* Algorithmic techniques for REE.
* Programming models for computing paradigms that improve REE, such as
near-threshold computing, approximate computing, or neuromorphic computing.
* Applications and case studies of success.
Papers should not exceed ten single-spaced, double-column pages (including
figures, tables and references) using a 10-point font on 8.5x11-inch pages.
We suggest using the IEEE two-column template for conference proceedings.
Submissions will be judged based on correctness, originality, technical
strength, significance, presentation quality, and appropriateness.
Submitted papers should not have appeared in or be under consideration for
another venue. A full peer-review process will be followed, with each paper
being reviewed by at least three members of the program committee. Submissions will be made
Submission of Papers: November 23, 2015.
Notification of Acceptance: January 5, 2016.
Workshop: March 12-16, 2016 (half day).
* Christos D. Antonopoulos, Electrical and Computer Engineering Department
of the University of Thessaly, Greece.
* Dimitrios S. Nikolopoulos, EEECS, at Queen's University of Belfast,
Northern Ireland, United Kingdom.
* Oscar Plata, Department of Computer Architecture at the University of
* Enrique S. Quintana-Orti, Department of Computer Engineering & Sciences,
Universidad Jaume I of Castellon, Spain.
To be confirmed.
Extended versions of best papers will appear, after an additional review
process, in a special issue of the Elsevier Parallel Computing journal.