Accelerating Computational Science and Engineering

PARALLEL COMPUTING:

ACCELERATING COMPUTATIONAL
SCIENCE AND ENGINEERING (CSE)
Advances in Parallel Computing
This book series publishes research and development results on all aspects of parallel computing.
Topics may include one or more of the following: high-speed computing architectures (Grids,
clusters, Service Oriented Architectures, etc.), network technology, performance measurement,
system software, middleware, algorithm design, development tools, software engineering,
services and applications.

Series Editor:
Professor Dr. Gerhard R. Joubert

Volume 25
Recently published in this series
Vol. 24. E.H. D’Hollander, J.J. Dongarra, I.T. Foster, L. Grandinetti and G.R. Joubert (Eds.),
Transition of HPC Towards Exascale Computing
Vol. 23. C. Catlett, W. Gentzsch, L. Grandinetti, G. Joubert and J.L. Vazquez-Poletti (Eds.),
Cloud Computing and Big Data
Vol. 22. K. De Bosschere, E.H. D’Hollander, G.R. Joubert, D. Padua and F. Peters (Eds.),
Applications, Tools and Techniques on the Road to Exascale Computing
Vol. 21. J. Kowalik and T. Puźniakowski, Using OpenCL – Programming Massively Parallel
Computers
Vol. 20. I. Foster, W. Gentzsch, L. Grandinetti and G.R. Joubert (Eds.), High Performance
Computing: From Grids and Clouds to Exascale
Vol. 19. B. Chapman, F. Desprez, G.R. Joubert, A. Lichnewsky, F. Peters and T. Priol (Eds.),
Parallel Computing: From Multicores and GPU’s to Petascale
Vol. 18. W. Gentzsch, L. Grandinetti and G. Joubert (Eds.), High Speed and Large Scale
Scientific Computing
Vol. 17. F. Xhafa (Ed.), Parallel Programming, Models and Applications in Grid and P2P
Systems
Vol. 16. L. Grandinetti (Ed.), High Performance Computing and Grids in Action
Vol. 15. C. Bischof, M. Bücker, P. Gibbon, G.R. Joubert, T. Lippert, B. Mohr and F. Peters
(Eds.), Parallel Computing: Architectures, Algorithms and Applications

Volumes 1–14 published by Elsevier Science.

ISSN 0927-5452 (print)
ISSN 1879-808X (online)
Parallel Computing:
Accelerating Computational
Science and Engineering (CSE)

Edited by
Michael Bader
Technische Universität München, Munich, Germany

Arndt Bode
Leibniz Supercomputing Centre, Munich, Germany

Hans-Joachim Bungartz
Technische Universität München, Munich, Germany

Michael Gerndt
Technische Universität München, Munich, Germany

Gerhard R. Joubert
Technical University Clausthal, Germany
and
Frans Peters
Philips Research, Netherlands

Amsterdam • Berlin • Tokyo • Washington, DC


© 2014 The authors and IOS Press.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system,
or transmitted, in any form or by any means, without prior written permission from the publisher.

ISBN 978-1-61499-380-3 (print)
ISBN 978-1-61499-381-0 (online)
Library of Congress Control Number: 2014932893

Publisher
IOS Press BV
Nieuwe Hemweg 6B
1013 BG Amsterdam
Netherlands
fax: +31 20 687 0019
e-mail: order@iospress.nl

Distributor in the USA and Canada
IOS Press, Inc.
4502 Rachael Manor Drive
Fairfax, VA 22032
USA
fax: +1 703 323 3668
e-mail: iosbooks@iospress.com

LEGAL NOTICE
The publisher is not responsible for the use which might be made of the following information.

PRINTED IN THE NETHERLANDS


Parallel Computing: Accelerating Computational Science and Engineering (CSE)
M. Bader et al. (Eds.)
IOS Press, 2014
© 2014 The authors and IOS Press. All rights reserved.

Preface
This volume of the series “Advances in Parallel Computing” contains the proceedings
of the International Conference on Parallel Computing – ParCo 2013 – held from 10
to 13 September 2013 in Garching, Germany. The conference was hosted by the
Technische Universität München (Department of Informatics) and the Leibniz
Supercomputing Centre.
With ParCo 2013, the biennial ParCo conference series now looks back at 30 years
of top-level research in parallel algorithms, architectures and applications. It has finally
entered an era in which parallel computing – for many years the enabling technology of
high-end machines – is now ubiquitous and the key for the efficient use of any kind of
computer architecture: from embedded and personal up to exascale systems.
The trend towards heterogeneous architectures, multiple levels of parallelism and
towards higher and higher core numbers of supercomputing platforms, which was
already addressed in the previous ParCo instances, can now be seen in full bloom.
Parallel programming models for multi- and manycore CPUs, GPUs, FPGAs, and
heterogeneous platforms have been one of the clear focal points at ParCo 2013. In
addition, performance engineering processes, including analysis, tools and metrics,
must be adapted to these new and innovative platforms. It also becomes apparent from
the contributions that novel numerical algorithms are required: for basic tasks in
numerical linear algebra as well as for adaptive or space-time parallel simulations.
Most important, all these aspects need to be combined in the parallelisation and
optimisation of large-scale applications, in order to make parallel computing –
including the processing of large data sets (“Big Data”) – a persistent driver of
research in many fields of science and engineering.
ParCo 2013 strongly profited from its 12 mini-symposia (including an industry
session and a special PhD Symposium), which represented and intensified the
discussion of current “hot topics” in high performance and parallel computing in an
excellent manner. At least three mini-symposia were dedicated to large-scale
supercomputing, in particular. Three mini-symposia focused on novel challenges
arising from parallel architectures (multi-/manycore, heterogeneous platforms,
FPGAs). A further mini-symposium hotspot was established by the “multi”-challenges:
multi-level algorithms as well as multi-scale, multi-physics and multi-dimensional
problems.
We would like to express our sincerest thanks to ParCo’s four keynote speakers –
Pete Beckman, Sudip Dosanjh, Wolfgang Nagel and Martin Schulz – who, in their
presentations, gave an exciting overview of both promises and challenges for the age of
exascale and Big Data. We are equally obliged to all presenters at the conference, all
authors and co-authors who contributed to these proceedings, and of course to all
attendees at ParCo 2013 – all of them contributed to the excellent scientific quality of
the conference and to its inspiring atmosphere. Last, but definitely not least, special
thanks go to all (co-)organisers, including the mini-symposium organisers, to the
members of the international programme committee, and to all persons who assisted
during the conference.

Michael Bader
Arndt Bode
Hans-Joachim Bungartz
Michael Gerndt
Gerhard R. Joubert
Frans Peters

Date: 2013-12-01

Conference Committee
Gerhard Joubert (Germany/Netherlands) (Conference Chair)
Michael Bader (Germany)
Arndt Bode (Germany)
Hans-Joachim Bungartz (Germany)
Michael Gerndt (Germany)
Frans Peters (Netherlands)

Advisory Committee
Thomas Lippert (Germany)
Thierry Priol (France)
Koen De Bosschere (Belgium)
Jack Dongarra (USA)

Minisymposium Committee
Tobias Weinzierl (Germany)
Miriam Mehl (Germany)

Organising & Exhibition Committee
Michael Bader (Germany)
Arndt Bode (Germany)
Hans-Joachim Bungartz (Germany)
Michael Gerndt (Germany)
Houssam Haitof (Germany)
Herbert Huber (Germany)
Carsten Trinitis (Germany)
Josef Weidendorfer (Germany)

Finance Committee
Frans Peters (Netherlands)

Conference Programme Committee
Arndt Bode (Germany) (Chair)
Michael Bader (Germany) (Chair)
Rosa Badia (Spain) (Co-Chair)

Peter Arbenz (Switzerland)
Pete Beckman (USA)
Mark Bull (UK)
Andrea Clematis (Italy)
Luisa D’Amore (Italy)
Erik D’Hollander (Belgium)
Michel Dayde (France)
Bjorn De Sutter (Belgium)
Frank Dehne (Canada)
Paul Feautrier (France)
Basilio Fraguela (Spain)
Franz Franchetti (USA)
Efstratios Gallopoulos (Greece)
William Gropp (USA)
David Ham (UK)
Torsten Hoefler (Switzerland)
Lei Huang (USA)
Thomas Huckle (Germany)
Hai Jin (China)
Wolfgang Karl (Germany)
Christoph Kessler (Sweden)
Harald Köstler (Germany)
Markus Kowarschik (Germany)
Bettina Krammer (France)
Dieter Kranzlmüller (Germany)
Herbert Kuchen (Germany)
Alexey Lastovetsky (Ireland)
Jin-Fu Li (Taiwan)
Bernd Mohr (Germany)
Wolfgang E. Nagel (Germany)
Victor Pankratius (USA)
Christian Pérez (France)
Oscar Plata (Spain)
Sabri Pllana (Austria)
Thierry Priol (France)
Enrique Quintana-Ortí (Spain)
J. (Ram) Ramanujam (USA)
Dirk Roose (Belgium)
Gudula Rünger (Germany)
Peter Sanders (Germany)
Martin Schulz (USA)
Dirk Stroobandt (Belgium)
Tor Sørevik (Norway)
Domenico Talia (Italy)
Paco Tirado (Spain)
Denis Trystram (France)

Programme Committees of Mini-Symposia

ParCo 2013 PhD Symposium

Josef Weidendorfer (Symposium Chair) (Germany)
Michael Bader (Symposium Co-Chair) (Germany)

Jens Breitbart (Germany)
Carsten Burstedde (Germany)
Karl Fürlinger (Germany)
Rainer Keller (Germany)
Harald Köstler (Germany)
Dirk Pflüger (Germany)
Martin Schulz (USA)
Carsten Trinitis (Germany)

ParaFPGA-2013: Parallel Computing with FPGA’s

Erik H. D’Hollander (Symposium Chair) (Belgium)
Dirk Stroobandt (Programme Committee Chair) (Belgium)
Abdellah Touhafi (Programme Committee Co-Chair) (Belgium)

Abbes Amira (United Kingdom)
Georgi Gaydadjiev (Netherlands)
Mike Hutton (USA)
Tsutomu Maruyama (Japan)
Dionisios Pnevmatikos (Greece)
Viktor Prasanna (USA)
Mazen A.R. Saghir (Qatar)
Donatella Sciuto (Italy)
Sascha Uhrig (Germany)
Sotirios G. Ziavras (USA)

High-Dimensional Meets Parallel – Algorithms and Applications

Dirk Pflüger (Symposium Chair) (Germany)
Hans-Joachim Bungartz (Symposium Co-Chair) (Germany)
Markus Hegland (Symposium Co-Chair) (Australia)

Application Autotuning for HPC (Architectures)

Siegfried Benkner (Austria)
Matthias Brehm (Germany)
Michael Gerndt (Germany)
Wolfram Hesse (Germany)
Anna Sikora (Spain)

Extreme Scaling on SuperMUC

Ferdinand Jamitzky (Symposium Chair) (Germany)
Nicolay Hammer (Symposium Co-chair) (Germany)
Helmut Satzger (Symposium Co-chair) (Germany)

Parallel Programming for Heterogeneous Architectures

Bettina Krammer (Symposium Chair) (Germany)
Hartmut Mix (Symposium Co-chair) (Germany)
Markus Geimer (Symposium Co-chair) (Germany)

DECI Minisymposium
(PRACE – Partnership for Advanced Computing in Europe)
Chris Johnson (Symposium Chair) (United Kingdom)

Efficient Highly Scalable Multi-level Preconditioners for Linear Systems
Matthias Bolten (Symposium Chair) (Germany)

Performance Modeling and Engineering for Multi-/Many-Core Architectures
Gerhard Wellein (Symposium Chair) (Germany)

Space-filling Curves in Parallel Computing

Dirk Roose (Symposium Chair) (Belgium)
Michael Bader (Germany)
Tobias Weinzierl (Germany)

Interaction and HPC: Multi-Scale/Multi-Physics Applications

Ralf-Peter Mundani (Symposium Chair) (Germany)

ParCo2013 Sponsors

AMD – Advanced Micro Devices, Inc.
Bull
EUROTECH S.p.A.
EXTOLL GmbH
FUJITSU
IBM Deutschland GmbH
MEGWARE Computer Vertrieb und Service GmbH
NVIDIA Corporation
GWT-TUD GmbH (Vertriebspartner VAMPIR)

Contents
Preface v
Michael Bader, Arndt Bode, Hans-Joachim Bungartz, Michael Gerndt,
Gerhard R. Joubert and Frans Peters
Conference Organisation vii

Invited Talks

Extreme Data Science at the National Energy Research Scientific Computing
(NERSC) Center 3
Sudip Dosanjh, Shane Canon, Jack Deslippe, Kjiersten Fagnan,
Richard Gerber, Lisa Gerhardt, Jason Hick, Douglas Jacobsen,
David Skinner and Nicholas J. Wright
Performance Analysis Techniques for the Exascale Co-Design Process 19
Martin Schulz, Jim Belak, Abhinav Bhatele, Peer-Timo Bremer,
Greg Bronevetsky, Marc Casas, Todd Gamblin, Katherine E. Isaacs,
Ignacio Laguna, Joshua Levine, Valerio Pascucci, David Richards
and Barry Rountree

Parallel Programming Models

XMP-IO Function and Its Application to MapReduce on the K Computer 35
Tomotake Nakamura and Mitsuhisa Sato
POLCA – A Programming Model for Large Scale, Strongly Heterogeneous
Infrastructures 43
Lutz Schubert, Jan Kuper and José Gracia
Exploitation of Quality/Throughput Tradeoffs in Image Processing Through
Invasive Computing 53
Alexandru Tanase, Vahid Lari, Frank Hannig and Jürgen Teich
An Efficient Thread Mapping Strategy for Multiprogramming on Manycore
Processors 63
Ashkan Tousimojarad and Wim Vanderbauwhede
A Scalable Farm Skeleton for Heterogeneous Parallel Programming 72
Steffen Ernsting and Herbert Kuchen
Towards Truly Boolean Arrays in Data-Parallel Array Processing 82
Clemens Grelck and Hraban Luyat
Deep Packet Inspection on Commodity Hardware Using FastFlow 92
M. Danelutto, L. Deri, D. De Sensi and M. Torquati

Performance Analysis and Tools

Formalizing Bottlenecks in Task-Based OpenMP Applications 103
Shajulin Benedict, Michael Gerndt and Diana-Mihaela Gudu
Characterizing Performance of Applications on Blue Gene/Q 113
Paul F. Baumeister, Hans Boettiger, Thorsten Hater, Michael Knobloch,
Thilo Maurer, Andrea Nobile, Dirk Pleiter and Nicolas Vandenbergen
Specification of Periscope Tuning Framework Plugins 123
Robert Mijaković, Antonio Pimenta Soto, Isaías A. Comprés Ureña,
Michael Gerndt, Anna Sikora and Eduardo César

Parallel Numerical Linear Algebra

On Using Speculative Computations for Parallel Reduction to Tridiagonal Form 135
Sergey V. Kuznetsov
Fast Approximate Solution of the Non-Symmetric Generalized Eigenvalue
Problem on Multicore Architectures 143
Peter Benner, Martin Köhler and Jens Saak
Locality Optimization on a NUMA Architecture for Hybrid LU Factorization 153
Adrien Rémy, Marc Baboulin, Masha Sosonkina and Brigitte Rozoy
Variable Block Algebraic Recursive Multilevel Solver (VBARMS) for Sparse
Linear Systems 163
Bruno Carpentieri, Jia Liao and Masha Sosonkina
A Proposal of a Single-Synchronized Solver Suited to Large Scale Linear
Systems on Parallel Computers with Distributed Memory 173
Seiji Fujino, Keiichi Murakami and Kosuke Iwasato
Approximate Inverse Preconditioners for Krylov Methods on Heterogeneous
Parallel Computers 183
Daniele Bertaccini and Salvatore Filippone
Cache and Energy Efficiency of Sparse Matrix-Vector Multiplication
for Different BLAS Numerical Types with the RSB Format 193
Michele Martone
Heterogeneous Sparse Matrix Computations on Hybrid GPU/CPU Platforms 203
Valeria Cardellini, Alessandro Fanfarillo and Salvatore Filippone

Parallel Algorithms

MapReduce Streaming Algorithms for Laplace Relaxation on the Cloud 215
Atanas Radenski and Boyana Norris
Space Exploration Using Parallel Orbits: A Study in Parallel Symbolic
Computing 225
Vladimir Janjic, Christopher Brown, Max Neunhöffer, Kevin Hammond,
Steve Linton and Hans-Wolfgang Loidl

SFC-Based Communication Metadata Encoding for Adaptive Mesh Refinement 233
Martin Schreiber, Tobias Weinzierl and Hans-Joachim Bungartz
Graph Repartitioning with Both Dynamic Load and Dynamic Processor
Allocation 243
Clément Vuchener and Aurélien Esnard
ForestClaw: Hybrid Forest-of-Octrees AMR for Hyperbolic Conservation Laws 253
Carsten Burstedde, Donna Calhoun, Kyle Mandli and Andy R. Terrel
A Space-Time Parallel Solver for the Three-Dimensional Heat Equation 263
Robert Speck, Daniel Ruprecht, Matthew Emmett, Matthias Bolten
and Rolf Krause
An Efficient Pipelined Implementation of Space-Time Parallel Applications 273
Toshiya Takami and Daiki Fukudome

GPU Computing and Applications

Efficient GPU-Based Optimization of Volume Meshes 285
Eric Shaffer, Zuofu Cheng, Raine Yeh, George Zagaris and Luke Olson
Fast Uniform Grid Construction on GPGPUs Using Atomic Operations 295
Davide Barbieri, Valeria Cardellini and Salvatore Filippone
Porting Large HPC Applications to GPU Clusters: The Codes GENE
and VERTEX 305
Tilman Dannert, Andreas Marek and Markus Rampp
Numerical Simulation of the Low Compressible Viscous Gas Flows
on GPU-Based Hybrid Supercomputers 315
Alexander A. Davydov and Evgeny V. Shilnikov
Simulation of Multiphase Flows in the Subsurface on GPU-Based
Supercomputers 324
Marina Trapeznikova, Natalia Churbanova, Anastasiya Lyupa
and Dmitry Morozov
Atomic Computing – A Different Perspective on Massively Parallel Problems 334
Andrew Brown, Rob Mills, Jeff Reeve, Kier Dugan and Steve Furber

Parallelisation and Optimisation of Large-Scale Applications

Accelerating SeisSol by Generating Vectorized Code for Sparse Matrix
Operators 347
Alexander Breuer, Alexander Heinecke, Michael Bader
and Christian Pelties
Experience with the MPI/STARSS Programming Model on a Large Production
Code 357
Dirk Brömmel, Paul Gibbon, Marta Garcia, Víctor López,
Vladimir Marjanović and Jesús Labarta

Exploiting Data- and Task-Parallelism in the Solution of Riccati Equations
on Multicore Servers and GPUs 367
P. Benner, P. Ezzatti, E.S. Quintana-Ortí and A. Remón
Testing and Implementing Some New Algorithms Using the FFTW Library
on Massively Parallel Supercomputers 375
Massimiliano Guarrasi, Ning Li, Sandro Frigio, Andrew Emerson
and Giovanni Erbacci
Performance Measurements of MHD Simulation for Planetary Magnetosphere
on Peta-Scale Computer FX10 387
Keiichiro Fukazawa, Takeshi Nanri and Takayuki Umeda
Parallel Simulations of Self-Propelled Microorganisms 395
Kristina Pickl, Matthias Hofmann, Tobias Preclik, Harald Köstler,
Ana-Sunčana Smith and Ulrich Rüde
Improving Communication Performance of Sparse Linear Algebra
for an Atomistic Simulation Application 405
Christiane Pousa, Jürg Hutter and Joost Vandevondele
NEMORB’s Fourier Filter and Distributed Matrix Transposition on Petaflop
Systems 415
Tiago Ribeiro and Matthieu Haefele
Parallel Computing Design for Exact Diagonalization Scheme on Multi-Band
Hubbard Cluster Models 427
Susumu Yamada, Toshiyuki Imamura and Masahiko Machida

ParCo PhD Symposium

ParCo 2013 PhD Symposium 439
Josef Weidendorfer and Michael Bader
Numerical Experiments with New Algorithms for Parallel Decomposition
of Large Computational Meshes 441
Evdokia Golovchenko, Elizaveta Dorofeeva, Irina Gasilova
and Alexey Boldarev
A Distributed Algorithm for the Permutation Flow Shop Problem –
An Empirical Analysis 451
Samia Kouki, Mohamed Jemni and Talel Ladhari
GPI2 for GPUs: A PGAS Framework for Efficient Communication in Hybrid
Clusters 461
Lena Oden
A Fault Tolerant Implementation of Multi-Level Monte Carlo Methods 471
Stefan Pauli, Manuel Kohler and Peter Arbenz
High Performance CPU/GPU Multiresolution Poisson Solver 481
Wim M. Van Rees, Diego Rossinelli, Panagiotis Hadjidoukas
and Petros Koumoutsakos

Mini-Symposium “Parallel Computing with FPGAs (ParaFPGA2013)”

ParaFPGA 2013: Harnessing Programs, Power and Performance in Parallel
FPGA Applications 493
Erik H. D’Hollander, Dirk Stroobandt and Abdellah Touhafi
High-Level Synthesis Revised: Generation of FPGA Accelerators
from a Domain-Specific Language Using the Polyhedron Model 497
Moritz Schmid, Frank Hannig, Alexandru Tanase and Jürgen Teich
Compiling a Dataflow-Based Language Abstraction onto an FPGA 507
Eva Burrows
Timing Driven C-Slow Retiming on RTL for MultiCores on FPGAs 515
Tobias Strauch
Performance and Resource Modeling for FPGAs Using High-Level Synthesis
Tools 523
Bruno Da Silva, An Braeken, Erik H. D’Hollander and Abdellah Touhafi
Interactive Graph Cuts Using FPGA 532
Daichi Kobori and Tsutomu Maruyama
An Image Filter System Based on Dynamic Partial Reconfiguration on FPGA 540
Hisaaki Kurita and Tsutomu Maruyama
Investigating Energy Consumption of an SRAM-Based FPGA for Duty-Cycle
Applications 548
Khurram Shahzad and Bengt Oelmann

Mini-Symposium “High-Dimensional Meets Parallel – Algorithms and Applications”

High-Dimensional Meets Parallel: Algorithms and Applications 563
Hans-Joachim Bungartz, Dirk Pflüger and Markus Hegland
Global Communication Schemes for the Sparse Grid Combination Technique 564
Philipp Hupp, Riko Jacob, Mario Heene, Dirk Pflüger
and Markus Hegland
Load Balancing for Massively Parallel Computations with the Sparse Grid
Combination Technique 574
Mario Heene, Christoph Kowitz and Dirk Pflüger
A Parallel Fault Tolerant Combination Technique 584
Brendan Harding and Markus Hegland
Managing Complexity in the Parallel Sparse Grid Combination Technique 593
J.W. Larson, P.E. Strazdins, M. Hegland, B. Harding, S. Roberts, L. Stals,
A.P. Rendell, Md.M. Ali and J. Southern
Scalability and Fault Tolerance of the Alternating Direction Method
of Multipliers for Sparse Grids 603
Valeriy Khakhutskyy, Dirk Pflüger and Markus Hegland

Mini-Symposium “Application Autotuning for HPC (Architectures)”

Mini-Symposium on Application Autotuning for HPC 615
Siegfried Benkner, Matthias Brehm, Michael Gerndt, Wolfram Hesse
and Anna Sikora
Investigating Performance Benefits from OpenACC Kernel Directives 616
Benjamin Eagan, Gilles Civario and Renato Miceli
Application-Independent Autotuning for GPUs 626
Martin Tillmann, Thomas Karcher, Carsten Dachsbacher
and Walter F. Tichy
Autotuning of Pattern Runtimes for Accelerated Parallel Systems 636
Enes Bajrovic, Siegfried Benkner, Jiri Dokulil and Martin Sandrieser
Empirical Performance Modeling of GPU Kernels Using Active Learning 646
Prasanna Balaprakash, Karl Rupp, Azamat Mametjanov,
Robert B. Gramacy, Paul D. Hovland and Stefan M. Wild
Crowdtuning: Systematizing Auto-Tuning Using Predictive Modeling
and Crowdsourcing 656
Abdul Memon and Grigori Fursin
Autotuning the Energy Consumption 668
Carmen B. Navarrete, Carla Guillen, Wolfram Hesse and Matthias Brehm
Potentials and Limitations for Energy Efficiency Auto-Tuning 678
Robert Schöne, Andreas Knüpfer and Daniel Molka

Mini-Symposium “Extreme Scaling on SuperMUC”

Extreme Scaling Workshop at the LRZ 691
Momme Allalen, Gurvan Bazin, Christoph Bernau, Arndt Bode,
David Brayford, Matthias Brehm, Jürg Diemand, Klaus Dolag,
Jan Engels, Nicolay Hammer, Herbert Huber, Ferdinand Jamitzky,
Anupam Kamakar, Carsten Kutzner, Andreas Marek, Carmen Navarrete,
Helmut Satzger, Wolfram Schmidt and Philipp Trisjono
Extreme Scaling of Lattice Quantum Chromodynamics 698
David Brayford, Momme Allalen and Volker Weinberg
End-to-End Parallel Simulations with APES 703
Harald Klimach, Kartik Jain and Sabine Roller
Towards Petaflops Capability of the VERTEX Supernova Code 712
Andreas Marek, Markus Rampp, Florian Hanke and Hans-Thomas Janka
Scaling of the GROMACS 4.6 Molecular Dynamics Code on SuperMUC 722
Carsten Kutzner, Rossen Apostolov, Berk Hess and Helmut Grubmüller

Mini-Symposium “Parallel Programming for Heterogeneous Architectures”

Parallel Programming for Heterogeneous Architectures 731
Bettina Krammer, Hartmut Mix and Markus Geimer
Execution Schemes for the NPB-MZ Benchmarks on Hybrid Architectures:
A Comparative Study 733
Jörg Dümmler and Gudula Rünger
Scilab on a Hybrid Platform 743
Victor Lomüller, Sylvestre Ledru and Henri-Pierre Charles
Divide and Conquer Parallelization of Finite Element Method Assembly 753
Loïc Thébault, Eric Petit, Marc Tchiboukdjian, Quang Dinh
and William Jalby
Cudagrind: A Valgrind Extension for CUDA 763
Thomas M. Baumann and José Gracia
Profiling Hybrid HMPP Applications with Score-P on Heterogeneous Hardware 773
Marc Schlütter, Peter Philippen, Laurent Morin, Markus Geimer
and Bernd Mohr
Binary Instrumentation for Scalable Performance Measurement of OpenMP
Applications 783
Julien Jaeger, Peter Philippen, Eric Petit, Andres Charif Rubial,
Christian Rössel, William Jalby and Bernd Mohr
A Case Study: Holistic Performance Analysis on Heterogeneous Architectures
Using the Vampir Toolchain 793
Robert Dietrich, Frank Winkler, Thomas William, Jonas Stolle,
Robert Henschel and Donald K. Berry

Further Mini-Symposium Contributions

PRACE DECI (Distributed European Computing Initiative) Minisymposium 805
Chris Johnson, Anastasia V. Bochenkova, Alexander A. Granovsky,
Peter J. Bond, Teresa Paramo, Tristan Glatard, William A. Romero R.,
Denis Friboulet, Stefan J. Zasada and Peter V. Coveney
A Generic Prototype to Benchmark Algorithms and Data Structures
for Hierarchical Hybrid Grids 813
Sebastian Kuckuk, Björn Gmeiner, Harald Köstler and Ulrich Rüde
Towards a Performance Engineering Workflow for OpenMP 4.0 823
Dirk Schmidl, Christian Iwainsky, Christian Terboven, Christian H. Bischof
and Matthias S. Müller
Theoretical Measures of Cache Efficiency for Tetrahedral Adaptive Meshes.
A Case Study with a Quasi Space-Filling Curve Order 833
Oliver Kunst and Jörn Behrens

Author Index 843
