[Pw_forum] First PWscf GPU-enabled beta-release
Ivan Girotto
ivan.girotto at ichec.ie
Thu May 5 18:11:27 CEST 2011
Dear QE users & developers,
We are happy to announce that the first GPU-enabled beta release of
Quantum ESPRESSO (QE) was committed today to the official repository.
You can download the new version of the code using the following command:
$ svn checkout svn://scm.qe-forge.org/scmrepos/svn/q-e/branches/espresso-PRACE
The Irish Centre for High-End Computing (ICHEC, http://www.ichec.ie)
has been mainly responsible for extending the QE
suite to enhance calculations on NVIDIA GPUs. The porting activity has
been supported within the PRACE 1st Implementation Phase project. It is
currently carried out through the Sub-task "Accelerator", led by ICHEC,
within the Work-Package "Programming Techniques for High-Performance
Applications" in collaboration with CINECA.
The porting activity mainly concerns the PWscf package. However,
ICHEC and the Irish QE user community are interested in exploring any
other initiatives from QE users or developers who would like to port
other codes of the QE suite to GPGPU architectures.
We have successfully accelerated the linear algebra part of the QE suite
using a library called phiGEMM, some explicit computational kernels
(newd, addusdense, vloc_psi) and the 3D FFT for the single CPU/GPU
version. Both the accelerated linear algebra (matrix multiplication) and
the accelerated FFT make use of CUDA libraries. The porting is mainly
based on wrappers that permit the use of libraries for accelerators. The
distributed 3D FFT version is still in progress, since porting it
requires significant changes to the current structure of the code and to
the data distribution. For now, the parallel and distributed multi-GPU
version still uses the original 3D FFT implementation.
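To illustrate why the distributed 3D FFT depends on the data distribution, here is a minimal Python sketch. It is not the QE FFT code, and the function names are hypothetical. It shows that a 3D discrete Fourier transform can be computed as 1D transforms along z, then y, then x. Each 1D transform needs a complete line of the grid in one place, so when the grid is spread over several processes the data generally has to be redistributed between passes.

```python
import cmath


def dft1d(x):
    """Naive 1D discrete Fourier transform (illustrative, O(n^2))."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def dft3d_by_axes(v):
    """3D DFT of v[x][y][z] built from 1D transforms along z, then y, then x."""
    nx, ny, nz = len(v), len(v[0]), len(v[0][0])
    # Pass 1: transform every line along z.
    out = [[dft1d(v[x][y]) for y in range(ny)] for x in range(nx)]
    # Pass 2: transform every line along y.
    for x in range(nx):
        for z in range(nz):
            line = dft1d([out[x][y][z] for y in range(ny)])
            for y in range(ny):
                out[x][y][z] = line[y]
    # Pass 3: transform every line along x.
    for y in range(ny):
        for z in range(nz):
            line = dft1d([out[x][y][z] for x in range(nx)])
            for x in range(nx):
                out[x][y][z] = line[x]
    return out


def dft3d_direct(v):
    """Reference: the triple-sum definition of the 3D DFT."""
    nx, ny, nz = len(v), len(v[0]), len(v[0][0])

    def w(j, k, n):
        return cmath.exp(-2j * cmath.pi * j * k / n)

    return [[[sum(v[a][b][c] * w(a, kx, nx) * w(b, ky, ny) * w(c, kz, nz)
                  for a in range(nx) for b in range(ny) for c in range(nz))
              for kz in range(nz)]
             for ky in range(ny)]
            for kx in range(nx)]
```

On a small grid, dft3d_by_axes and dft3d_direct agree to rounding error. In a single process every line is available locally, which is why the single CPU/GPU case can hand the whole transform to a library.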
The phiGEMM library is distributed as an independent open-source
external package together with the Quantum ESPRESSO suite. It aims to
perform matrix multiplication ([SDZ]GEMM) taking advantage of the
underlying BLAS kernel functions on both CPU and NVIDIA CUDA-based GPU,
mixing and overlapping computation between the host (CPU) and the
accelerator (GPU). Any code that makes intensive use of GEMM can
gain a significant advantage by linking against this library when
running on a hybrid CPU/GPU system.
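The idea of sharing one GEMM between host and accelerator can be sketched as follows. This is only an illustration in plain Python, not the phiGEMM implementation or its API: two threads stand in for the CPU and the GPU, the columns of B are split between them, and the two partial products are joined at the end. The names split_gemm and gpu_fraction are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul(a, b):
    """Naive reference GEMM: a x b for matrices stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def split_gemm(a, b, gpu_fraction=0.5):
    """Compute a x b by splitting the columns of b between two workers.

    gpu_fraction is a hypothetical tuning knob: the share of the columns
    handed to the worker that stands in for the GPU.
    """
    cols = len(b[0])
    cut = max(0, min(cols, int(cols * gpu_fraction)))
    b_gpu = [row[:cut] for row in b]  # columns for the "GPU" worker
    b_cpu = [row[cut:] for row in b]  # columns for the "CPU" worker

    # Both partial products are submitted before either result is awaited,
    # so the two workers can overlap.
    with ThreadPoolExecutor(max_workers=2) as pool:
        gpu_part = pool.submit(matmul, a, b_gpu)
        cpu_part = pool.submit(matmul, a, b_cpu)
        left, right = gpu_part.result(), cpu_part.result()

    # Join the column blocks row by row to rebuild the full product.
    return [l + r for l, r in zip(left, right)]
```

For any split, split_gemm(a, b, f) returns the same matrix as matmul(a, b). In a real hybrid setup the split ratio would presumably be tuned so that both sides finish at about the same time.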
Although the 3D FFT is accelerated only for a single CPU process (not
when using MPI), other parts can already take advantage of a
distributed parallel hybrid system. Together, this allows PWscf to
potentially combine distributed message-passing parallelization (MPI),
multi-threading (OpenMP) and accelerators (NVIDIA GPUs), and to achieve
a good performance improvement on the latest generation of NVIDIA GPUs
(high-performance double precision is needed). This porting
activity is still in progress, especially the parallel 3D FFT component,
which represents a bottleneck for large calculations. We have been
testing this beta release with some small/medium benchmarks from the
official DEISA benchmark suite on several kinds of GPU hardware (Tesla
and Fermi architectures). Special thanks go to both E4 Computer
Engineering and CEA for providing access to hybrid GPU systems with
configurations different from those available at ICHEC.
We look forward to receiving any suggestions for improvement, feedback
or requests for collaboration from anyone who is interested in trying
and validating the PWscf CUDA version on different platforms using
different scientific cases. For additional information please
contact qe-gpu at ichec.ie or reply to this mail. A dedicated forum,
q-e-gpgpu at qe-forge.org <http://qe-forge.org/mail/?group_id=10>, will
be available shortly. Please use this list for bug reports and any
other issues related to the use of the PWscf GPU version.
Although compilation of the GPU implementation is fairly
straightforward, we kindly suggest that users carefully read the
README.GPU that is included. The intrinsic characteristics of hybrid
multi- and many-core systems require careful consideration to best
exploit the available computing power.
Any and all suggestions are welcome and will be very much appreciated.
Ivan Girotto & Filippo Spiga
---
ICHEC GPU developer team
The Tower - 7th floor
Trinity Technology & Enterprise Campus
Grand Canal Quay - Dublin 2 - Ireland
+353-1-5241608 (ph) / +353-1-7645845 (fax)