[gmx-users] 3 GPUs much faster than 2 GPUs with GROMACS-4.6.2 ???
David van der Spoel
spoel at xray.bmc.uu.se
Mon Dec 9 18:42:03 CET 2013
On 2013-12-09 18:27, yunshi11 . wrote:
> Hi all,
> I have a physical compute node with 2x 6-core Intel E5649 processors +
> three NVIDIA Tesla M2070 GPUs.
> First I tried using all 12 CPU cores + 3 GPUs for an equilibration run (of
> protein in TIP3 waters), which gave me 8.964 ns/day performance.
> But I noticed the PME mesh calculation, which I assume is done on CPU
> cores/OpenMP threads, has taken up 62% of the Wall t/G cycle. It seems that
> the CPU cores/OpenMP threads have too much work to do and the GPUs have to
> PME mesh 3 4 50001 597.925 18177.760 62.0
> Thus, I tried running with all 12 CPU cores + 2 GPUs, which is more natural
> to me since the 6 cores of each Intel E5649 processor are tied to 1 GPU,
> making 6 CPU cores/OpenMP threads per MPI process. However, this resulted
> in a performance of only 5.694 ns/day, less than 2/3 of the previous run.
> Yet, the PME mesh calculation took 55.6% of the Wall t/G cycle, NOT very
> different from the previous run.
> PME mesh 2 6 50001 843.186 25633.615 55.6
> Does anyone know why this is the case? Why would different numbers of GPUs
> affect the calculation of the PME mesh?
And did you try 6 cores + 1 GPU as well?
And 12 cores without GPU?
Your system may be too small to scale to so much processing power.
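A quick back-of-the-envelope check of the numbers quoted above may help when comparing the runs. This is only a sketch: it assumes the last column of each "PME mesh" log row is the percentage of total wall time, and the names (runs, implied_total_wall, and so on) are made up for illustration.

```python
# Figures copied from the ns/day values and "PME mesh" log rows quoted above.
runs = {
    "3 GPUs": {"ns_per_day": 8.964, "pme_wall_s": 597.925, "pme_pct": 62.0},
    "2 GPUs": {"ns_per_day": 5.694, "pme_wall_s": 843.186, "pme_pct": 55.6},
}


def implied_total_wall(run):
    """Total wall time implied by the PME row.

    Assumption: the percentage column is the share of total wall time.
    """
    return run["pme_wall_s"] / (run["pme_pct"] / 100.0)


# Throughput of the 2-GPU run relative to the 3-GPU run.
speed_ratio = runs["2 GPUs"]["ns_per_day"] / runs["3 GPUs"]["ns_per_day"]

# Absolute PME mesh wall time of the 2-GPU run relative to the 3-GPU run.
pme_ratio = runs["2 GPUs"]["pme_wall_s"] / runs["3 GPUs"]["pme_wall_s"]

total_3 = implied_total_wall(runs["3 GPUs"])
total_2 = implied_total_wall(runs["2 GPUs"])

print(f"2-GPU / 3-GPU throughput:    {speed_ratio:.3f}")
print(f"2-GPU / 3-GPU PME wall time: {pme_ratio:.3f}")
print(f"implied total wall time (s): {total_3:.1f} vs {total_2:.1f}")
```

So while the PME percentage is similar in both runs, the absolute PME mesh wall time is noticeably longer in the 2-GPU run, even though both runs use the same 12 cores. The extra data points asked for above would help show where that time goes.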
David van der Spoel, Ph.D., Professor of Biology
Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone: +46184714205.
spoel at xray.bmc.uu.se http://folding.bmc.uu.se