[gmx-users] 3 GPUs much faster than 2 GPUs with GROMACS-4.6.2 ???
David van der Spoel
spoel at xray.bmc.uu.se
Mon Dec 9 18:42:03 CET 2013
On 2013-12-09 18:27, yunshi11 . wrote:
> Hi all,
>
> I have a physical compute node with 2x 6-core Intel E5649 processors +
> three NVIDIA Tesla M2070 GPUs.
>
> First I tried using all 12 CPU cores + 3 GPUs for an equilibration run (of
> protein in TIP3 waters), which gave me 8.964 ns/day performance.
>
> But I noticed the PME mesh calculation, which I assume is done on CPU
> cores/OpenMP threads, has taken up 62% of the Wall t/G cycle. It seems that
> the CPU cores/OpenMP threads have too much work to do and the GPUs have to
> wait?
>
> Computing:   Nodes  Th.  Count  Wall t (s)   G-Cycles     %
> PME mesh         3    4  50001     597.925  18177.760  62.0
>
> Thus, I tried running with all 12 CPU cores + 2 GPUs, which is more natural
> to me since the 6 cores of each Intel E5649 processor are tied to 1 GPU,
> making 6 CPU cores/OpenMP threads per MPI process. However, this resulted
> in a performance of only 5.694 ns/day, less than 2/3 of the previous run.
> Yet, the PME mesh calculation took 55.6% of the Wall t/G cycle, NOT very
> different from the previous run.
>
> Computing:   Nodes  Th.  Count  Wall t (s)   G-Cycles     %
> PME mesh         2    6  50001     843.186  25633.615  55.6
>
> Does anyone know why this is the case? Why would different numbers of GPUs
> affect the calculation of the PME mesh?
>
And did you try 6 cores + 1 GPU as well?
And 12 cores without GPU?
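One way to line up these test runs, as a rough sketch only: the small Python helper below prints one mdrun command line per configuration. It assumes a thread-MPI build of GROMACS 4.6 (hence -ntmpi/-ntomp), GPU ids numbered 0, 1, 2, and a hypothetical run name "equil"; the rank/thread split for the CPU-only run is also just a guess. Adjust all of these to the actual setup.

```python
import shlex


def mdrun_command(ranks, threads_per_rank, gpu_ids=None, deffnm="equil"):
    """Build an mdrun argument list for one benchmark configuration.

    gpu_ids is a string with one digit per rank (e.g. "012"), or None
    for a CPU-only run. "equil" is a hypothetical run name.
    """
    cmd = ["mdrun", "-deffnm", deffnm,
           "-ntmpi", str(ranks), "-ntomp", str(threads_per_rank)]
    if gpu_ids is None:
        # Force the non-bonded calculation onto the CPU.
        cmd += ["-nb", "cpu"]
    else:
        if len(gpu_ids) != ranks:
            raise ValueError("need exactly one GPU id per rank")
        cmd += ["-gpu_id", gpu_ids]
    return cmd


# The two runs already done, plus the two suggested ones.
CONFIGS = [
    ("12 cores + 3 GPUs", dict(ranks=3, threads_per_rank=4, gpu_ids="012")),
    ("12 cores + 2 GPUs", dict(ranks=2, threads_per_rank=6, gpu_ids="01")),
    ("6 cores + 1 GPU", dict(ranks=1, threads_per_rank=6, gpu_ids="0")),
    ("12 cores, no GPU", dict(ranks=2, threads_per_rank=6, gpu_ids=None)),
]

for label, kwargs in CONFIGS:
    print(f"{label}: {shlex.join(mdrun_command(**kwargs))}")
```

Comparing ns/day across the four runs should show whether the system scales at all beyond one GPU.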
Your system may be too small to scale to so much processing power.
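For reference, the figures quoted above can be put side by side. This is a minimal sketch that uses only the numbers from the two log lines and the reported ns/day. It assumes that the two integer columns are ranks and OpenMP threads per rank, and that the last column is the percentage of total wall time, so the total run time can be estimated from it.

```python
# Figures copied from the quoted log lines and the reported performance.
runs = {
    "3 GPUs": {"ranks": 3, "threads": 4, "pme_wall_s": 597.925,
               "pme_pct": 62.0, "ns_per_day": 8.964},
    "2 GPUs": {"ranks": 2, "threads": 6, "pme_wall_s": 843.186,
               "pme_pct": 55.6, "ns_per_day": 5.694},
}


def derived(run):
    """Estimate total and non-PME wall time from the PME percentage."""
    total = run["pme_wall_s"] / (run["pme_pct"] / 100.0)
    return {
        "cores": run["ranks"] * run["threads"],
        "total_wall_s": total,
        "other_wall_s": total - run["pme_wall_s"],
    }


three = derived(runs["3 GPUs"])
two = derived(runs["2 GPUs"])

# Ratios of the 2-GPU run relative to the 3-GPU run.
speed_ratio = runs["2 GPUs"]["ns_per_day"] / runs["3 GPUs"]["ns_per_day"]
pme_ratio = runs["2 GPUs"]["pme_wall_s"] / runs["3 GPUs"]["pme_wall_s"]
other_ratio = two["other_wall_s"] / three["other_wall_s"]

print(f"cores used: {three['cores']} vs {two['cores']}")
print(f"ns/day ratio (2 GPUs / 3 GPUs): {speed_ratio:.3f}")
print(f"PME mesh wall-time ratio:       {pme_ratio:.3f}")
print(f"non-PME wall-time ratio:        {other_ratio:.3f}")
```

Under these assumptions both runs use the same 12 cores, yet the PME mesh wall time and the remaining wall time are both longer in the 2-GPU run, so the similar PME percentage hides a slowdown in both parts.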
> Regards,
> Yun
>
--
David van der Spoel, Ph.D., Professor of Biology
Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone: +46184714205.
spoel at xray.bmc.uu.se http://folding.bmc.uu.se