[gmx-users] why two GTX780ti is slower?

Szilárd Páll pall.szilard at gmail.com
Wed May 14 18:14:57 CEST 2014


MPI and thread-MPI are, as the error messages state, mutually
exclusive, hence passing -ntmpi to the MPI-enabled binary is simply
incorrect. I suggest that you read up on GROMACS parallelization:
www.gromacs.org/Documentation/Acceleration_and_parallelization
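
For clarity, the two builds are launched differently (a quick sketch,
reusing the 4.tpr file from your commands below):

  # real-MPI build: the number of ranks comes from the MPI launcher
  mpirun -np 2 mdrun_mpi -s 4.tpr

  # thread-MPI (non-MPI) build: the number of ranks comes from mdrun itself
  mdrun -ntmpi 2 -s 4.tpr

Passing -ntmpi to mdrun_mpi mixes the two models, which is exactly what
the fatal error below complains about.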

Secondly, those MPI messages may mean that something is wrong with
your MPI installation.
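
The getaddrinfo failures below suggest that the node cannot resolve its
own hostname, which the Hydra process manager needs before it even
starts mdrun. A quick check, independent of GROMACS (the address in the
/etc/hosts line is only a placeholder):

  mpirun -np 2 hostname          # should print the hostname twice, with no HYDU errors
  getent hosts cudaA.compuBio    # should return an address

If the lookup fails, adding a line such as

  192.168.0.10   cudaA.compuBio cudaA

to /etc/hosts (with the node's real IP) usually fixes it.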

Finally, you have still not shown the *log files* of those runs that
you are comparing, so there is no way of telling whether the decreased
performance is reasonable or not and what the cause is.

You may want to tweak the launch configuration a bit, e.g.:
mdrun -ntmpi 4 -ntomp 5 -gpu_id 0011
mdrun -ntmpi 10 -ntomp 2 -gpu_id 0000011111

Alternatively, you can simply run two independent simulations per node;
that way you'll surely get linear scaling of the aggregate throughput!
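
Something along these lines should work for the two-runs-per-node setup
(a sketch only; the run1.tpr/run2.tpr names are placeholders, and the
-ntomp/-pinoffset values assume the 20 hardware threads your log reports):

  mdrun -s run1.tpr -ntomp 10 -gpu_id 0 -pin on -pinoffset 0  &
  mdrun -s run2.tpr -ntomp 10 -gpu_id 1 -pin on -pinoffset 10 &

Each run then gets one GPU and half of the cores, and the two runs do
not compete for the same hardware threads.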

Cheers,
--
Szilárd


On Wed, May 14, 2014 at 5:21 PM, Albert <mailmd2011 at gmail.com> wrote:
> Hi Szilárd:
>
> thx for the comments.
>
> The log file in my previous message comes from this command line:
>
> tmp/input> mpirun -np 2 mdrun_mpi -s 4.tpr
>
> [mpiexec at cudaA.compuBio] HYDU_getfullhostname
> (./utils/others/others.c:136): getaddrinfo error (hostname: cudaA.compuBio,
> error: Name or service not known)
> [mpiexec at cudaA.compuBio] HYDU_sock_create_and_listen_portstr
> (./utils/sock/sock.c:999): unable to get local hostname
> [mpiexec at cudaA.compuBio] HYD_pmci_launch_procs
> (./pm/pmiserv/pmiserv_pmci.c:313): unable to create PMI port
> [mpiexec at cudaA.compuBio] main (./ui/mpich/mpiexec.c:877): process manager
> returned error launching processes
>
>
> If I add -ntmpi to the command line, it fails with the same errors:
>
> tmp/input> mpirun -np 2 mdrun_mpi -s 4.tpr -ntmpi 2
>
> [mpiexec at cudaA.compuBio] HYDU_getfullhostname
> (./utils/others/others.c:136): getaddrinfo error (hostname: cudaA.compuBio,
> error: Name or service not known)
> [mpiexec at cudaA.compuBio] HYDU_sock_create_and_listen_portstr
> (./utils/sock/sock.c:999): unable to get local hostname
> [mpiexec at cudaA.compuBio] HYD_pmci_launch_procs
> (./pm/pmiserv/pmiserv_pmci.c:313): unable to create PMI port
> [mpiexec at cudaA.compuBio] main (./ui/mpich/mpiexec.c:877): process manager
> returned error launching processes
>
>
> Here are some more attempts:
>
> tmp/input> mdrun_mpi -s 4.tpr -ntmpi 2
>
> Program mdrun_mpi, VERSION 4.6.5
> Source code file:
> /home/albert/software/gromacs/gromacs-4.6.5/src/kernel/runner.c, line: 798
> Fatal error:
> Setting the number of thread-MPI threads is only supported with thread-MPI
> and Gromacs was compiled without thread-MPI
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
>
>
> I also compiled a thread-MPI version without real MPI, and it works, but the
> two-GPU thread-MPI run is much slower than the MPI run on a single GPU:
> 30 ns/day vs 40 ns/day.
>
> tmp/input> mdrun -s 4.tpr -ntmpi 2
>
>
> Using 2 MPI threads
> Using 10 OpenMP threads per tMPI thread
> 2 GPUs detected:
>   #0: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC:  no, stat: compatible
>   #1: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC:  no, stat: compatible
> 2 GPUs auto-selected for this run.
> Mapping of GPUs to the 2 PP ranks in this node: #0, #1
>
>
>
>
>
> 2014-05-14 15:34 GMT+02:00 Szilárd Páll <pall.szilard at gmail.com>:
>
>> This just tells you that two GPUs were detected but only the first one
>> was automatically selected for use, presumably because you manually
>> specified the number of ranks (-np or -ntmpi) to be one.
>>
>> However, your mail contains neither the command line you started mdrun
>> with, nor (a link to) the log file mdrun produces.
>>
>> --
>> Szilárd
>>
>>
>> On Wed, May 14, 2014 at 2:43 PM, Albert <mailmd2011 at gmail.com> wrote:
>> > Hi Mark:
>> >
>> > thanks a lot for the messages. In most cases, the job failed with the
>> > following messages when I used Intel MPI to invoke two GPUs:
>> >
>> > [mpiexec at cudaA.compuBio] HYDU_getfullhostname
>> > (./utils/others/others.c:136): getaddrinfo error (hostname: cudaA.compuBio,
>> > error: Name or service not known)
>> > [mpiexec at cudaA.compuBio] HYDU_sock_create_and_listen_portstr
>> > (./utils/sock/sock.c:999): unable to get local hostname
>> > [mpiexec at cudaA.compuBio] HYD_pmci_launch_procs
>> > (./pm/pmiserv/pmiserv_pmci.c:313): unable to create PMI port
>> > [mpiexec at cudaA.compuBio] main (./ui/mpich/mpiexec.c:877): process manager
>> > returned error launching processes
>> >
>> >
>> > For the one-GPU job, here is the information:
>> >
>> > 2 GPUs detected on host cudaA.compuBio:
>> >   #0: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC:  no, stat: compatible
>> >   #1: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC:  no, stat: compatible
>> >
>> > 1 GPU auto-selected for this run.
>> > Mapping of GPU to the 1 PP rank in this node: #0
>> >
>> >
>> > NOTE: potentially sub-optimal launch configuration, mdrun_mpi started with less
>> >       PP MPI process per node than GPUs available.
>> >       Each PP MPI process can use only one GPU, 1 GPU per node will be used.
>> >
>> > thank you very much.
>> >
>> >
>> > 2014-05-14 9:11 GMT+02:00 Mark Abraham <mark.j.abraham at gmail.com>:
>> >
>> >> Hi,
>> >>
>> >> Nobody can tell unless you upload the .log files and mdrun command lines
>> >> somewhere.
>> >>
>> >> Mark
>> >>
>> >>
>> >> On Wed, May 14, 2014 at 7:03 AM, Albert <mailmd2011 at gmail.com> wrote:
>> >>
>> >> > Hello:
>> >> >
>> >> > I've compiled GROMACS on a GPU machine with two GTX 780 Ti cards using the
>> >> > following command:
>> >> >
>> >> > env CC=icc F77=ifort CXX=icpc \
>> >> >   CMAKE_PREFIX_PATH=/soft/intel/mkl/include/fftw:/soft/intel/impi/4.1.3.049/intel64:/soft/intel/mkl/lib/intel64 \
>> >> >   cmake .. -DBUILD_SHARED_LIB=OFF -DBUILD_TESTING=OFF \
>> >> >   -DCMAKE_INSTALL_PREFIX=/soft/gromacs-5.0rc1 \
>> >> >   -DGMX_MPI=ON -DGMX_GPU=ON -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda
>> >> >
>> >> >
>> >> > I get 40 ns/day for a 65,000-atom membrane system with the CHARMM36 FF
>> >> > when one GPU is used, but only 27 ns/day when I use two GPUs. On another
>> >> > machine with two GTX 690s, I used ICC + OpenMPI instead of ICC + Intel MPI;
>> >> > for the same system, I obtained 40 ns/day with one GPU and 60 ns/day with
>> >> > two GPUs.
>> >> >
>> >> > I am just wondering why it is even slower with two GTX 780 Ti cards for my
>> >> > system, when this doesn't happen with the GTX 690s.
>> >> >
>> >> > thx a lot
>> >> >
>> >> > Albert