[gmx-users] Why are two GTX 780 Ti slower?

Albert mailmd2011 at gmail.com
Wed May 14 17:21:23 CEST 2014


Hi Szilárd:

Thanks for the comments.

The log file in my previous email comes from this command line:

tmp/input> mpirun -np 2 mdrun_mpi -s 4.tpr

[mpiexec at cudaA.compuBio] HYDU_getfullhostname
(./utils/others/others.c:136): getaddrinfo error (hostname: cudaA.compuBio,
error: Name or service not known)
[mpiexec at cudaA.compuBio] HYDU_sock_create_and_listen_portstr
(./utils/sock/sock.c:999): unable to get local hostname
[mpiexec at cudaA.compuBio] HYD_pmci_launch_procs
(./pm/pmiserv/pmiserv_pmci.c:313): unable to create PMI port
[mpiexec at cudaA.compuBio] main (./ui/mpich/mpiexec.c:877): process manager
returned error launching processes


If I add -ntmpi to the command line, it fails with the same errors:

tmp/input> mpirun -np 2 mdrun_mpi -s 4.tpr -ntmpi 2

[mpiexec at cudaA.compuBio] HYDU_getfullhostname
(./utils/others/others.c:136): getaddrinfo error (hostname: cudaA.compuBio,
error: Name or service not known)
[mpiexec at cudaA.compuBio] HYDU_sock_create_and_listen_portstr
(./utils/sock/sock.c:999): unable to get local hostname
[mpiexec at cudaA.compuBio] HYD_pmci_launch_procs
(./pm/pmiserv/pmiserv_pmci.c:313): unable to create PMI port
[mpiexec at cudaA.compuBio] main (./ui/mpich/mpiexec.c:877): process manager
returned error launching processes
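
By the way, I suspect the Hydra errors above are just a hostname resolution
problem on this box: cudaA.compuBio does not resolve to an IP address. If I
understand it correctly, something along these lines should fix it (the
address below is only a placeholder; I would first check the real one with
"ip addr"):

tmp/input> getent hosts cudaA.compuBio       # empty output = name does not resolve
tmp/input> echo "192.168.0.10  cudaA.compuBio cudaA" | sudo tee -a /etc/hosts
tmp/input> mpirun -np 2 mdrun_mpi -s 4.tpr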


Here are some more attempts:

tmp/input> mdrun_mpi -s 4.tpr -ntmpi 2

Program mdrun_mpi, VERSION 4.6.5
Source code file:
/home/albert/software/gromacs/gromacs-4.6.5/src/kernel/runner.c, line: 798
Fatal error:
Setting the number of thread-MPI threads is only supported with thread-MPI
and Gromacs was compiled without thread-MPI
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
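
So if I read that error correctly, -ntmpi only applies to a thread-MPI build,
and with the real-MPI build the number of ranks has to come from mpirun
instead. My understanding of the two launch styles (please correct me if this
is wrong):

# real-MPI build: the rank count comes from mpirun, no -ntmpi
tmp/input> mpirun -np 2 mdrun_mpi -s 4.tpr
# thread-MPI (non-MPI) build: ranks are thread-MPI threads
tmp/input> mdrun -s 4.tpr -ntmpi 2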


I also compiled a thread-MPI version (without real MPI), and that one runs,
but the two-GPU run is much slower than the MPI build on a single GPU:
30 ns/day vs. 40 ns/day.

tmp/input> mdrun -s 4.tpr -ntmpi 2


Using 2 MPI threads
Using 10 OpenMP threads per tMPI thread
2 GPUs detected:
  #0: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC:  no, stat: compatible
  #1: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC:  no, stat: compatible
2 GPUs auto-selected for this run.
Mapping of GPUs to the 2 PP ranks in this node: #0, #1
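
I suppose I should also try an explicit launch configuration with core pinning
and explicit GPU mapping, something along these lines (I am not sure these
thread counts are ideal for this box; they just match the 2 ranks x 10 OpenMP
threads that mdrun picked automatically):

tmp/input> mdrun -s 4.tpr -ntmpi 2 -ntomp 10 -gpu_id 01 -pin on
tmp/input> mdrun -s 4.tpr -ntmpi 1 -ntomp 10 -gpu_id 0 -pin on    # single-GPU comparison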





2014-05-14 15:34 GMT+02:00 Szilárd Páll <pall.szilard at gmail.com>:

> This just tells you that two GPUs were detected but only the first one was
> automatically selected for use - presumably because you manually specified
> the number of ranks (-np or -ntmpi) to be one.
>
> However, your mail contains neither the command line you started mdrun
> with, nor (a link to) the log file mdrun produces.
>
> --
> Szilárd
>
>
> On Wed, May 14, 2014 at 2:43 PM, Albert <mailmd2011 at gmail.com> wrote:
> > Hi Mark:
> >
> > Thanks a lot for the messages. In most cases, the job fails with the
> > following messages when I use Intel MPI to launch on two GPUs:
> >
> > [mpiexec at cudaA.compuBio] HYDU_getfullhostname
> > (./utils/others/others.c:136): getaddrinfo error (hostname: cudaA.compuBio,
> > error: Name or service not known)
> > [mpiexec at cudaA.compuBio] HYDU_sock_create_and_listen_portstr
> > (./utils/sock/sock.c:999): unable to get local hostname
> > [mpiexec at cudaA.compuBio] HYD_pmci_launch_procs
> > (./pm/pmiserv/pmiserv_pmci.c:313): unable to create PMI port
> > [mpiexec at cudaA.compuBio] main (./ui/mpich/mpiexec.c:877): process manager
> > returned error launching processes
> >
> >
> > For a one-GPU job, here is the information:
> >
> > 2 GPUs detected on host cudaA.compuBio:
> >   #0: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC:  no, stat: compatible
> >   #1: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC:  no, stat: compatible
> >
> > 1 GPU auto-selected for this run.
> > Mapping of GPU to the 1 PP rank in this node: #0
> >
> >
> > NOTE: potentially sub-optimal launch configuration, mdrun_mpi started with
> >       less PP MPI process per node than GPUs available.
> >       Each PP MPI process can use only one GPU, 1 GPU per node will be used.
> >
> > Thank you very much.
> >
> >
> > 2014-05-14 9:11 GMT+02:00 Mark Abraham <mark.j.abraham at gmail.com>:
> >
> >> Hi,
> >>
> >> Nobody can tell unless you upload .log files and mdrun command lines
> >> somewhere
> >>
> >> Mark
> >>
> >>
> >> On Wed, May 14, 2014 at 7:03 AM, Albert <mailmd2011 at gmail.com> wrote:
> >>
> >> > Hello:
> >> >
> >> > I've compiled GROMACS on a GPU machine with two GTX 780 Ti cards using
> >> > the following command:
> >> >
> >> > env CC=icc F77=ifort CXX=icpc \
> >> >     CMAKE_PREFIX_PATH=/soft/intel/mkl/include/fftw:/soft/intel/impi/4.1.3.049/intel64:/soft/intel/mkl/lib/intel64 \
> >> >     cmake .. -DBUILD_SHARED_LIBS=OFF -DBUILD_TESTING=OFF \
> >> >     -DCMAKE_INSTALL_PREFIX=/soft/gromacs-5.0rc1 -DGMX_MPI=ON -DGMX_GPU=ON \
> >> >     -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda
> >> >
> >> >
> >> > I get 40 ns/day for a 65,000-atom membrane system with the CHARMM36 force
> >> > field when one GPU is used, but only 27 ns/day if I use two GPUs. On
> >> > another machine with two GTX 690s I used ICC+OpenMPI instead of ICC+Intel
> >> > MPI; for the same system I obtained 40 ns/day on one GPU and 60 ns/day on
> >> > two GPUs.
> >> >
> >> > I am just wondering why it is even slower with two GTX 780 Ti cards for my
> >> > system, when this does not happen with the GTX 690?
> >> >
> >> > Thanks a lot
> >> >
> >> > Albert

