[gmx-users] How to initiate parallel run on GPU cluster

Venkat Reddy venkat4bt at gmail.com
Thu Apr 7 14:35:45 CEST 2016


Thank you, Mark, for the quick response.
I tried changing the -np option to 4, but it seems that mdrun is using only
one GPU on a single node with four ranks. The nvidia-smi output shows:

|    0      7977    C   /Apps/gromacs512/bin/gmx_mpi                   130MiB |
|    0      7978    C   /Apps/gromacs512/bin/gmx_mpi                   130MiB |
|    0      7979    C   /Apps/gromacs512/bin/gmx_mpi                   130MiB |
|    0      7980    C   /Apps/gromacs512/bin/gmx_mpi                   130MiB |

Also, the job folder has four backed-up copies of the same run.
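
Based on your note, the launch line I am planning to try next is sketched
below. This is only a guess on my part: I am assuming mpiexec.hydra's -ppn
option is the right way to place two ranks on each node, and that mdrun's
-gpu_id option will map the two ranks on a node onto GPUs 0 and 1:

# 4 ranks in total, 2 per node -> one PP rank per GPU on each node
mpiexec.hydra -np 4 -ppn 2 -hostfile $PBS_NODEFILE \
    /Apps/gromacs512/bin/gmx_mpi mdrun -v -dlb yes -ntomp 8 -gpu_id 01 \
    -s equilibration3.tpr

(16 cores per node / 2 ranks per node = 8 OpenMP threads per rank.) Does this
look right, or am I still missing something in how the ranks are placed?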

On Thu, Apr 7, 2016 at 5:07 PM, Mark Abraham <mark.j.abraham at gmail.com>
wrote:

> Hi,
>
> mpiexec.hydra -np 1 asks for a single MPI rank, which is what you got. But
> you need at least two, i.e. at least one on each node, and at least four if
> you want to make use of the two GPUs on each of the two nodes.
>
> Mark
>
>
> On Thu, Apr 7, 2016 at 1:14 PM Venkat Reddy <venkat4bt at gmail.com> wrote:
>
> > Dear all,
> >
> > Please neglect my previous mail which was incomplete.
> >
> > I am trying to execute mdrun on our GPU cluster with 7 nodes, where each
> > node has 16 processors and two K40 GPU cards. I have no problem with
> > mdrun on a single node. However, when I try to execute a parallel run on
> > two nodes with the gmx_mpi executable (gromacs-5.1.2), the performance is
> > very slow. When I logged into the individual nodes, I found that mdrun is
> > not utilizing both GPUs. The generated log file shows the following message.
> >
> > Using 1 MPI process
> > Using 16 OpenMP threads
> >
> > 2 compatible GPUs are present, with IDs 0,1
> > 1 GPU auto-selected for this run.
> > Mapping of GPU ID to the 1 PP rank in this node: 0
> >
> >
> > NOTE: potentially sub-optimal launch configuration, gmx_mpi started with less
> >       PP MPI process per node than GPUs available.
> >       Each PP MPI process can use only one GPU, 1 GPU per node will be used.
> >
> > I read the manual and the instructions at
> > http://manual.gromacs.org/documentation/5.1/user-guide/mdrun-performance.html
> > to set up the parallel run, but I couldn't find the right flags to initiate
> > it. Please help me with this. The script I used to execute the parallel
> > run is given below.
> >
> > #! /bin/bash
> > #PBS -l cput=5000:00:00
> > #PBS -l select=2:ncpus=16:ngpus=2
> > #PBS -e errorfile.err
> > #PBS -o logfile.log
> > tpdir=`echo $PBS_JOBID | cut -f 1 -d .`
> > tempdir=$HOME/work/job$tpdir
> > mkdir -p $tempdir
> > cd $tempdir
> > cp -R $PBS_O_WORKDIR/* .
> > mpiexec.hydra -np 1 -hostfile $PBS_NODEFILE /Apps/gromacs512/bin/gmx_mpi
> > mdrun -v -dlb yes  -ntomp 16 -s equilibration3.tpr
> >
> >
> >
> > --
> > With Best Wishes
> > Venkat Reddy Chirasani
> > PhD student
> > Laboratory of Computational Biophysics
> > Department of Biotechnology
> > IIT Madras
> > Chennai
> > INDIA-600036



-- 
With Best Wishes
Venkat Reddy Chirasani
PhD student
Laboratory of Computational Biophysics
Department of Biotechnology
IIT Madras
Chennai
INDIA-600036

