[gmx-users] problem with mpi run in REMD simulation

Mark Abraham mark.j.abraham at gmail.com
Mon Feb 10 11:32:41 CET 2020


Hi,

On Mon, 10 Feb 2020 at 04:34, Mohammad Madani <mohammad.madani at uconn.edu>
wrote:

> Dear Mark
> Thank you for your reply.
> Could you please tell me whether using 47 nodes, each with 48 cores, for
> 376 replicas is good or not?
> That is 6 cores per replica, with all cores used by a given replica on
> the same node.
>

Yes, 4 or 6 cores per replica is a much better choice, because both are
divisors of 48.
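
For example, a sketch of what that could look like for your case (the
partition name, module, and directory layout here are illustrative; check
the Stampede2 documentation for the exact details):

#!/bin/bash
#SBATCH -p skx-normal          # SKX partition; 48 cores per node
#SBATCH -N 47                  # 376 replicas x 6 cores = 2256 cores
#SBATCH -n 376                 # one MPI rank per replica
#SBATCH --cpus-per-task=6      # 6 cores per replica, 8 replicas per node
#SBATCH -t 48:00:00

module load gromacs/2019.4     # whichever MPI-enabled build you have verified
export OMP_NUM_THREADS=6
ibrun mdrun_mpi -ntomp 6 -replex 500 -multidir equil{0..375}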

Whether 376 replicas is a wise choice is a further question. Assuming the
usual case where you're only interested in the sampling at the bottom T, I
strongly suggest starting with a short preliminary simulation with many
fewer replicas (e.g. 10) over a narrower T range and observing how long you
have to simulate before even one replica reaches the top T and then returns
to the bottom T. That time will be much, much longer for 376 replicas. My
guess is that you won't want to spend that much computer time :-)
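
One way to see whether any replica completes such a round trip in the
preliminary run is to demultiplex the exchange record with the demux.pl
script that ships with GROMACS (a sketch; file locations are illustrative):

# demux.pl is installed under GROMACS's share/gromacs/scripts/ directory
perl demux.pl equil0/md.log    # exchange record from the first replica's log
# This writes replica_index.xvg and replica_temp.xvg; replica_temp.xvg shows
# which temperature each replica visits over time, so a trip from the bottom
# T to the top T and back is easy to spot, e.g. with
xmgrace -nxy replica_temp.xvg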

> Also, I use the Stampede2 cluster for my REMD simulations. When I use KNL
> nodes my simulation failed, but when I use SKX nodes REMD works well.
>

KNL/SKX is irrelevant to GROMACS and REMD. If there are issues, they are
almost certainly in how the version of GROMACS was built, how the job is
being run, or how the job is using that GROMACS version.
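
If you want to check the build side of that, comparing what the binary
reports on the two node types is a quick first step (a sketch; the module
and binary names are taken from your job script and may differ):

module load gromacs/2019.4
mdrun_mpi -version 2>&1 | grep -i -e 'simd' -e 'precision' -e 'mpi library'
# The "SIMD instructions" line shows which instruction set the binary was
# built for; a mismatch between build, MPI stack and launch environment is
# a far more likely cause of failures than KNL vs SKX as such.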

Mark


> On Sun, Feb 9, 2020 at 2:30 PM Mark Abraham <mark.j.abraham at gmail.com>
> wrote:
>
> > Hi,
> >
> > First, make sure you can run a normal single-replica simulation with MPI
> on
> > this machine, so that you know you have the mechanics right. Follow the
> > cluster's documentation for setting up the scripts and calling MPI. I
> > suspect your problem starts here, perhaps with whether there is a suitable
> > working directory to which to write your output files.
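> >
> > For instance, a minimal single-replica sanity check might look like this
> > (a sketch; the tpr name and core count are illustrative):
> >
> > ibrun -n 48 mdrun_mpi -s topol.tpr -deffnm test   # one replica, one node
> >
> > If that runs and writes test.log in the submission directory, the basic
> > MPI mechanics are fine.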
> >
> > Next, be aware that locality is very important for the cores that
> > participate in a single simulation. It's not enough to choose five cores
> > per replica and deduce that 28 nodes can give enough total cores. Each
> > replica should be assigned cores within the same node (or lose lots of
> > performance), so you will have to do some arithmetic to choose how many
> > replicas per node are suitable to fit all of the cores of those replicas
> > within a node. The best choice for the number of replicas will depend on
> > the number of cores per node.
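> >
> > As a concrete illustration (assuming 48-core nodes; yours may differ):
> >
> >   48 cores/node with 5 cores/replica -> 9 replicas plus 3 idle cores,
> >       and the next replica straddles two nodes
> >   48 cores/node with 6 cores/replica -> 8 whole replicas, nothing wasted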
> >
> > Mark
> >
> > On Sun., 9 Feb. 2020, 10:39 Mohammad Madani, <mohammad.madani at uconn.edu>
> > wrote:
> >
> > > Dear all users,
> > > I want to run a REMD simulation on the Stampede2 cluster.
> > > I have 376 replicas. When I run the simulation on 28 nodes and 1880 MPI
> > > tasks (5 cores per replica) I get the following error:
> > >
> > > [proxy:0:0 at c403-004.stampede2.tacc.utexas.edu] HYDU_create_process
> > > (../../utils/launch/launch.c:825): execvp error on file traj.trr (No
> such
> > > file or directory)
> > >
> > > I do not know what the problem is.
> > >
> > > Could you please help me?
> > >
> > > This is my bash script file:
> > > #!/bin/bash
> > > #SBATCH -J myjob
> > > #SBATCH -o myjob.%j.out
> > > #SBATCH -e myjob.%j.err
> > >
> > > #SBATCH --mail-user=mohammad.madani at uconn.edu
> > > #SBATCH --mail-type=ALL
> > > #SBATCH -A TG-MCB180008
> > >
> > > #SBATCH -p normal
> > > #SBATCH -N 28
> > > #SBATCH -n 1880
> > > #SBATCH -t 48:00:00
> > >
> > > module load gromacs/2019.4
> > >
> > > module load intel/18.0.2
> > >
> > > module load impi/18.0.2
> > > module mvapich2/2.3.1
> > > ibrun /opt/apps/intel18/impi/18.0.2/gromacs/2019.4/bin/mdrun_mpi -s -o
> > > traj.trr -c nvt.gro -e ener.edr -g md.log -replex 500 -multidir equil0
> > > equil1 ..... equil375
> > >
> > > Many thanks
> > >
> > >


More information about the gromacs.org_gmx-users mailing list