[gmx-users] problem with mpi run in REMD simulation

Mark Abraham mark.j.abraham at gmail.com
Sun Feb 9 20:30:16 CET 2020


Hi,

First, make sure you can run a normal single-replica simulation with MPI on
this machine, so that you know you have the mechanics right. Follow the
cluster's documentation for setting up the scripts and calling MPI. I
suspect your problem starts here, perhaps with having a suitable working
directory to which to write your output files.
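A minimal single-replica test job might look like the sketch below (the partition, time limit, and input filename are placeholders; follow Stampede2's own documentation for the account and module names):

```shell
#!/bin/bash
# Sketch of a single-replica sanity-check job -- adjust -A, -p, module
# versions, and the .tpr filename to match your cluster and system.
#SBATCH -J remd-test
#SBATCH -N 1
#SBATCH -n 5
#SBATCH -p normal
#SBATCH -t 00:30:00

module load gromacs/2019.4

# Run from the submission directory so mdrun has a writable location
# for its output files (log, trajectory, energies, checkpoint).
cd $SLURM_SUBMIT_DIR
ibrun mdrun_mpi -s topol.tpr -deffnm test
```

If this runs cleanly and writes test.log and test.trr where you expect, the MPI mechanics and working-directory setup are sound, and you can move on to the multi-replica case.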

Next, be aware that locality is very important for the cores that
participate in a single simulation. It's not enough to choose five cores
per replica and deduce that 28 nodes provide enough total cores. Each
replica's cores should all sit within the same node (otherwise you lose a
lot of performance), so you need to do some arithmetic to choose how many
replicas fit per node such that no replica straddles a node boundary. The
best choice for the number of replicas per node will depend on the number
of cores per node.
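The arithmetic above might look like the following sketch (assuming 5 cores per replica and 68-core nodes, as on Stampede2's KNL partition; check your cluster's actual core count):

```shell
# Assumptions: 5 cores per replica, 68 cores per node (hypothetical --
# verify against your cluster's hardware), 376 replicas total.
CORES_PER_REPLICA=5
CORES_PER_NODE=68
NREPLICAS=376

# Whole replicas that fit entirely within one node:
REPLICAS_PER_NODE=$(( CORES_PER_NODE / CORES_PER_REPLICA ))

# Nodes needed so that no replica straddles a node boundary
# (ceiling division):
NODES=$(( (NREPLICAS + REPLICAS_PER_NODE - 1) / REPLICAS_PER_NODE ))

echo "$REPLICAS_PER_NODE replicas per node, $NODES nodes"
```

With these numbers, 13 replicas (65 cores) fit per 68-core node, so 376 replicas need 29 nodes rather than 28, and a few cores per node go unused by design.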

Mark

On Sun., 9 Feb. 2020, 10:39 Mohammad Madani, <mohammad.madani at uconn.edu>
wrote:

> Dear all users
> I want to run a REMD simulation on the stampede2 cluster.
> I have 376 replicas. When I run the simulation on 28 nodes with 1880 MPI
> tasks (5 cores per replica), I get the following error:
>
> [proxy:0:0 at c403-004.stampede2.tacc.utexas.edu] HYDU_create_process
> (../../utils/launch/launch.c:825): execvp error on file traj.trr (No such
> file or directory)
>
> I do not know what the problem is.
>
> Could you please help me?
>
> this is my bash script file:
> #!/bin/bash
> #SBATCH -J myjob
> #SBATCH -o myjob.%j.out
> #SBATCH -e myjob.%j.err
>
> #SBATCH --mail-user=mohammad.madani at uconn.edu
> #SBATCH --mail-type=ALL
> #SBATCH -A TG-MCB180008
>
> #SBATCH -p normal
> #SBATCH -N 28
> #SBATCH -n 1880
> #SBATCH -t 48:00:00
>
> module load gromacs/2019.4
>
> module load intel/18.0.2
>
> module load impi/18.0.2
> module mvapich2/2.3.1
> ibrun /opt/apps/intel18/impi/18.0.2/gromacs/2019.4/bin/mdrun_mpi -s -o
> traj.trr -c nvt.gro -e ener.edr -g md.log -replex 500 -multidir equil0
> equil1 ..... equil375
>
> Many thanks
>
>


More information about the gromacs.org_gmx-users mailing list