[gmx-users] How to improve the performance of simulation in HPC (finding optimal number of nodes and processors)

Szilárd Páll pall.szilard at gmail.com
Fri Jun 2 12:42:42 CEST 2017


I suggest that you first try to understand the parallelization options of
mdrun. You seem to be mixing up MPI and thread-MPI; the latter won't work
for multi-node runs. If you sort that out and launch with correctly set up
PBS parameters (ranks and cores/threads per rank), your runs should be fine.
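
With the thread-MPI build you can only use the cores of a single node. A
minimal sketch for one of your 28-core nodes (the 14x2 rank/thread split
below is just an illustration, not a tuned setting):

    # 14 thread-MPI ranks x 2 OpenMP threads = 28 cores on one node
    gmx mdrun -ntmpi 14 -ntomp 2 -deffnm md_0_1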

http://manual.gromacs.org/documentation/2016.3/user-guide/mdrun-performance.html
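
To run on more than one node you need an MPI-enabled build (conventionally
installed as gmx_mpi) started through your MPI launcher, not thread-MPI. A
rough sketch, assuming two of your 28-core nodes, mpirun as the launcher and
2 OpenMP threads per rank (the exact select syntax, binary name and paths
depend on your cluster setup):

    #PBS -l select=2:ncpus=28:mpiprocs=14
    cd $PBS_O_WORKDIR
    # 28 MPI ranks in total x 2 OpenMP threads each = 2 full nodes
    mpirun -np 28 gmx_mpi mdrun -ntomp 2 -deffnm md_0_1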

> ###Using 6 MPI threads
> Using 28 OpenMPI threads per tMPI thread

That's obviously not the output of mdrun (especially the "OpenMPI threads").
Please provide the exact command line options, output, and log, not something
that resembles the output.


--
Szilárd

On Fri, Jun 2, 2017 at 7:49 AM, Santhosh Kumar Nagarajan <
santhoshrajan90 at gmail.com> wrote:

> Dear users,
>
> I am simulating a protein of 285 residues with GROMACS (version 2016.3)
> installed on our university's HPC.
>
> I have tried to run the simulation using the standard
> "select=1:ncpus=28:mpiprocs=28" setting provided by our HPC admin (in the
> PBS script). Users of various other software (like VASP and Mathematica)
> use the same numbers of nodes and processors and their jobs run perfectly,
> but with these settings my simulation was running too slowly
> (approximately one ns per day). So I tried changing the number of nodes
> and processors, but nothing seems to improve the performance.
>
> For example, many times I got fatal errors saying:
>
> ###Using 6 MPI threads
> Using 28 OpenMPI threads per tMPI thread
>
> WARNING: Oversubscribing the available 28 logical CPU cores with 168
> threads. This will cause considerable performance loss!
>
> Fatal error: Your choice of number of MPI ranks and amount of resources
> result in using 28 OpenMP threads per rank, which is most likely
> insufficient. The optimum is usually between 1 and 6 threads per rank. If
> you want to run with this setup, specify the -ntomp option. But we suggest
> to change the number of MPI ranks (option -ntmpi).###
>
> After this, I added the "-ntomp" option, which suppressed the warning but
> didn't improve the performance.
> Recently I tried "select=1:ncpus=6:mpiprocs=56", which runs the simulation
> using 4 MPI threads and 6 OpenMP threads. I think this performance is too
> low for our HPC, as other software runs much faster.
>
> Below is the pbs.sh file which I have used to run the simulation on the
> HPC. Can anyone please help me figure out what I am doing wrong?
>
> ###PBS file used
>
> #!/bin/bash
> #PBS -N my_protein_name
> #PBS -q work-01
> #PBS -l select=1:ncpus=28:mpiprocs=28
> #PBS -j oe
> #PBS -V
> cd $PBS_O_WORKDIR
> cat $PBS_NODEFILE > ./pbsnodelist
> CORES=$(cat ./pbsnodelist | wc -l)
> source /opt/software/intel/parallel_studio_xe_2017.2.050/psxevars.sh intel64
> gmx mdrun -ntmpi 28 -deffnm md_0_1
>
> ###
>
>
> Specifications of the HPC:
>
> One master node + 40 compute nodes
>
> Compute nodes: 40 x 2 x Intel Xeon E5-2680 v4
> (28 threads per E5-2680 v4)
> 10 TB RAM
> 40 TB hard disk
>
> Master node: 1 x 2 x Intel Xeon E5-2650 v4
> (24 threads per E5-2650 v4)
> 128 GB RAM
> 80 TB hard disk
>
> Thank you
>
> Regards
> --
> Santhosh Kumar Nagarajan
> PhD Research Scholar
> Department of Genetic Engineering
> SRM University
> Kattankulathur
> Chennai - 603203

