[gmx-users] Running Gromacs in parallel
pall.szilard at gmail.com
Wed Sep 21 14:31:14 CEST 2016
Performance tuning is highly dependent on the simulation system and
the hardware you're running on. Questions like the ones you pose are
impossible to answer meaningfully without *full* log files (and
hardware specs including network).
Have you checked the performance checklist I linked above?
On Wed, Sep 21, 2016 at 11:36 AM, <jkrieger at mrc-lmb.cam.ac.uk> wrote:
> I wonder whether what I see (that -np 108 with -ntomp 2 is best) comes from
> using -multi 6 with 8-CPU nodes. That level of parallelism may be necessary
> to trigger automatic segregation of PP and PME ranks. I'm not sure whether
> I tried -np 54 with -ntomp 4, which would probably also do it. I compared
> mostly on 196 CPUs, then found that going up to 216 was better than 196
> with -ntomp 2, and that pure MPI (-ntomp 1) was considerably worse for both.
> Would people recommend going back to 196, which allows 4 whole nodes per
> replica, and playing with -npme and -ntomp_pme?
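For reference, a multi-replica run with an explicit PME split might look like the sketch below. All numbers are illustrative, not from the thread's benchmarks: the -npme and -ntomp_pme values in particular are starting guesses that would need tuning against your own system (g_tune_pme / gmx tune_pme can automate that search for a single simulation).

```shell
# Illustrative sketch only: 6 replicas, 4 nodes of 8 cores per replica.
# 96 MPI ranks x 2 OpenMP threads = 192 cores in total;
# per replica: 16 ranks, of which 4 are dedicated PME ranks (a guess).
mpirun -np 96 mdrun -multi 6 -ntomp 2 -npme 4 -ntomp_pme 2
```

Whether a dedicated PME split beats the automatic choice depends on the system size and the network, so each variant should be benchmarked on short runs before committing to production settings.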
>> Hi Thanh Le,
>> Assuming all the nodes are the same (9 nodes with 12 CPUs each), you could
>> try the following:
>> mpirun -np 9 --map-by node mdrun -ntomp 12 ...
>> mpirun -np 18 mdrun -ntomp 6 ...
>> mpirun -np 54 mdrun -ntomp 2 ...
>> Which of these works best will depend on your setup.
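Each of the suggested combinations keeps MPI ranks times OpenMP threads per rank equal to the 108 cores mentioned in the thread; a quick shell check of that arithmetic:

```shell
# Sanity check (illustrative): for a fixed core count, -ntomp should be
# total_cores / np, and the product should use every core exactly once.
total_cores=108
for np in 9 18 54; do
  ntomp=$(( total_cores / np ))
  echo "-np $np -ntomp $ntomp -> $(( np * ntomp )) cores"
done
```

Keeping the product equal to the physical core count avoids both idle cores and oversubscription; which rank/thread balance wins still has to be measured.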
>> Using the whole cluster for one job may not be the most efficient approach.
>> I found on our cluster that once I reach 216 CPUs (settings from the
>> queuing system equivalent to -np 108 and -ntomp 2), I can't do better by
>> adding more nodes, presumably because communication becomes the bottleneck.
>> In addition to running -multi or -multidir jobs, which take some load off
>> communication, it may also be worth running separate jobs and using
>> -pin on and -pinoffset.
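As a concrete (hypothetical) illustration of the pinning idea: two independent thread-MPI jobs could share one 12-core node without competing for the same cores, by pinning each to a different half. The job names and thread counts here are made up for the example:

```shell
# Hypothetical: two independent 6-thread jobs on one 12-core node,
# pinned to cores 0-5 and 6-11 respectively so they never overlap.
mdrun -ntomp 6 -pin on -pinoffset 0 -deffnm jobA &
mdrun -ntomp 6 -pin on -pinoffset 6 -deffnm jobB &
wait
```

Without explicit pinning, two jobs on the same node can end up scheduled onto the same cores and each run at a fraction of full speed.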
>> Best wishes
>>> Hi everyone,
>>> I have a question concerning running GROMACS in parallel. I have read up
>>> on it, but I still don't quite understand how to run it efficiently.
>>> My GROMACS version is 4.5.4.
>>> The cluster I am using has 108 CPUs in total, with 4 hosts currently up.
>>> The node I am using:
>>> Architecture: x86_64
>>> CPU op-mode(s): 32-bit, 64-bit
>>> Byte Order: Little Endian
>>> CPU(s): 12
>>> On-line CPU(s) list: 0-11
>>> Thread(s) per core: 2
>>> Core(s) per socket: 6
>>> Socket(s): 1
>>> NUMA node(s): 1
>>> Vendor ID: AuthenticAMD
>>> CPU family: 21
>>> Model: 2
>>> Stepping: 0
>>> CPU MHz: 1400.000
>>> BogoMIPS: 5200.57
>>> Virtualization: AMD-V
>>> L1d cache: 16K
>>> L1i cache: 64K
>>> L2 cache: 2048K
>>> L3 cache: 6144K
>>> NUMA node0 CPU(s): 0-11
>>> MPI is already installed. I also have permission to use the cluster as
>>> much as I can.
>>> My question is: how should I write my mdrun command to utilize all
>>> available cores and nodes?
>>> Thanh Le
>>> Gromacs Users mailing list
>>> * Please search the archive at
>>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>> * For (un)subscribe requests visit
>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
>>> send a mail to gmx-users-request at gromacs.org.