[gmx-users] Problems with REMD in Gromacs 4.6.3

Mark Abraham mark.j.abraham at gmail.com
Wed Jul 17 20:08:19 CEST 2013


You tried ppn3 (with and without --loadbalance)?
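
An untested sketch of the kind of layout I mean - treat the Torque and
mpiexec behaviour here as assumptions to check against your local
versions, not as a recipe:

  #PBS -l nodes=48:ppn=3
  export OMP_NUM_THREADS=4
  mpiexec mdrun_mpi -v -cpt 20 -multi 144 -replex 2000 -ntomp 4 -cpi

That asks Torque for 3 process slots on each of 48 twelve-core nodes, so a
Torque-aware mpiexec should start 144 ranks, 3 per node, and each replica
then runs 4 OpenMP threads.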

Mark

On Wed, Jul 17, 2013 at 6:30 PM, gigo <gigo at ibb.waw.pl> wrote:
> On 2013-07-13 11:10, Mark Abraham wrote:
>>
>> On Sat, Jul 13, 2013 at 1:24 AM, gigo <gigo at ibb.waw.pl> wrote:
>>>
>>> On 2013-07-12 20:00, Mark Abraham wrote:
>>>>
>>>>
>>>> On Fri, Jul 12, 2013 at 4:27 PM, gigo <gigo at ibb.waw.pl> wrote:
>>>>>
>>>>>
>>>>> Hi!
>>>>>
>>>>> On 2013-07-12 11:15, Mark Abraham wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> What does --loadbalance do?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> It balances the total number of processes across all allocated nodes.
>>>>
>>>>
>>>>
>>>> OK, but using it means you are hostage to its assumptions about balance.
>>>
>>>
>>>
>>> That's true, but as long as I do not try to use more resources than
>>> Torque gives me, everything is OK. The question is: what is the proper
>>> way of running multiple simulations in parallel with MPI that are
>>> further parallelized with OpenMP, when pinning fails? I could not find
>>> any other.
>>
>>
>> I think pinning fails because you are double-crossing yourself. You do
>> not want 12 MPI processes per node, and that is likely what ppn is
>> setting. AFAIK your setup should work, but I haven't tested it.
>>
>>>>
>>>>> The thing is that mpiexec does not know that I want each replica to
>>>>> fork into 4 OpenMP threads. Thus, without this option and without
>>>>> affinities (more on that in a second) mpiexec starts too many replicas
>>>>> on some nodes - GROMACS then complains about the overload - while some
>>>>> cores on other nodes are not used. It is possible to run my simulation
>>>>> like this:
>>>>>
>>>>> mpiexec mdrun_mpi -v -cpt 20 -multi 144 -replex 2000 -cpi (without
>>>>> --loadbalance for mpiexec and without -ntomp for mdrun)
>>>>>
>>>>> Then each replica runs on 4 MPI processes (I allocate 4 times more
>>>>> cores than replicas and mdrun sees it). The problem is that this is
>>>>> much slower than using OpenMP for each replica. I did not find any
>>>>> other way than --loadbalance in mpiexec and then -multi 144 -ntomp 4
>>>>> in mdrun to use MPI and OpenMP at the same time on the
>>>>> Torque-controlled cluster.
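>>>>>
>>>>> (For illustration only - I have not tried this: if our mpiexec is
>>>>> OpenMPI's, its -npernode option might give the same 3-ranks-per-node
>>>>> layout without the --loadbalance hack, e.g.
>>>>>
>>>>> mpiexec -npernode 3 mdrun_mpi -v -cpt 20 -multi 144 -replex 2000 -ntomp 4 -cpi
>>>>>
>>>>> with 48 nodes of 12 cores each, i.e. 3 ranks x 4 OpenMP threads per
>>>>> node.)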
>>>>
>>>>
>>>>
>>>> That seems highly surprising. I have not yet encountered a job
>>>> scheduler that was completely lacking a "do what I tell you" layout
>>>> scheme. More importantly, why are you using #PBS -l nodes=48:ppn=12?
>>>
>>>
>>>
>>> I think that Torque is very similar to all PBS-like resource managers
>>> in this regard. It actually does what I tell it to do. There are
>>> 12-core nodes, I ask for 48 of them - I get them (a simple
>>> #PBS -l ncpus=576 does not work), end of story. Now, the program that I
>>> run is responsible for populating the resources that I got.
>>
>>
>> No, that's not the end of the story. The scheduler and the MPI system
>> typically cooperate to populate the MPI processes on the hardware, set
>> OMP_NUM_THREADS, set affinities, etc. mdrun honours those if they are
>> set.
>
>
> I was able to run what I wanted flawlessly on another cluster with PBS Pro.
> The Torque cluster seems to work as I said (the "end of story" behaviour).
> REMD runs well on Torque when I give a whole physical node to one replica.
> Otherwise the simulation does not progress, or the pinning fails
> (sometimes partially). I have run out of options; I did not find any
> working example/documentation on running hybrid MPI/OpenMP jobs under
> Torque. It seems that I stumbled upon limitations of this resource
> manager, and it is not really a GROMACS issue.
> Best Regards,
> Grzegorz
>
>
>>
>> You seem to be using 12 because you know there are 12 cores per node.
>> The scheduler should know that already. ppn should be a command about
>> what to do with the hardware, not a description of what it is. More to
>> the point, you should read the docs and be sure what it does.
>>
>>>> Surely you want 3 MPI processes per 12-core node?
>>>
>>>
>>>
>>> Yes - I want each node to run 3 MPI processes. Preferably, I would like
>>> to run each MPI process on a separate node (spread across 12 cores with
>>> OpenMP), but I will not get that many resources. But again, without the
>>> --loadbalance hack I would not be able to properly populate the nodes...
>>
>>
>> So try ppn 3!
>>
>>>>
>>>>>> What do the .log files say about
>>>>>> OMP_NUM_THREADS, thread affinities, pinning, etc?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Each replica logs:
>>>>> "Using 1 MPI process
>>>>> Using 4 OpenMP threads"
>>>>> That is correct. As I said, the threads are forked, but 3 out of 4 do
>>>>> not do anything, and the simulation does not progress at all.
>>>>>
>>>>> About affinities GROMACS says:
>>>>> "Can not set thread affinities on the current platform. On NUMA
>>>>> systems this can cause performance degradation. If you think your
>>>>> platform should support setting affinities, contact the GROMACS
>>>>> developers."
>>>>>
>>>>> Well, the "current platform" is normal x86_64 cluster, but the whole
>>>>> information about resources is passed by Torque to OpenMPI-linked
>>>>> Gromacs.
>>>>> Can it be that mdrun sees the resources allocated by torque as a big
>>>>> pool
>>>>> of
>>>>> cpus and misses the information about nodes topology?
>>>>
>>>>
>>>>
>>>> mdrun gets its processor topology from the MPI layer, so that is where
>>>> you need to focus. The error message confirms that GROMACS sees things
>>>> that seem wrong.
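>>>>
>>>> (An untested way to check, assuming your mpiexec is OpenMPI's: its
>>>> --report-bindings option prints the binding each rank actually
>>>> receives, which shows what the MPI layer is handing to mdrun, e.g.
>>>>
>>>> mpiexec --report-bindings -npernode 3 mdrun_mpi -multi 144 -ntomp 4
>>>>
>>>> If those bindings already look wrong, the problem is upstream of
>>>> GROMACS.)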
>>>
>>>
>>>
>>> Thank you, I will take a look. But the first thing I want to do is to
>>> find the reason why GROMACS 4.6.3 is not able to run on my (slightly
>>> weird, I admit) setup, while 4.6.2 does it very well.
>>
>>
>> 4.6.2 had a bug that inhibited any MPI-based mdrun from attempting to
>> set affinities. It's still not clear why ppn 12 worked at all.
>> Apparently mdrun was able to float some processes around to get
>> something that worked. The good news is that when you get it working
>> in 4.6.3, you will see a performance boost.
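>>
>> (If the MPI layer still refuses to set affinities, mdrun in 4.6 can be
>> asked to pin its own threads with -pin on - e.g. mdrun_mpi -multi 144
>> -ntomp 4 -pin on - but treat that as a sketch and check mdrun -h on your
>> build; I have not tested it with your layout.)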
>>
>> Mark
>


