[gmx-developers] mpi and thread command line options

Berk Hess hess at kth.se
Wed Jul 11 19:05:43 CEST 2012


On 07/11/2012 06:51 PM, Christoph Junghans wrote:
> 2012/7/11 Roland Schulz <roland at utk.edu>:
>> On Wed, Jul 11, 2012 at 7:19 AM, Alexey Shvetsov
>> <alexxy at omrb.pnpi.spb.ru> wrote:
>>> Roland Schulz wrote on 2012-07-11 10:47:
>>>> On Wed, Jul 11, 2012 at 2:33 AM, Alexey Shvetsov
>>>> <alexxy at omrb.pnpi.spb.ru> wrote:
>>>>> Hi!
>>>>>
>>>>> mvpich2/mvapich (as well as its derivatives such as platform
>>>>> mpi, pc-mpi, intel-mpi) will also behave differently. So the user can
>>>>> get a cryptic message about launching mpd when launching mdrun -np
>>>>> directly
>>>> Not quite. For MPI_Comm_spawn to work, mpich2 requires that the
>>>> application is run with mpirun/mpiexec. See
>>>>
>>>> http://lists.mcs.anl.gov/pipermail/mpich-discuss/2012-June/012638.html
>>>> for the details. We would need to detect that and not try to spawn
>>>> in that case (and run in serial with a warning).
>>>> Thus mpich2 would require: mpirun mdrun -np x. Of course that isn't
>>>> more convenient than mpirun -np x mdrun. The only advantage would be
>>>> that, as with tmpi, "mpirun mdrun" would automatically use as many
>>>> cores as are available and useful for the system, whereas without
>>>> spawn the user needs to decide the number of cores and we can't have
>>>> any automatic mechanism helping the user.
>>>>
>>>> Roland
>>> Ok. But how will this work with batch systems that automatically pass
>>> the number of processes to mpiexec? The effective launch command will be
>>>
>>> $ mpiexec mdrun_mpi $mdrunargs
>> The idea was to spawn only if mdrun_mpi is started in serial
>> (mpiexec -n 1). It was only meant to make mdrun_mpi behave the same as
>> tmpi mdrun for a single node. On clusters with a batch system nothing
>> would change compared to the current situation.
> 1.) I agree with Roland that keeping the -nt option would be
> misleading, even if -nt gets a new meaning - "number of tasks", where
> tasks can be threads or mpi tasks.
> Also from my experience, half of the users are not aware of the -nt
> option either, they just start mdrun without any special setting,
> which means "guess" and we should keep that.
> So making -nt obsolete is not bad, I think.
In my last mail the proposal was to use -nt only for the total number of
threads, and not with MPI. I think this is useful, since the user usually
would want to limit the total number of threads (or cores used) and would
not know what to choose for -ntmpi and -ntomp.
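
As a rough illustration of why a single total is easier for the user to
give, here is a hypothetical sketch (not mdrun's actual logic; the function
name split_total_threads is made up for this example) that lists every
-ntmpi/-ntomp combination a given total thread count could be split into:

```python
def split_total_threads(nt):
    """Return all (ntmpi, ntomp) pairs whose product equals the total nt.

    Hypothetical helper for illustration only; it just enumerates the
    exact factorizations of nt and makes no performance judgment.
    """
    if nt < 1:
        raise ValueError("total thread count must be at least 1")
    return [(ntmpi, nt // ntmpi) for ntmpi in range(1, nt + 1) if nt % ntmpi == 0]


if __name__ == "__main__":
    # Print the candidate command-line combinations for 12 threads in total.
    for ntmpi, ntomp in split_total_threads(12):
        print(f"-ntmpi {ntmpi} -ntomp {ntomp}")
```

Even for 12 threads there are six possible splits, and the user would
usually not know which one to pick.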
>
> 2.) One class of system that has not been discussed yet is the one
> with a different number of OMP threads per node (like the Intel MIC);
> that should also be possible by explicitly defining OMP_NUM_THREADS on
> each node.
How does MIC have different numbers of threads per node?
>
> 3.) With all these thread options and combinations, we will need
> something like g_tune_mdrun, which I guess could be an extension of
> g_tune_pme.
Except for the scaling limit, there is usually one best choice.
But for running close to the scaling limit a g_tune_mdrun could indeed help.

Cheers,

Berk
>
> Christoph
>
>
>
>> Roland
>>
>>>
>>>
>>>>> Roland Schulz wrote on 2012-07-11 05:09:
>>>>>> On Tue, Jul 10, 2012 at 8:09 PM, Szilárd Páll
>>>>>> <szilard.pall at cbr.su.se> wrote:
>>>>>>> On Tue, Jul 10, 2012 at 11:15 PM, Berk Hess <hess at kth.se> wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> We are working on the final part of the 4.6 release, which is
>>>>>>>> making the MPI and OpenMP thread setup automated, fully checked
>>>>>>>> and user friendly.
>>>>>>>> We have to decide on the naming of the options.
>>>>>>>> Roland has an implementation of mpi spawn ready. This would allow
>>>>>>>> running mdrun -np #processes instead of using mpirun (at least
>>>>>>>> with openmpi).
>>>>>>> Would this feature add anything but the convenience of being able
>>>>>>> to run without mpirun on a single node? Without MPI spawning
>>>>>>> working reliably in most cases (or the ability to detect with
>>>>>>> high certainty when it does not), enabling an -np mdrun option
>>>>>>> would just lead to confusion when mdrun exits with a cryptic MPI
>>>>>>> error due to not being able to spawn.
>>>>>> The idea was to make mdrun behave the same whether it is compiled
>>>>>> with real MPI or tMPI, and thus also to support only a single node.
>>>>>> But MPICH is behaving quite stupidly and they also don't seem to
>>>>>> care. And only supporting it for OpenMPI is probably also more
>>>>>> confusing than helpful (then tmpi+OpenMPI would behave the same but
>>>>>> MPICH/MVAPICH would behave differently). So you are probably right
>>>>>> that it is better to not add spawn at all.
>>>>>>
>>>>>>> Therefore, I'd be OK with a new *hidden* -np option that only
>>>>>>> works in the single-node case, but not with a non-hidden one
>>>>>>> advertised in the documentation/wiki.
>>>>>> As a hidden option it would only help for testing. But I don't
>>>>>> think it is worth adding it for just that.
>>>>>>
>>>>>> Roland
>>>>> --
>>>>> Best Regards,
>>>>> Alexey 'Alexxy' Shvetsov
>>>>> Petersburg Nuclear Physics Institute, NRC Kurchatov Institute,
>>>>> Gatchina, Russia
>>>>> Department of Molecular and Radiation Biophysics
>>>>> Gentoo Team Ru
>>>>> Gentoo Linux Dev
>>>>> mailto:alexxyum at gmail.com
>>>>> mailto:alexxy at gentoo.org
>>>>> mailto:alexxy at omrb.pnpi.spb.ru
>>>>> --
>>>>> gmx-developers mailing list
>>>>> gmx-developers at gromacs.org
>>>>> http://lists.gromacs.org/mailman/listinfo/gmx-developers
>>>>> Please don't post (un)subscribe requests to the list. Use the
>>>>> www interface or send it to gmx-developers-request at gromacs.org.
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
>>>> 865-241-1537, ORNL PO BOX 2008 MS6309
>
>




