[gmx-users] multiple processes of a gromacs tool requiring user action at runtime on one Cray XC30 node using aprun

Rashmi rashkush at gmail.com
Thu Oct 29 12:53:22 CET 2015


Hi,

As written on the website, g_mmpbsa does not directly support MPI; it contains
no code related to OpenMP or MPI. However, we have tried to interface with the
MPI and OpenMP functionality of APBS through the mechanisms described below.

One may use g_mmpbsa with MPI as follows: (1) allocate the required number of
processors through the queue management system, (2) define the APBS environment
variable, including all required flags (e.g. export APBS="mpirun -np 8 apbs"),
and (3) start g_mmpbsa directly, without using mpirun (or any similar program).
If the queue management system specifically requires aprun/mpirun to execute a
program, g_mmpbsa might not work in this case. A sketch of such a job script is
shown below.
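As an example, here is a minimal batch-script sketch for the MPI case, assuming
a PBS-style queue system that does not force every program through
aprun/mpirun; the core count, the apbs path, and the g_mmpbsa input files are
placeholders for illustration only:

#!/bin/bash
#PBS -l nodes=1:ppn=8        # (1) allocate 8 cores via the queue system (placeholder)
cd $PBS_O_WORKDIR

# (2) tell g_mmpbsa how to launch APBS in parallel (adjust the path and -np)
export APBS="mpirun -np 8 /path/to/apbs"

# (3) start g_mmpbsa directly; it will invoke the command stored in $APBS
g_mmpbsa -f traj.xtc -s topol.tpr -n index.ndx -i mmpbsa.mdp <input_index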

One may use g_mmpbsa with OpenMP as follows: (1) allocate the required number
of threads through the queue management system, (2) set the OMP_NUM_THREADS
variable to the allocated number of threads, and (3) execute g_mmpbsa.
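For example, with eight allocated threads (the g_mmpbsa input files here are
the same placeholders used in the command further below):

export OMP_NUM_THREADS=8
g_mmpbsa -f traj.xtc -s topol.tpr -n index.ndx -i mmpbsa.mdp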

We have not tested the simultaneous use of both MPI and OpenMP, so we do not
know whether it will work.

Concerning standard input for g_mmpbsa: if echo or a here-document
(<<EOF ... EOF) is not working, one may try redirecting input from a file as
follows:

export OMP_NUM_THREADS=8

aprun -n 1 -N 1 -d 8 g_mmpbsa -f traj.xtc -s topol.tpr -n index.ndx -i mmpbsa.mdp <input_index

Here, input_index contains the group numbers, one per line, and the last line
should be empty:
$ cat input_index
1
13


Concerning the 1800 directories, you may write a shell script to automate job
submission: go into each directory, start a g_mmpbsa process (or submit a job
script), and then move on to the next directory, for example with a loop like
the one sketched below.
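A minimal sketch of such a loop follows; the directory names (dir_1 ...
dir_1800), the job-script name, and the use of qsub are assumptions for
illustration, so adapt them to your actual layout and queue system:

#!/bin/bash
# Loop over all trajectory directories and submit one g_mmpbsa job per directory.
for d in dir_{1..1800}; do
    cd "$d" || exit 1
    qsub ../run_mmpbsa.sh    # or start g_mmpbsa directly here instead of submitting
    cd ..
done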

I hope this information is helpful.


Thanks.



On Thu, Oct 29, 2015 at 12:01 PM, Vedat Durmaz <durmaz at zib.de> wrote:

>
> hi again,
>
> 3 answers are hidden somewhere below ..
>
>
> On 28.10.2015 at 15:45, Mark Abraham wrote:
>
>> Hi,
>>
>> On Wed, Oct 28, 2015 at 3:19 PM Vedat Durmaz <durmaz at zib.de> wrote:
>>
>>
>>> On 27.10.2015 at 23:57, Mark Abraham wrote:
>>>
>>>> Hi,
>>>>
>>>>
>>>> On Tue, Oct 27, 2015 at 11:39 PM Vedat Durmaz <durmaz at zib.de> wrote:
>>>>
>>>> hi mark,
>>>>>
>>>>> many thanks. but can you be a little more precise? the author's only
>>>>> hint regarding mpi is on this site
>>>>> "http://rashmikumari.github.io/g_mmpbsa/How-to-Run.html" and related
>>>>> to
>>>>> APBS. g_mmpbsa itself doesn't understand openmp/mpi afaik.
>>>>>
>>>>> the error i'm observing is occurring pretty much before apbs is
>>>>> started.
>>>>> to be honest, i can't see any link to my initial question ...
>>>>>
>>>> It has the sentence "Although g_mmpbsa does not support mpirun..." aprun
>>>> is a form of mpirun, so I assumed you knew that what you were trying was
>>>> actually something that could work, which would therefore have to be with
>>>> the APBS back end. The point of what it says there is that you don't run
>>>> g_mmpbsa with aprun, you tell it how to run APBS with aprun. This just
>>>> avoids the problem entirely because your redirected/interactive input goes
>>>> to a single g_mmpbsa as normal, which then launches APBS with MPI support.
>>>>
>>>> Tool authors need to actively write code to be useful with MPI, so unless
>>>> you know what you are doing is supposed to work with MPI because they say
>>>> it works, don't try.
>>>>
>>>> Mark
>>>>
>>> you are right. it's apbs which ought to run in parallel mode. of course,
>>> i can set the variable 'export APBS="mpirun -np 8 apbs"' [or set 'export
>>> OMP_NUM_THREADS=8'] if i want to split a 24 cores-node to let's say 3
>>> independent g_mmpbsa processes. the problem is that i must start
>>> g_mmpbsa itself with aprun (in the script run_mmpbsa.sh).
>>>
>>
>> No. Your job runs a shell script on your compute node. It can do anything
>> it likes, but it would make sense to run something in parallel at some
>> point. You need to build a g_mmpbsa that you can just run in a shell script
>> that echoes in the input (try that on its own first). Then you use the
>> above approach so that the single process that is g_mmpbsa does the call to
>> aprun (which is the cray mpirun) to run APBS in MPI mode.
>>
>> It is likely that even if you run g_mmpbsa with aprun and solve the input
>> issue somehow, the MPI runtime will refuse to start the child APBS with
>> aprun, because nesting is typically unsupported (and your current command
>> lines haven't given it enough information to do a good job even if it is
>> supported).
>>
>
> yes, i've encountered issues with nested aprun calls. so this will hardly
> work i guess.
>
>
>>> i absolutely
>>> cannot see any other way of running apbs when using it out of g_mmpbsa.
>>> hence, i need to run
>>>
>>> aprun -n 3 -N 3 -cc 0-7:8-15:16-23 ../run_mmpbsa.sh
>>>
>>
>> This likely starts three copies of g_mmpbsa, each of which expects terminal
>> input, which maybe you can teach aprun to manage, but then each g_mmpbsa
>> will then do its own APBS and this is completely not what you want.
>>
>
> hmm, to be honest, i would say this is exactly what i'm trying to achieve.
> isn't it? i want 3 independent g_mmpbsa runs each of which executed in
> another directory with its own APBS. by the way, all together i have 1800
> such directories each containing another trajectory.
>
> if someone is ever (within the next 20 hours!) able to figure out a
> solution for this purpose, i would be absolutely pleased.
>
>
>>> and of course i'm aware about having given 8 cores to g_mmpbsa, hoping
>>> that it is able to read my input and to run apbs which hopefully uses
>>> all of the 8 cores. the user input (choosing protein, then ligand),
>>> however, "Cannot [be] read". this issue occurs quite early during the
>>> g_mmpbsa process and therefore has nothing to do with the apbs (either
>>> with openmp or mpi) functionality which is launched later.
>>>
>>> if i simulate the whole story (spreading 24 cores of a node over 3
>>> processes) using a bash script (instead of g_mmpbsa) which just expects
>>> (and prints) the two inputs during runtime and which i start three times
>>> on one node, everything works fine. i'm just asking myself whether
>>> someone knows why gromacs fails under the same conditions and whether it
>>> is possible to remedy that problem.
>>>
>>
>> By the way, GROMACS isn't failing. You're using a separately provided
>> program, so you should really be talking to its authors for help. ;-)
>>
>> mpirun -np 3 gmx_mpi make_ndx
>>
>> would work fine (though not usefully), if you use the mechanisms provided
>> by mpirun to control how the redirection to the stdin of the child
>> processes should work. But handling that redirection is an issue between
>> you and the docs of your mpirun :-)
>>
>> Mark
>>
>
> unfortunately, there is only very little information about stdin redirection
> associated with aprun. what i've done now is modifying g_mmpbsa such that
> no user input is required. starting
>
> aprun -n 3 -N 3 -cc 0-7:8-15:16-23  ../run_mmpbsa.sh
>
> where, using the $ALPS_APP_PE variable, i successfully enter three
> directories (dir_1, dir_2, dir_3, all containing identical file names) and
> start g_mmpbsa in each of them. now what happens is that all the new files
> are generated in the first of the 3 folders (while the two others are not
> affected at all). and all new files are generated 3 times (file, #file1#,
> #file2#) in a manner which is typical for gromacs' backup philosophy. so on
> some (hardware?) level the data of the 3 processes are not well separated.
> the supervisors of our HPC system were not able to figure out the reasons
> so far. that's why i'm trying to find help here from someone that was
> successful in sharing computing nodes in a similar way.
>
> anyways. thanks for your time so far.
>
>
>
>
>>
>>
>>>>> On 27.10.2015 at 22:43, Mark Abraham wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I think if you check out how the g_mmpbsa author intends you to use MPI
>>>>>> with the tool, your problem goes away.
>>>>>> http://rashmikumari.github.io/g_mmpbsa/Usage.html
>>>>>>
>>>>>> Mark
>>>>>>
>>>>>> On Tue, Oct 27, 2015 at 10:10 PM Vedat Durmaz <durmaz at zib.de> wrote:
>>>>>>
>>>>>>> hi guys,
>>>>>>>
>>>>>>> I'm struggling with the use of diverse gromacs commands on a Cray XC30
>>>>>>> system. actually, it's about the external tool g_mmpbsa which requires
>>>>>>> user action during runtime. i get similar errors with other Gromacs
>>>>>>> tools, e.g., make_ndx, though i know that it doesn't make sense to use
>>>>>>> more than one core for make_ndx. however, g_mmpbsa (or rather apbs used
>>>>>>> by g_mmpbsa) is supposed to be capable of using multiple cores via
>>>>>>> openmp. so, as long as i assign all of the 24 cores of a computing node
>>>>>>> to one process through
>>>>>>>
>>>>>>> aprun -n 1 ../run_mmpbsa.sh
>>>>>>>
>>>>>>> everything works fine. user input is accepted either interactively, by
>>>>>>> using the echo command, or through a here construction ("... << EOF ...
>>>>>>> EOF"). however, as soon as I try to split the 24 cores of a node over
>>>>>>> multiple processes (more than one) using for instance
>>>>>>>
>>>>>>> aprun -n 3 -N 3 -cc 0-7:8-15:16-23 ../run_mmpbsa.sh
>>>>>>>
>>>>>>> (and OMP_NUM_THREADS=8), there is neither an occasion to provide user
>>>>>>> input in the interactive mode, nor is it recognized through echo/here in
>>>>>>> the script. instead, i get the error
>>>>>>>
>>>>>>>     >> Source code file: .../gromacs-4.6.7/src/gmxlib/index.c, line: 1192
>>>>>>>     >> Fatal error:
>>>>>>>     >> Cannot read from input
>>>>>>>
>>>>>>> where, according to the source code, "scanf" malfunctions. when i use,
>>>>>>> for comparison purposes, make_ndx, which i would like to feed with "q",
>>>>>>> i observe a similar error:
>>>>>>>
>>>>>>>     >> Source code file: .../gromacs-4.6.7/src/tools/gmx_make_ndx.c, line: 1219
>>>>>>>     >> Fatal error:
>>>>>>>     >> Error reading user input
>>>>>>>
>>>>>>> here, it's "fgets" which is malfunctioning.
>>>>>>>
>>>>>>> does anyone have an idea what this could be caused by? what do i need to
>>>>>>> consider/change in order to be able to start more than one process on
>>>>>>> one computing node?
>>>>>>>
>>>>>>> thanks in advance
>>>>>>>
>>>>>>> vedat



-- 
With Regards,

Rashmi Kumari
PhD Student
School of Computational and Integrative Sciences
Jawaharlal Nehru University
New Delhi- 110067.

