[gmx-users] GROMACS with Torque: Job stuck

Mark Abraham mark.j.abraham at gmail.com
Fri Aug 25 18:07:17 CEST 2017


Hi,

Simple echo jobs don't tell us that the MPI environment is correctly
initialized, because they do not attempt to communicate.

If you're using the MPI wrapper compiler, and that works for other MPI
programs, then GROMACS will also work. That's what the wrapper compiler is
for :-)
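
For example, a minimal MPI program that actually passes a message between
ranks, compiled with the same mpicc and launched through the same Torque
script as mdrun, is a quick way to test this. A sketch (the file name and
rank count are only illustrative):

cat > mpi_ring.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

/* Pass a token around a ring so that every rank must send and receive. */
int main(int argc, char **argv)
{
    int rank, size, token = 42;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (rank == 0) {
        MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0, MPI_COMM_WORLD);
        MPI_Recv(&token, 1, MPI_INT, size - 1, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    } else {
        MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0, MPI_COMM_WORLD);
    }
    printf("rank %d of %d received token %d\n", rank, size, token);
    MPI_Finalize();
    return 0;
}
EOF
mpicc -o mpi_ring mpi_ring.c   # same MPICH wrapper used to build GROMACS
mpirun -np 16 ./mpi_ring       # launch it from the same Torque script

If that hangs in the same way mdrun does, the problem is in the MPI/Torque
setup rather than in GROMACS.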

Mark

On Fri, Aug 25, 2017 at 6:01 PM Souparno Adhikary <souparnoa91 at gmail.com>
wrote:

> I checked the server and scheduler logs. They are as follows:
>
> server log:
>
> 08/25/2017 13:49:31.230;256;PBS_Server.25216;Job;11.headnode;enqueuing
> into batch, state 1 hop 1
> 08/25/2017 13:49:31.230;08;PBS_Server.25216;Job;perform_commit_work;job_id:
> 11.headnode
> 08/25/2017 13:49:31.230;02;PBS_Server.25216;node;close_conn;Closing
> connection 8 and calling its accompanying function on close
> 08/25/2017 13:49:31.360;08;PBS_Server.25134;Job;11.headnode;Job
> Modified at request of root at headnode
> 08/25/2017 13:49:31.361;08;PBS_Server.25134;Job;11.headnode;Job Run at
> request of root at headnode
> 08/25/2017 13:49:31.374;13;PBS_Server.25134;Job;11.headnode;Not
> sending email: User does not want mail of this type.
> 08/25/2017 13:50:59.137;02;PBS_Server.25119;Svr;PBS_Server;Torque
> Server Version = 6.1.1.1, loglevel = 0
>
> Scheduler log:
>
> 08/25/2017 13:49:31.373;64;pbs_sched.25166;Job;11.headnode;Job Run
>
> The tracejob command is giving me the following output:
>
>
> Job: 11.headnode
>
> 08/25/2017 13:49:31.230 S    enqueuing into batch, state 1 hop 1
> 08/25/2017 13:49:31.360 S    Job Modified at request of root at headnode
> 08/25/2017 13:49:31.373 L    Job Run
> 08/25/2017 13:49:31.361 S    Job Run at request of root at headnode
> 08/25/2017 13:49:31.374 S    Not sending email: User does not want
> mail of this type.
> 08/25/2017 13:49:31  A    queue=batch
> 08/25/2017 13:49:31  A    user=souparno group=souparno jobname=asyn
> queue=batch ctime=1503649171 qtime=1503649171 etime=1503649171
>                           start=1503649171 owner=souparno at headnode
> exec_host=headnode3/0-1+headnode2/0-1 Resource_List.nodes=2:ppn=2
>                           Resource_List.walltime=120:00:00
> Resource_List.nodect=2 Resource_List.neednodes=2:ppn=2
>
> @Mark, we are using mpich2-1.4.1p1 on every node, and the GROMACS
> version is 5.1.4. Moreover, on every node, MPI and GROMACS are
> installed in the same location, i.e. /usr/local/.
>
> I tried running simple "echo" jobs, and they complete successfully.
> That's why it seems to me that the problem might lie in integrating
> GROMACS.
>
>
> Souparno Adhikary,
> CHPC Lab,
> Department of Microbiology,
> University of Calcutta.
>
> On Fri, Aug 25, 2017 at 7:56 PM, Mark Abraham <mark.j.abraham at gmail.com>
> wrote:
>
> > Hi,
> >
> > Sounds like an issue with some infrastructure being mismatched (e.g.
> > a different MPI at compile time and at run time). Does it work when
> > you run a different MPI-enabled program, compiled and run using the
> > same approach?
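> >
> > A quick way to check for that kind of mismatch (a sketch; adjust the
> > names and paths to your setup) is to compare, on each node, which MPI
> > tools are found first and which MPI libraries gmx_mpi is actually
> > linked against:
> >
> > which mpirun mpicc gmx_mpi            # should all come from the same install
> > ldd "$(which gmx_mpi)" | grep -i mpi  # MPI libraries the binary will load
> > mpiexec --version                     # reported MPI build (flag may vary)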
> >
> > Mark
> >
> > On Fri, Aug 25, 2017 at 3:25 PM Souparno Adhikary <souparnoa91 at gmail.com
> >
> > wrote:
> >
> > > Hi,
> > >
> > > This is related to running GROMACS through mpirun and Torque. After I
> > > installed Torque on our lab cluster, I went on to test GROMACS with a
> > > simple run. I wrote a Torque script as follows:
> > >
> > > #!/bin/sh
> > > # Request 4 nodes with 4 cores each, for up to 120 hours, in the batch queue.
> > > #PBS -N asyn
> > > #PBS -q batch
> > > #PBS -l nodes=4:ppn=4
> > > #PBS -l walltime=120:00:00
> > > # Run from the submission directory, record the node list, and start mdrun.
> > > cd $PBS_O_WORKDIR
> > > cat $PBS_NODEFILE > nodes
> > > mpirun -np 16 gmx_mpi mdrun -deffnm asyn_10ns
> > >
> > > After I submit this file with the qsub command, the job shows up
> > > properly on the other MOM nodes when I check with pbsnodes.
> > >
> > > But the job stays in state R at 00:00:00 in the queue listing. I
> > > presume this might be an error on my part in setting up GROMACS
> > > with Torque. I compiled GROMACS with -DGMX_MPI=ON.
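> > >
> > > For reference, the build was configured roughly like this (a sketch;
> > > apart from -DGMX_MPI=ON, the wrapper compilers, source directory and
> > > install prefix shown here are assumptions):
> > >
> > > cd gromacs-5.1.4/build
> > > cmake .. -DGMX_MPI=ON \
> > >     -DCMAKE_C_COMPILER=mpicc \
> > >     -DCMAKE_CXX_COMPILER=mpicxx \
> > >     -DCMAKE_INSTALL_PREFIX=/usr/local
> > > make && make install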
> > >
> > > The nodes file is not written and the job stays stuck.
> > >
> > > Can anyone familiar with job scheduling help me in this regard?
> > >
> > >
> > > Souparno Adhikary,
> > > CHPC Lab,
> > > Department of Microbiology,
> > > University of Calcutta.

