[gmx-users] Re: submission error

Mark Abraham Mark.Abraham at anu.edu.au
Thu Dec 15 07:03:12 CET 2011


On 15/12/2011 4:51 PM, aiswarya pawar wrote:
>
> Hi,
>
>
> When I tried running mdrun without MPI, I received the same error, i.e.
>
> when I gave mdrun -deffnm md
>
> I got:
> Back Off! I just backed up md.log to ./#md.log.1#
> Getting Loaded...
> Reading file md.tpr, VERSION 4.5.4 (single precision)
> Loaded with Money
>
> p0_23443:  p4_error: interrupt SIGx: 4

The mdrun you ran below with poe did not have the _mpi suffix, so I am not
convinced you are using the appropriate executable for each job. The
p4_error above is consistent with an MPI-enabled executable.

Get an MPI "hello world" program running before worrying about GROMACS.
Follow the user guide for your cluster. Name your MPI-enabled mdrun with
the _mpi suffix, as the installation guides recommend.
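
To guard against launching the wrong binary, a submission script could
choose the executable by that suffix convention and stop if it is missing.
A minimal sketch, assuming the MPI build is installed as mdrun_mpi next to
the serial mdrun; the function name pick_mdrun and its arguments are made up
for illustration and are not part of GROMACS or LoadLeveler:

```shell
#!/bin/sh
# Hypothetical helper: pick an mdrun executable by naming convention.
# Assumes the MPI-enabled build is called mdrun_mpi and the serial
# build is called mdrun, both in the same directory.
pick_mdrun() {
    mode=$1     # "parallel" or "serial"
    bindir=$2   # directory holding the GROMACS executables

    if [ "$mode" = "parallel" ]; then
        exe="$bindir/mdrun_mpi"
    else
        exe="$bindir/mdrun"
    fi

    # Refuse to continue if the expected executable is not there.
    if [ -x "$exe" ]; then
        printf '%s\n' "$exe"
    else
        echo "missing executable: $exe" >&2
        return 1
    fi
}
```

In the script below that would be something like
MDRUN=$(pick_mdrun parallel /home/gromacs-4.5.5/bin) || exit 1
before the poe line. Whether mdrun_mpi exists in that directory depends on
how GROMACS was installed.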

Mark

>
>
> Thanks,
> Aiswarya
>
> On Wed, Dec 14, 2011 at 5:00 PM, aiswarya pawar
> <aiswarya.pawar at gmail.com> wrote:
>
>     Hi users,
>
>     I have a submission script for GROMACS mdrun to be used on an IBM
>     cluster, but I get an error while running it. The script goes like
>     this:
>
>     #!/bin/sh
>     # @ error   = job1.$(Host).$(Cluster).$(Process).err
>     # @ output  = job1.$(Host).$(Cluster).$(Process).out
>     # @ class = ptask32
>     # @ job_type = parallel
>     # @ node = 1
>     # @ tasks_per_node = 4
>     # @ queue
>
>     echo "_____________________________________"
>     echo "LOADL_STEP_ID=$LOADL_STEP_ID"
>     echo "_____________________________________"
>
>     machine_file="/tmp/machinelist.$LOADL_STEP_ID"
>     rm -f $machine_file
>     for node in $LOADL_PROCESSOR_LIST
>     do
>     echo $node >> $machine_file
>     done
>     machine_count=`cat /tmp/machinelist.$LOADL_STEP_ID|wc -l`
>     echo $machine_count
>     echo MachineList:
>     cat /tmp/machinelist.$LOADL_STEP_ID
>     echo "_____________________________________"
>     unset LOADLBATCH
>     env  |grep LOADLBATCH
>     cd /home/staff/1adf/
>     /usr/bin/poe /home/gromacs-4.5.5/bin/mdrun -deffnm \
>         /home/staff/1adf/md -procs $machine_count -hostfile \
>         /tmp/machinelist.$LOADL_STEP_ID
>     rm /tmp/machinelist.$LOADL_STEP_ID
>
>
>     I get an out file as:
>     _____________________________________
>     LOADL_STEP_ID=cnode39.97541.0
>     _____________________________________
>     4
>     MachineList:
>     cnode62
>     cnode7
>     cnode4
>     cnode8
>     _____________________________________
>     p0_25108:  p4_error: interrupt SIGx: 4
>     p0_2890:  p4_error: interrupt SIGx: 4
>     p0_2901:  p4_error: interrupt SIGx: 15
>     p0_22760:  p4_error: interrupt SIGx: 15
>
>
>     and an error file:
>
>     Reading file /home/staff/1adf/md.tpr, VERSION 4.5.4 (single precision)
>     Sorry couldn't backup /home/staff/1adf/md.log to
>     /home/staff/1adf/#md.log.14#
>
>     Back Off! I just backed up /home/staff/1adf/md.log to
>     /home/staff/1adf/#md.log.14#
>     ERROR: 0031-300  Forcing all remote tasks to exit due to exit code
>     1 in task 0
>
>     Can anyone please help with this error?
>
>     Thanks
>
