[gmx-users] compilation problems orte error

Jennifer Williams Jennifer.Williams at ed.ac.uk
Wed Feb 10 12:39:42 CET 2010


Sorry for the delay in replying. I start the job using the
following script file:

#$ -S /bin/bash
#$ -l h_rt=47:59:00
#$ -j y
#$ -pe mpich2 8
#$ -cwd
cd /home/jwillia4/GRO/gromacs-4.0.7/JJW_003/PH_TORUN
/home/jwillia4/GRO/bin/mpirun -np 8 /home/jwillia4/GRO/bin/mdrun_mpi \
    -v -s md.tpr

The strange thing is that sometimes the job runs to completion and
sometimes it crashes immediately with the ORTE error, so I know that
the input files are not causing the problem. The failures seem
entirely random.

Could it have to do with the -pe mpich2 8 line? I was previously using
the Open MPI installed on the cluster for common use, but I have now
installed everything in my home directory. The script was adapted from
the one I used before I had my own Open MPI in my home directory.
Perhaps it needs further alteration, but I don't know what.
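
Since a cluster-wide Open MPI and a private copy under /home/jwillia4/GRO now coexist, one thing that may be worth ruling out is whether the job environment always resolves to the private copy. Although the script calls mpirun by its full path, helper programs and libraries may still be looked up through the environment. The sketch below is only an illustration under that assumption: the function name mpirun_under_prefix is hypothetical, and it checks nothing more than where the first mpirun on PATH lives.

```shell
#!/bin/bash
# Hypothetical helper (not part of Open MPI or GROMACS): report whether the
# first `mpirun` found on PATH lives under the given installation prefix.
mpirun_under_prefix() {
    local prefix="$1"
    local found
    # `command -v` prints the path the shell would execute for this name.
    found="$(command -v mpirun)" || { echo "mpirun not found on PATH"; return 2; }
    case "$found" in
        "$prefix"/*) echo "OK: $found"; return 0 ;;
        *)           echo "MISMATCH: $found is outside $prefix"; return 1 ;;
    esac
}
```

Calling `mpirun_under_prefix /home/jwillia4/GRO` near the top of the job script would write an OK or MISMATCH line into the job output, which can then be compared between runs that work and runs that crash. It does not inspect shared libraries; running ldd on mdrun_mpi would be a separate check.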

How would I go about checking whether MPI is running?
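
A common way to check that the MPI launcher itself can start processes, independent of GROMACS, is to launch a trivial non-MPI command on a few ranks and count the lines that come back. The following is a minimal sketch, not a definitive test: the function name mpi_smoke_test is hypothetical, and uname -n (which prints the node name) stands in for any trivial command.

```shell
#!/bin/bash
# Hypothetical smoke test: ask the launcher to start `uname -n` on N ranks
# and confirm that exactly N lines of output come back.
mpi_smoke_test() {
    local launcher="$1" nranks="$2" lines
    # Count output lines; strip padding that some `wc` implementations add.
    lines="$("$launcher" -np "$nranks" uname -n 2>/dev/null | wc -l | tr -d ' ')"
    if [ "$lines" -eq "$nranks" ]; then
        echo "PASS: $nranks ranks started"
    else
        echo "FAIL: expected $nranks lines, got $lines"
        return 1
    fi
}
```

Inside the job script this might be called as `mpi_smoke_test /home/jwillia4/GRO/bin/mpirun 8` before the mdrun_mpi line. A FAIL there would suggest the problem lies in the MPI setup rather than in GROMACS or the input files.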

If you spot anything suspicious in the above commands please let me know.

Thanks

Jenny


Quoting Chandan Choudhury <iitdckc at gmail.com>:

> As Justin said give the command line options for mdrun and also check that
> your mpi environment is running.  Better to run a parallel job and check its
> output.
>
> Chandan
>
> --
> Chandan kumar Choudhury
> NCL, Pune
> INDIA
>
>
> On Mon, Feb 8, 2010 at 8:02 PM, Justin A. Lemkul <jalemkul at vt.edu> wrote:
>
>>
>>
>> Jennifer Williams wrote:
>>
>>>
>>> Dear All,
>>>
>>> I am having problems compiling gromacs 4.0.7 in parallel. I am following
>>> the Quick and Dirty Installation instructions on the gromacs webpage.
>>> I downloaded the versions of fftw, OpenMPI and gromacs-4.0.7 following
>>> these instructions.
>>>
>>> Everything seems to compile OK and I get all the serial executables
>>> including mdrun written to my bin directory and they seem to run fine.
>>> However when I try to run mdrun_mpi on 6 nodes I get the following:
>>>
>>> [vlxbig16:08666] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>> runtime/orte_init_stage1.c at line 182
>>> [vlxbig16:08667] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>> runtime/orte_init_stage1.c at line 182
>>> [vlxbig16:08700] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>> runtime/orte_init_stage1.c at line 182
>>> [vlxbig16:08670] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>> runtime/orte_init_stage1.c at line 182
>>> [vlxbig16:08681] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>> runtime/orte_init_stage1.c at line 182
>>> [vlxbig16:08659] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>> runtime/orte_init_stage1.c at line 182
>>> --------------------------------------------------------------------------
>>> It looks like orte_init failed for some reason; your parallel process is
>>> likely to abort.  There are many reasons that a parallel process can
>>> fail during orte_init; some of which are due to configuration or
>>> environment problems.  This failure appears to be an internal failure;
>>> here's some additional information (which may only be relevant to an
>>> Open MPI developer):
>>>
>>>  orte_rml_base_select failed
>>>  --> Returned value -13 instead of ORTE_SUCCESS
>>>
>>>
>>> Does anyone have any idea what is causing this? Computer support at my
>>> University is not sure.
>>>
>>>
>> How are you launching mdrun_mpi (command line)?
>>
>> -Justin
>>
>>
>>> Thanks
>>>
>>>
>>>
>>>
>> --
>> ========================================
>>
>> Justin A. Lemkul
>> Ph.D. Candidate
>> ICTAS Doctoral Scholar
>> MILES-IGERT Trainee
>> Department of Biochemistry
>> Virginia Tech
>> Blacksburg, VA
>> jalemkul[at]vt.edu | (540) 231-9080
>> http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
>>
>> ========================================
>> --
>> gmx-users mailing list    gmx-users at gromacs.org
>> http://lists.gromacs.org/mailman/listinfo/gmx-users
>> Please search the archive at http://www.gromacs.org/search before posting!
>> Please don't post (un)subscribe requests to the list. Use the www interface
>> or send it to gmx-users-request at gromacs.org.
>> Can't post? Read http://www.gromacs.org/mailing_lists/users.php
>>
>



Dr. Jennifer Williams
Institute for Materials and Processes
School of Engineering
University of Edinburgh
Sanderson Building
The King's Buildings
Mayfield Road
Edinburgh, EH9 3JL, United Kingdom
Phone: ++44 (0)131 650 4 861


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.




