[gmx-users] Re: parallel simulation crash on 6 processors

Carsten Kutzner ckutzne at gwdg.de
Thu Nov 29 18:29:49 CET 2007


Hi Servaas,

I often had similar problems when running on mpich-1.2.x; the
p4_error/net_send messages below are typical of its ch_p4 device. In
my case they all vanished when I switched to any other MPI
implementation, like LAM, OpenMPI, or mpich-2.x.
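
A quick way to tell whether the MPI layer itself is broken,
independent of GROMACS, is to run one of the small test programs that
ship with mpich across the same six nodes, e.g. the cpi example
(build it with the mpicc belonging to that installation):

  mpirun -np 6 ./cpi

If that already hangs or aborts with p4_error messages, mdrun never
had a chance.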

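If you want to try OpenMPI (or LAM) instead, it is usually just a
matter of recompiling GROMACS against the other library. A minimal
sketch, assuming the other MPI's compiler wrapper is called mpicc and
is first in your PATH (the suffix and file names here are only
placeholders, adjust to your setup):

  ./configure --enable-mpi --program-suffix=_mpi CC=mpicc
  make && make install
  mpirun -np 6 mdrun_mpi -np 6 -s topol.tpr

Remember that with the 3.3.x series the -np you give to mdrun (and
earlier to grompp) has to match the number of MPI processes.
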
Carsten


servaas michielssens wrote:
>> David van der Spoel <spoel at xray.bmc.uu.se> wrote:
>>
>> servaas michielssens wrote:
>>> I tried to run a gromacs simulation (gromacs 3.3.1, MD, 18000 atoms) on 
>>> 2 systems:
>>>  
>>> Intel(R) Pentium(R) CPU 2.40GHz with 100Mbit network
>>> and
>>> AMD Opteron(tm) Processor 250 with 1Gbit network
>>> On both systems I had a crash when I tried to run with more than 5
>>> processors. With 1 to 5 processors there was no problem.
>>>  
>> more details please.
>>>  
> 
> I ran the same simulation on 1, 2, 3, 4, and 5 processors without any
> problem, so there is no problem with the system I'm using. But from
> the moment I tried to use 6 processors of the same cluster, the
> simulation crashes. This is the error:
> 
> Error on node 0, will try to stop all the nodes
> Halting parallel program mdrun on CPU 0 out of 6
> 
> 
> On AMD:
> [0] MPI Abort by user Aborting program !
> [0] Aborting program!
>     p4_error: latest msg from perror: No such file or directory
> p0_3303:  p4_error: : -1
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> p0_3303: (1.088153) net_send: could not write to fd=4, errno = 32
> error while executing run nb 1
> 
> 
> On intel:
> p4_1781:  p4_error: Timeout in establishing connection to remote
> process: 0
> rm_l_4_1786: (318.577125) net_send: could not write to fd=5, errno = 32
> p4_1781: (318.580132) net_send: could not write to fd=5, errno = 32
> p0_26458: (319.239545) net_recv failed for fd = 8
> p0_26458:  p4_error: net_recv read, errno = : 104
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> p0_26458: (325.249810) net_send: could not write to fd=4, errno = 32
> error while executing run nb 1
> 
> 
> 
> hope this is the information you need,
> 
> greets,
> 
> servaas
> 
>>> kind regards,
>>>  
>>> servaas michielssens
>>>
>>>
>>
>> -- 
>> David.
>> ________________________________________________________________________
>> David van der Spoel, PhD, Assoc. Prof., Molecular Biophysics group,
>> Dept. of Cell and Molecular Biology, Uppsala University.
>> Husargatan 3, Box 596,  	75124 Uppsala, Sweden
>> phone:	46 18 471 4205		fax: 46 18 511 755
>> spoel at xray.bmc.uu.se	spoel at gromacs.org   http://folding.bmc.uu.se
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
> 
> 


