[gmx-users] Re: parallel simulation crash on 6 processors
servaas michielssens
servaas.michielssens at student.kuleuven.be
Thu Nov 29 17:25:36 CET 2007
> Message: 5
> Date: Wed, 28 Nov 2007 14:39:29 +0100
> From: David van der Spoel <spoel at xray.bmc.uu.se>
> Subject: Re: [gmx-users] parallel simulation crash on 6 processors
> To: Discussion list for GROMACS users <gmx-users at gromacs.org>
> Message-ID: <474D6F91.3040203 at xray.bmc.uu.se>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> servaas michielssens wrote:
> > I tried to run a gromacs simulation (gromacs 3.3.1, MD, 18000 atoms) on
> > 2 systems:
> >
> > Intel(R) Pentium(R) CPU 2.40GHz with 100Mbit network
> > and
> > AMD Opteron(tm) Processor 250 with 1Gbit network
> > On both systems I had a crash when I tried to run with more than 5
> > processors. From 1-5 there was no problem.
> >
> more details please.
> >
I ran the same simulation on 1, 2, 3, 4 and 5 processors without any
problem, so I think there is no problem with the system that I'm using. But
from the moment I tried to use 6 processors of the same cluster the
simulation crashed. This is the error:
Error on node 0, will try to stop all the nodes
Halting parallel program mdrun on CPU 0 out of 6
On AMD:
[0] MPI Abort by user Aborting program !
[0] Aborting program!
p4_error: latest msg from perror: No such file or directory
p0_3303: p4_error: : -1
Killed by signal 2.
Killed by signal 2.
Killed by signal 2.
Killed by signal 2.
Killed by signal 2.
p0_3303: (1.088153) net_send: could not write to fd=4, errno = 32
error while executing run nb 1
On Intel:
p4_1781: p4_error: Timeout in establishing connection to remote process: 0
rm_l_4_1786: (318.577125) net_send: could not write to fd=5, errno = 32
p4_1781: (318.580132) net_send: could not write to fd=5, errno = 32
p0_26458: (319.239545) net_recv failed for fd = 8
p0_26458: p4_error: net_recv read, errno = : 104
Killed by signal 2.
Killed by signal 2.
Killed by signal 2.
Killed by signal 2.
Killed by signal 2.
p0_26458: (325.249810) net_send: could not write to fd=4, errno = 32
error while executing run nb 1
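For reading the logs above, it may help to translate the numeric errno and
signal values into names. The following is only a minimal sketch, assuming
a Linux machine with Python 3 (errno numbering is platform specific); the
helper names describe_errno and describe_signal are made up for this
illustration and are not part of GROMACS or MPICH.

```python
import errno
import os
import signal


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for a numeric errno on the current platform."""
    # errno.errorcode maps numbers to symbolic names for this platform only.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


def describe_signal(number: int) -> str:
    """Return the symbolic name of a signal number on the current platform."""
    return signal.Signals(number).name


if __name__ == "__main__":
    # 32 and 104 are the errno values printed by net_send / net_recv above.
    for code in (32, 104):
        print(code, describe_errno(code))
    # 2 is the signal number from the "Killed by signal 2." lines.
    print(2, describe_signal(2))
```

On Linux, errno 32 is EPIPE (broken pipe), errno 104 is ECONNRESET
(connection reset by peer), and signal 2 is SIGINT. All three usually mean
that the process on the other end of the connection had already gone away,
so these lines are probably a consequence of the crash and not its cause.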
hope this is the information you need,
greets,
servaas
> > kind regards,
> >
> > servaas michielssens
>
>
> --
> David.
> ________________________________________________________________________
> David van der Spoel, PhD, Assoc. Prof., Molecular Biophysics group,
> Dept. of Cell and Molecular Biology, Uppsala University.
> Husargatan 3, Box 596, 75124 Uppsala, Sweden
> phone: 46 18 471 4205 fax: 46 18 511 755
> spoel at xray.bmc.uu.se spoel at gromacs.org http://folding.bmc.uu.se
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>