[Fwd: Re: [gmx-users] mdrun CVS version crashes instantly when run across nodes in parallel]
Erik Brandt
erikb at theophys.kth.se
Wed Jan 23 12:12:27 CET 2008
Hi Carsten and Berk,
Your fixes did the trick. Man, there is some great scaling on the CVS
version now! Thanks a lot for the help!
/ Erik
On Wed 2008-01-23 at 09:20 +0100, Berk Hess wrote:
>
>
> ----------------------------------------
> > Date: Tue, 22 Jan 2008 20:15:03 +0100
> > From: ckutzne at gwdg.de
> > To: gmx-users at gromacs.org
> > Subject: [Fwd: Re: [gmx-users] mdrun CVS version crashes instantly when run across nodes in parallel]
> >
> > Hi Erik,
> >
> > I ran a test with today's CVS version and I also ran into the
> > problem you described. It happens as soon as one uses more than one node
> > and at the same time more than one process per node.
> >
> > The problem seems to be in gmx_sumd: during the two-step summation, in a
> > call to MPI_Allreduce, the variable cr->nc.comm_inter happens to be a
> > NULL pointer, which clearly should not be the case.
> >
> > The inter-node communicator is freed in gmx_setup_nodecomm (network.c,
> > line 393) if an intra-node communicator is present - I do not understand
> > why the communicator is freed here.
> >
> > Maybe Berk can help us on that? If I comment out the MPI_Comm_free the
> > code runs happily - haven't checked the results, though.
> >
> > Carsten
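
For readers following the thread, here is a rough sketch of what a two-step summation across nodes generally looks like: partial sums are formed inside each node, the partial sums are combined across nodes by one leader rank per node, and the total is handed back to every rank. This is a plain C simulation without MPI, not the GROMACS code; the function name two_step_sum, the node-by-node rank layout, and the parameters nnodes and ppn are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated two-step sum over nnodes * ppn ranks (no MPI involved).
 * values[r] is the contribution of global rank r. Ranks are assumed to be
 * laid out node by node, so rank r sits on node r / ppn with intra-node
 * rank r % ppn. result[r] receives the global sum as seen by rank r. */
void two_step_sum(const double *values, double *result, int nnodes, int ppn)
{
    assert(values != NULL && result != NULL);
    assert(nnodes > 0 && ppn > 0);

    double total = 0.0;

    for (int node = 0; node < nnodes; node++) {
        /* Step 1: sum inside the node onto its leader (intra-node rank 0). */
        double node_sum = 0.0;
        for (int i = 0; i < ppn; i++) {
            node_sum += values[node * ppn + i];
        }
        /* Step 2: combine the per-node sums across the leaders only.
         * In a real MPI code this is where the inter-node communicator
         * is needed, so it must still be valid on the leaders. */
        total += node_sum;
    }

    /* Finally, each leader passes the total back to the ranks on its node. */
    for (int r = 0; r < nnodes * ppn; r++) {
        result[r] = total;
    }
}
```

With 2 nodes and 2 processes per node, which is the smallest layout that triggers the crash described above, every rank should end up with the same total.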
>
> Ah, there is a "typo" in the conditional: it should test rank_intra instead of comm_intra.
> I have committed the fix.
>
> Thanks,
>
> Berk.
>
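
The thread does not quote the conditional itself, so the following is only a guess at its shape, written as a plain C simulation without MPI. It contrasts a test on the presence of the intra-node communicator with a test on the intra-node rank. The struct node_comm_sim, its fields, and both function names are hypothetical and do not come from network.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one rank's node-communication state. */
typedef struct {
    bool has_comm_intra; /* an intra-node communicator exists */
    int  rank_intra;     /* rank within the node, 0 = node leader */
    bool has_comm_inter; /* inter-node communicator is still valid */
} node_comm_sim;

/* Assumed shape of the mistaken test: every rank that has an intra-node
 * communicator drops the inter-node one, node leaders included. */
void release_inter_buggy(node_comm_sim *nc)
{
    assert(nc != NULL);
    if (nc->has_comm_intra) {
        nc->has_comm_inter = false;
    }
}

/* Assumed shape of the corrected test: only non-leader ranks drop the
 * inter-node communicator, so leaders can still sum across nodes. */
void release_inter_fixed(node_comm_sim *nc)
{
    assert(nc != NULL);
    if (nc->rank_intra > 0) {
        nc->has_comm_inter = false;
    }
}
```

Under this reading, the mistaken test leaves the node leaders without an inter-node communicator, which would match the NULL cr->nc.comm_inter that Carsten saw in the MPI_Allreduce call.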
--
Erik Brandt <erikb at theophys.kth.se>
KTH