[gmx-developers] MPI stall?

Michael Shirts mrshirts at gmail.com
Thu Dec 17 00:50:20 CET 2009


Thanks for the tips -- I'll investigate.

On Sun, Dec 13, 2009 at 6:00 AM,  <gmx-developers-request at gromacs.org> wrote:
> Today's Topics:
>
>   1. Re: MPI stall? (Roland Schulz)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sat, 12 Dec 2009 19:13:11 -0500
> From: Roland Schulz <roland at utk.edu>
> Subject: Re: [gmx-developers] MPI stall?
> To: Discussion list for GROMACS development
>        <gmx-developers at gromacs.org>
> Message-ID:
>        <c93c21390912121613k60ff943dif9faba70bfa465ba at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi,
>
> print_time writes to stderr, so if it really gets stuck in there I would think
> it has to do with wrong stderr redirection. Could you verify that it really
> is stuck on the head node by trying to step in the debugger? Also try
> changing where stderr is written to.
>
> Roland
>
> On Sun, Dec 6, 2009 at 10:03 AM, Michael Shirts <michael.shirts at virginia.edu> wrote:
>
>> Hi, all-
>>
>> I'm getting a weird MPI stall with the git master repository version.
>> I compiled with debugging on and double precision, running on an
>> 8-processor MacPro.
>> After running for 10 min or so parallelized 8 ways, it appears to
>> stall.  Attaching a debugger to the threads to see where it's stuck,
>> the backtrace on the head node was (removing arguments for clarity)
>>
>> #0  0x907fb29a in write$NOCANCEL$UNIX2003 ()
>> #1  0x907fb1f2 in _swrite ()
>> #2  0x907fb11f in __sflush ()
>> #3  0x907ffcfc in __swbuf ()
>> #4  0x90838e92 in fputc ()
>> #5  0x000c2dfd in print_time (out=0xa00c7690, runtime=0xbfffd5e0,
>> step=44600, ir=0x1017e00, cr=0x9004e0) at sim_util.c:164
>> #6  0x00019215 in do_md  at md.c:2316
>> #7  0x00013138 in mdrunner  at md.c:216
>> #9  0x0001b9cc in main (argc=14, argv=0xbffff3a0) at mdrun.c:518
>>
>> And for the other nodes:
>>
>> #0  0x907c536a in swtch_pri ()
>> #1  0x90832e65 in sched_yield ()
>> #2  0x00a05515 in mca_pml_ob1_send ()
>> #3  0x00710445 in MPI_Sendrecv ()
>> #4  0x00048fe4 in dd_sendrecv_rvec (dd=0x91dc00, ddimind=0,
>> direction=1, buf_s=0x1034c00, n_s=333, buf_r=0xd22f38, n_r=360) at
>> domdec_network.c:115
>> #5  0x00029c32 in dd_move_x (dd=0x91dc00, box=0x9260fc, x=0xd21000) at
>> domdec.c:657
>> #6  0x000c3f77 in do_force  at sim_util.c:521
>> #7  0x00017478 in do_md  at md.c:1794
>> #8  0x00013138 in mdrunner at md.c:687
>> #9  0x00011cbb in mdrunner_threads  at md.c:216
>> #10 0x0001b9cc in main (argc=14, argv=0x9184e0) at mdrun.c:518
>>
>> Any other observations of this?  Has this been seen on other MacPros?
>> With debugging on?
>>
>> Best,
>> ~~~~~~~~~~~~
>> Michael Shirts
>> Assistant Professor
>> Department of Chemical Engineering
>> University of Virginia
>> michael.shirts at virginia.edu
>> (434)-243-1821
>> --
>> gmx-developers mailing list
>> gmx-developers at gromacs.org
>> http://lists.gromacs.org/mailman/listinfo/gmx-developers
>> Please don't post (un)subscribe requests to the list. Use the
>> www interface or send it to gmx-developers-request at gromacs.org.
>>
>
>
>
> --
> ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
> 865-241-1537, ORNL PO BOX 2008 MS6309
> End of gmx-developers Digest, Vol 68, Issue 8
> *********************************************
>
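
Roland's remark that print_time writes to stderr fits the head-node
backtrace quoted above, which ends in write() underneath fputc(). One way
such a write can stall: if stderr is a pipe and whatever reads the other
end stops draining it, a blocking write() waits once the pipe buffer is
full. The sketch below is only an illustration of that mechanism, assuming
a POSIX system; it is not GROMACS code, and bytes_until_pipe_full is a
hypothetical helper name. It sets the write end non-blocking so the "full"
condition shows up as EAGAIN instead of an actual hang.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not part of GROMACS): fill a pipe that nobody reads
 * and return how many bytes fit before the kernel refuses more.
 * Returns -1 if the pipe could not be set up. */
long bytes_until_pipe_full(void)
{
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }

    /* Non-blocking write end: a full pipe reports EAGAIN instead of hanging. */
    int flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    long total = 0;
    char chunk[512] = {0};
    for (;;) {
        ssize_t n = write(fds[1], chunk, sizeof chunk);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == -1 && errno == EINTR) {
            continue; /* interrupted by a signal, retry */
        }
        break; /* typically EAGAIN: a blocking writer would wait at this point */
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

The returned capacity is system dependent. The point is only that it is
finite, so a process whose stderr is forwarded through a pipe can block in
write() if the reader falls behind or stops.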

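
Roland's second suggestion, changing where stderr is written to, can be
tried from the shell with an ordinary "2> file" redirection on the launch
command. For testing the same idea inside a program, a minimal sketch using
the standard C freopen() call follows. The helper name
redirect_stderr_to_file is an assumption made for illustration and does not
exist in GROMACS.

```c
#include <stdio.h>

/* Hypothetical helper (not part of GROMACS): send everything subsequently
 * written to stderr into the file at `path`.
 * Returns 0 on success, -1 if the file could not be opened. */
int redirect_stderr_to_file(const char *path)
{
    if (freopen(path, "w", stderr) == NULL) {
        return -1;
    }
    /* Keep stderr unbuffered so each character reaches the file at once. */
    setvbuf(stderr, NULL, _IONBF, 0);
    return 0;
}
```

If the stall disappears once stderr goes to a regular file, that would
point toward the original stderr destination rather than toward the MPI
communication itself.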

