[gmx-developers] threads are now ON by default

Alexey Shvetsov alexxyum at gmail.com
Sat Feb 13 14:08:40 CET 2010


Hi,
Yes, it's in the same place. It looks like there is some kind of race condition.
Input files are available via ftp://alexxy.gentoo.ru/pub/gmx/

speptide.tar.bz2 contains a finished run with md (leap-frog)
speptide-vv.tar.bz2 contains a crashed run with md-vv
I get a deadlock near step 14k of the md run without restraints
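
For illustration only (this is not GROMACS code, and not a diagnosis of the
bug): the backtrace quoted below shows a thread sitting in sched_yield inside
tMPI_Gather, i.e. spin-waiting in a collective call. One generic way such a
wait can hang is when one rank never enters the collective. Below is a minimal
Python sketch of that situation. The names gather_with_timeout, skipping_rank
and timeout are hypothetical, and the timeout exists only so that the sketch
terminates instead of hanging.

```python
import threading
import time


def gather_with_timeout(n_ranks, skipping_rank=None, timeout=0.5):
    """Model a gather to rank 0 in which the root spin-waits for all ranks.

    skipping_rank (hypothetical, should be a non-root rank) never enters
    the collective. Returns True if the root saw every contribution and
    False if it gave up after `timeout` seconds.
    """
    arrived = [False] * n_ranks
    result = {"ok": False}

    def rank(i):
        if i == skipping_rank:
            return  # this rank decided the collective is not needed
        arrived[i] = True  # deposit this rank's contribution
        if i == 0:
            deadline = time.monotonic() + timeout
            # Root: yield and re-check until every rank has arrived.
            while not all(arrived):
                if time.monotonic() > deadline:
                    return  # a real spin-wait would loop here forever
                time.sleep(0)  # give up the CPU, like a yield-wait
            result["ok"] = True

    threads = [threading.Thread(target=rank, args=(i,)) for i in range(n_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["ok"]


if __name__ == "__main__":
    print(gather_with_timeout(4))                   # all ranks take part
    print(gather_with_timeout(4, skipping_rank=2))  # root never completes
```

Without the deadline, the second call would never return, which is what a
hang in a yield-wait loop looks like from a debugger.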

On Saturday 13 February 2010 00:23:51 Michael Shirts wrote:
> Hi, Alexey-
> 
> Thanks for tracking this down.  md-vv is still getting the kinks
> worked out.  Is this in the same place as the bug you were seeing a
> couple of days ago, or a different place?
> 
> Sander, perhaps you could check for non-threadsafeness (it looks like
> it's in write_traj), since you're a bit more familiar -- if you can't
> see it quickly, please let me know, and I'll try to track it down!
> 
> Best,
> Michael
> 
> > Date: Fri, 12 Feb 2010 23:21:03 +0300
> > From: Alexey Shvetsov <alexxyum at gmail.com>
> > Subject: Re: [gmx-developers] threads are now ON by default
> > To: Discussion list for GROMACS development
> >        <gmx-developers at gromacs.org>
> > Message-ID: <201002122321.15014.alexxyum at gmail.com>
> > Content-Type: text/plain; charset="utf-8"
> > 
> > > On Friday 12 February 2010 19:32:46 Sander Pronk wrote:
> >> Now that the last issues have been resolved with the threading code,
> >> thread-based parallelization has been turned on by default. To disable
> >> all the threading code, use
> >> 
> >> --disable-threads.
> >> 
> >> in configure, or turn the option GMX_THREADS off with ccmake.
> >> 
> >> Running mdrun with just one thread (the default) is almost exactly the
> >> same as running it without threading code: the only thing that's
> >> different is that the few remaining global variables are protected by
> >> mutexes.
> >> 
> >> Performance-wise mdrun runs very slightly faster with threads than with
> > >> OpenMPI when Nthreads<=Ncores (and there are no other processes on the
> >> computer). When Nthreads>Ncores (or other processes are running), the
> >> thread code is much faster than OpenMPI, but the total runtime is still
> >> smaller than when Nthreads==Ncores.
> >> 
> > >> If there are any problems in getting things running, or with performance,
> >> I'd very much like to hear about it.
> >> 
> >> Sander
> > 
> > > Good news.
> > > But it looks like I get a deadlock when running GROMACS with 4 threads
> > > (Intel Core i5 750) with the md-vv integrator. I can share all input
> > > files. The same system with almost the same input parameters except the
> > > integrator (md) runs fine with 4 threads. Backtrace:
> > > 0x00002b9c9fd9dd87 in sched_yield () at ../sysdeps/unix/syscall-template.S:82
> > > 82      ../sysdeps/unix/syscall-template.S: No such file or directory.
> > >         in ../sysdeps/unix/syscall-template.S
> > (gdb) bt
> > > #0  0x00002b9c9fd9dd87 in sched_yield () at ../sysdeps/unix/syscall-template.S:82
> > #1  0x00002b9c9f5aeb5d in tMPI_Gather (sendbuf=0x4, sendcount=<value
> > optimized out>, sendtype=<value optimized out>, recvbuf=<value optimized
> > out>, recvcount=<value optimized out>, recvtype=<value optimized out>,
> > root=0, comm=0xee2160)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/gmxlib/thread_mpi/gather.c:9
> > 8 #2  0x00002b9c9f17885e in dd_gather (dd=<value optimized out>,
> > nbytes=1607464840, src=0x1290748, dest=0xffffffffffffffff)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/mdlib/domdec_network.c:233
> > #3  0x00002b9c9f16f841 in dd_collect_cg (dd=<value optimized out>,
> > state_local=0x12d3600, lv=<value optimized out>, v=0x2b9cb4001010)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/mdlib/domdec.c:1263
> > #4  dd_collect_vec (dd=<value optimized out>, state_local=0x12d3600,
> > lv=<value optimized out>, v=0x2b9cb4001010)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/mdlib/domdec.c:1420
> > #5  0x00002b9c9f170df9 in dd_collect_state (dd=0x1287df0,
> > state_local=0x12d3600, state=0x2b9ca83782b0)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/mdlib/domdec.c:1474
> > #6  0x00002b9c9f1d0b7b in write_traj (fplog=<value optimized out>,
> > cr=0x2b9ca8377ad0, fp_trn=<value optimized out>, bX=-1, bV=8, bF=0,
> > fp_xtc=-1, bXTC=0,
> >    xtc_prec=1000, fn_cpt=0xee2490 "speptide.md.cpt", bCPT=1,
> > top_global=0x2b9ca83780b0, eIntegrator=10, simulation_part=1, step=14690,
> >    t=29.379999999999999, state_local=0x12d3600,
> > state_global=0x2b9ca83782b0, f_local=0x12d8b00, f_global=0x2b9cb4c02010,
> > n_xtc=0x7fff5fd02398, x_xtc=0x7fff5fd02320) at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/mdlib/stat.c:473
> > #7  0x0000000000414318 in do_md (fplog=<value optimized out>, cr=<value
> > optimized out>, nfile=<value optimized out>, fnm=<value optimized out>,
> >    oenv=<value optimized out>, bVerbose=<value optimized out>,
> > bCompact=1, nstglobalcomm=1, vsite=0x0, constr=0x129b5d0, stepout=100,
> > ir=0x2b9ca8377b40, top_global=0x2b9ca83780b0, fcd=0x1287c60,
> > state_global=0x2b9ca83782b0, mdatoms=0x1296de0, nrnb=0x1290b90,
> > wcycle=0x1290800, ed=0x0, fr=0x1291200, repl_ex_nst=0, repl_ex_seed=-1,
> > cpt_period=<value optimized out>, max_hours=<value optimized out>,
> > Flags=7168, runtime=0x7fff5fd02710) at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/kernel/md.c:1943
> > #8  0x000000000040f155 in mdrunner (fplog=0xee1c50, cr=0x2b9ca8377ad0,
> > nfile=<value optimized out>, fnm=<value optimized out>, oenv=<value
> > optimized out>,
> >    bVerbose=<value optimized out>, bCompact=1, nstglobalcomm=-1,
> > ddxyz=0x7fff5fd02844, dd_node_order=1, rdd=<value optimized out>,
> >    rconstr=<value optimized out>, dddlb_opt=0x41f64d "auto",
> > dlb_scale=<value optimized out>, ddcsx=0x0, ddcsy=0x0, ddcsz=0x0,
> > nstepout=100, resetstep=-1, nmultisim=0, repl_ex_nst=0, repl_ex_seed=-1,
> > pforce=<value optimized out>, cpt_period=<value optimized out>,
> > max_hours=<value optimized out>, Flags=7168) at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/kernel/runner.c:669
> > #9  0x0000000000410168 in mdrunner_start_fn (arg=<value optimized out>)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/kernel/runner.c:170
> > #10 0x00002b9c9f5af8d4 in tMPI_Thread_starter (arg=<value optimized out>)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/gmxlib/thread_mpi/tmpi_init.
> > c:360 #11 0x00002b9c9f5afc04 in tMPI_Init_fn (N=19466056,
> > start_function=<value optimized out>, arg=<value optimized out>)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/gmxlib/thread_mpi/tmpi_init.
> > c:472 #12 0x000000000040ff19 in mdrunner_threads (nthreads=4,
> > fplog=<value optimized out>, cr=<value optimized out>, nfile=<value
> > optimized out>,
> >    fnm=<value optimized out>, oenv=<value optimized out>, bVerbose=1,
> > bCompact=1, nstglobalcomm=-1, ddxyz=0x7fff5fd04b10, dd_node_order=1,
> >    rdd=<value optimized out>, rconstr=<value optimized out>,
> > dddlb_opt=0x41f64d "auto", dlb_scale=<value optimized out>, ddcsx=0x0,
> > ddcsy=0x0, ddcsz=0x0,
> >    nstepout=100, resetstep=-1, nmultisim=0, repl_ex_nst=0,
> > repl_ex_seed=-1, pforce=<value optimized out>, cpt_period=<value
> > optimized out>,
> >    max_hours=<value optimized out>, Flags=7168) at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/kernel/runner.c:238
> > #13 0x0000000000419a1b in main (argc=6, argv=0x7fff5fd04ce8) at
> > /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/kernel/mdrun.c:519
> > Current language:  auto
> > The current source language is "auto; currently asm".
> > (gdb) up
> > #1  0x00002b9c9f5aeb5d in tMPI_Gather (sendbuf=0x4, sendcount=<value
> > optimized out>, sendtype=<value optimized out>, recvbuf=<value optimized
> > out>, recvcount=<value optimized out>, recvtype=<value optimized out>,
> > root=0, comm=0xee2160)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/gmxlib/thread_mpi/gather.c:9
> > 8 98                      TMPI_YIELD_WAIT(cur);
> > Current language:  auto
> > The current source language is "auto; currently c".
> > (gdb) up
> > #2  0x00002b9c9f17885e in dd_gather (dd=<value optimized out>,
> > nbytes=1607464840, src=0x1290748, dest=0xffffffffffffffff)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/mdlib/domdec_network.c:233
> > 233         MPI_Gather(src,nbytes,MPI_BYTE,
> > (gdb) up
> > #3  0x00002b9c9f16f841 in dd_collect_cg (dd=<value optimized out>,
> > state_local=0x12d3600, lv=<value optimized out>, v=0x2b9cb4001010)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/mdlib/domdec.c:1263
> > 1263        dd_gather(dd,2*sizeof(int),buf2,ibuf);
> > (gdb) up
> > #4  dd_collect_vec (dd=<value optimized out>, state_local=0x12d3600,
> > lv=<value optimized out>, v=0x2b9cb4001010)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/mdlib/domdec.c:1420
> > 1420        dd_collect_cg(dd,state_local);
> > (gdb) up
> > #5  0x00002b9c9f170df9 in dd_collect_state (dd=0x1287df0,
> > state_local=0x12d3600, state=0x2b9ca83782b0)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/mdlib/domdec.c:1474
> > 1474                  
> >  dd_collect_vec(dd,state_local,state_local->x,state-
> > 
> >>x);
> >>
> > (gdb) up
> > #6  0x00002b9c9f1d0b7b in write_traj (fplog=<value optimized out>,
> > cr=0x2b9ca8377ad0, fp_trn=<value optimized out>, bX=-1, bV=8, bF=0,
> > fp_xtc=-1, bXTC=0,
> >    xtc_prec=1000, fn_cpt=0xee2490 "speptide.md.cpt", bCPT=1,
> > top_global=0x2b9ca83780b0, eIntegrator=10, simulation_part=1, step=14690,
> >    t=29.379999999999999, state_local=0x12d3600,
> > state_global=0x2b9ca83782b0, f_local=0x12d8b00, f_global=0x2b9cb4c02010,
> > n_xtc=0x7fff5fd02398, x_xtc=0x7fff5fd02320) at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/mdlib/stat.c:473
> > 473                 dd_collect_state(cr->dd,state_local,state_global);
> > (gdb) up
> > #7  0x0000000000414318 in do_md (fplog=<value optimized out>, cr=<value
> > optimized out>, nfile=<value optimized out>, fnm=<value optimized out>,
> >    oenv=<value optimized out>, bVerbose=<value optimized out>,
> > bCompact=1, nstglobalcomm=1, vsite=0x0, constr=0x129b5d0, stepout=100,
> > ir=0x2b9ca8377b40, top_global=0x2b9ca83780b0, fcd=0x1287c60,
> > state_global=0x2b9ca83782b0, mdatoms=0x1296de0, nrnb=0x1290b90,
> > wcycle=0x1290800, ed=0x0, fr=0x1291200, repl_ex_nst=0, repl_ex_seed=-1,
> > cpt_period=<value optimized out>, max_hours=<value optimized out>,
> > Flags=7168, runtime=0x7fff5fd02710) at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/kernel/md.c:1943
> > 1943                write_traj(fplog,cr,fp_trn,bX,bV,bF,fp_xtc,bXTC,ir-
> > 
> >>xtcprec,
> >>
> > (gdb) up
> > #8  0x000000000040f155 in mdrunner (fplog=0xee1c50, cr=0x2b9ca8377ad0,
> > nfile=<value optimized out>, fnm=<value optimized out>, oenv=<value
> > optimized out>,
> >    bVerbose=<value optimized out>, bCompact=1, nstglobalcomm=-1,
> > ddxyz=0x7fff5fd02844, dd_node_order=1, rdd=<value optimized out>,
> >    rconstr=<value optimized out>, dddlb_opt=0x41f64d "auto",
> > dlb_scale=<value optimized out>, ddcsx=0x0, ddcsy=0x0, ddcsz=0x0,
> > nstepout=100, resetstep=-1, nmultisim=0, repl_ex_nst=0, repl_ex_seed=-1,
> > pforce=<value optimized out>, cpt_period=<value optimized out>,
> > max_hours=<value optimized out>, Flags=7168) at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/kernel/runner.c:669
> > 669             integrator[inputrec->eI].func(fplog,cr,nfile,fnm,
> > (gdb) up
> > #9  0x0000000000410168 in mdrunner_start_fn (arg=<value optimized out>)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/kernel/runner.c:170
> > 170         mda->ret=mdrunner(fplog, cr, mc.nfile, mc.fnm, mc.oenv,
> > mc.bVerbose,
> > (gdb) up
> > #10 0x00002b9c9f5af8d4 in tMPI_Thread_starter (arg=<value optimized out>)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/gmxlib/thread_mpi/tmpi_init.
> > c:360 360             th->start_fn(th->start_arg);
> > (gdb) up
> > #11 0x00002b9c9f5afc04 in tMPI_Init_fn (N=19466056, start_function=<value
> > optimized out>, arg=<value optimized out>)
> >    at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/gmxlib/thread_mpi/tmpi_init.
> > c:472 472             tMPI_Start_threads(N, 0, 0, start_function, arg);
> > (gdb) up
> > #12 0x000000000040ff19 in mdrunner_threads (nthreads=4, fplog=<value
> > optimized out>, cr=<value optimized out>, nfile=<value optimized out>,
> >    fnm=<value optimized out>, oenv=<value optimized out>, bVerbose=1,
> > bCompact=1, nstglobalcomm=-1, ddxyz=0x7fff5fd04b10, dd_node_order=1,
> >    rdd=<value optimized out>, rconstr=<value optimized out>,
> > dddlb_opt=0x41f64d "auto", dlb_scale=<value optimized out>, ddcsx=0x0,
> > ddcsy=0x0, ddcsz=0x0,
> >    nstepout=100, resetstep=-1, nmultisim=0, repl_ex_nst=0,
> > repl_ex_seed=-1, pforce=<value optimized out>, cpt_period=<value
> > optimized out>,
> >    max_hours=<value optimized out>, Flags=7168) at /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/kernel/runner.c:238
> > 238             tMPI_Init_fn(nthreads, mdrunner_start_fn, (void*)(&mda)
> > ); (gdb) up
> > #13 0x0000000000419a1b in main (argc=6, argv=0x7fff5fd04ce8) at
> > /var/tmp/portage/sci-
> > chemistry/gromacs-9999/work/gromacs-9999/src/kernel/mdrun.c:519
> > 519       rc = mdrunner_threads(nthreads,
> > (gdb) up
> > 
> > 
> > --
> > Best Regards,
> > Alexey 'Alexxy' Shvetsov
> > Petersburg Nuclear Physics Institute, Russia
> > Department of Molecular and Radiation Biophysics
> > Gentoo Team Ru
> > Gentoo Linux Dev
> > mailto:alexxyum at gmail.com
> > mailto:alexxy at gentoo.org
> > mailto:alexxy at omrb.pnpi.spb.ru

-- 
Best Regards,
Alexey 'Alexxy' Shvetsov
Petersburg Nuclear Physics Institute, Russia
Department of Molecular and Radiation Biophysics
Gentoo Team Ru
Gentoo Linux Dev
mailto:alexxyum at gmail.com
mailto:alexxy at gentoo.org
mailto:alexxy at omrb.pnpi.spb.ru