Subject: Re: Re: [gmx-users] Gromacs 4 bug?
patrick fuchs
patrick.fuchs at univ-paris-diderot.fr
Thu Jan 15 08:37:16 CET 2009
Hi all,
Berk and I finally found that there is a problem with lam-7.1.4 under
Fedora 9/Fedora 10. Initially I thought it affected only gromacs-4, but a
PhD student in my lab reported identical problems (hangs) with
gromacs-3.3, while under FC8 I had no problem at all with the same
hardware. So if you want to run gromacs-4 (or any version) under
FC9/FC10, the fix I tested and that works is to use openmpi as an
alternative to lam-7.1.4 (I only tested the latest version, openmpi-1.2.8).
I didn't test other versions of lam (7.0.?), but it seems that the
developers advise switching to openmpi. To the two other users
(Bernhard and Antoine) who reported identical problems to the mailing
list (see
http://www.gromacs.org/pipermail/gmx-users/2008-December/038594.html and
http://www.gromacs.org/pipermail/gmx-users/2008-December/038623.html):
could you please check whether it works on your hardware using openmpi?
Hope it helps,
Patrick
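
For anyone who wants to compare their own hang with the backtraces quoted
below, here is a minimal Python sketch that pulls the function names out of
a gdb "where" listing and reports which MPI calls a rank is blocked in. The
helper names (parse_frames, summarise) are only illustrative, and the
regular expression assumes the simple frame format shown in this thread, so
it may need adjusting for other gdb output.

```python
import re

# One gdb backtrace frame, e.g. "#4 0x000000000073ced0 in lam_send ()".
# Leading e-mail quote markers ("> > ") are tolerated so that text pasted
# from a mail archive can be fed in directly.
FRAME_RE = re.compile(r"^[>\s]*#\d+\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\w+)\s*\(")


def parse_frames(backtrace):
    """Return the function names of a gdb `where` listing, innermost first."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(match.group(1))
    return frames


def summarise(frames):
    """Return (MPI_* calls on the stack, first caller above the last of them)."""
    mpi_indices = [i for i, name in enumerate(frames) if name.startswith("MPI_")]
    mpi_calls = [frames[i] for i in mpi_indices]
    caller = None
    if mpi_indices and mpi_indices[-1] + 1 < len(frames):
        caller = frames[mpi_indices[-1] + 1]
    return mpi_calls, caller


# Abridged copies of the rank 0 and rank 1 backtraces quoted in this thread.
RANK0 = """\
#0 0x0000003b978cc087 in sched_yield () from /lib64/libc.so.6
#3 0x000000000074a1e0 in _mpi_req_advance ()
#4 0x000000000073ced0 in lam_send ()
#5 0x000000000075328e in MPI_Send ()
#6 0x000000000074d7ec in MPI_Sendrecv ()
#7 0x00000000004aebfd in gmx_sum_qgrid_dd ()
#8 0x00000000004b40bb in gmx_pme_do ()
"""

RANK1 = """\
> > #0 0x0000003b978cc087 in sched_yield () from /lib64/libc.so.6
> > #3 0x000000000074a1e0 in _mpi_req_advance ()
> > #4 0x000000000073ea90 in MPI_Wait ()
> > #5 0x000000000074d800 in MPI_Sendrecv ()
> > #6 0x00000000004aed44 in gmx_sum_qgrid_dd ()
> > #7 0x00000000004b40bb in gmx_pme_do ()
"""

if __name__ == "__main__":
    for rank, text in enumerate((RANK0, RANK1)):
        mpi_calls, caller = summarise(parse_frames(text))
        print("rank %d: MPI calls %s, called from %s" % (rank, mpi_calls, caller))
```

In the traces quoted below, all four ranks are inside MPI_Sendrecv called
from gmx_sum_qgrid_dd: ranks 0 and 2 in MPI_Send, ranks 1 and 3 in MPI_Wait.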
Berk Hess wrote:
> Hi,
>
> We have for now concluded that this is probably an issue related to
> lam7.1.4.
>
> There were a few other users with mdrun crashes/hangs.
> What is the status of your problems?
>
> Berk
>
>
> > Date: Tue, 13 Jan 2009 13:02:47 +0100
> > From: patrick.fuchs at univ-paris-diderot.fr
> > To: gmx-users at gromacs.org
> > Subject: Re: Subject: Re: Re: [gmx-users] Gromacs 4 bug?
> >
> > Hi Berk,
> > it hangs after approximately 45000 steps (the system is a simple DLPC
> > bilayer), and a cpt file was generated (but it was
> > generated [09:48] before the run started to hang [09:58]):
> > ---------
> > [fuchs at cumin 2]$ ls -ltrh
> > [snip]
> > -rw-r--r-- 1 fuchs dsimb 384K janv. 13 09:33 traj.trr
> > -rw-r--r-- 1 fuchs dsimb 385K janv. 13 09:48 state.cpt
> > -rw-r--r-- 1 fuchs dsimb 66K janv. 13 09:57 md.log
> > -rw-r--r-- 1 fuchs dsimb 5,4M janv. 13 09:58 traj.xtc
> > -rw-r--r-- 1 fuchs dsimb 92K janv. 13 09:58 ener.edr
> > [fuchs at cumin 2]$ date
> > Tue Jan 13 10:16:22 CET 2009
> > ---------
> > The version of MPI is: LAM 7.1.4/MPI 2 C++/ROMIO - Indiana University.
> > So shall I send you the tpr and cpt files off list?
> > Ciao,
> >
> > Patrick
> >
> > Berk Hess wrote:
> > > Hi,
> > >
> > > This is strange.
> > > You run on 4 nodes and all processes hang at the same MPI call.
> > > I see no reason why they should hang if they are all at the correct call.
> > >
> > > After how many steps does this happen?
> > > If it is not much I can try to see if it also hangs on our system.
> > > Otherwise, could you try to generate a checkpoint file with
> > > which it hangs quickly?
> > >
> > > What version of MPI are you using?
> > >
> > > Berk
> > >
> > >
> > > > Date: Tue, 13 Jan 2009 10:53:25 +0100
> > > > From: patrick.fuchs at univ-paris-diderot.fr
> > > > To: gmx-users at gromacs.org
> > > > Subject: Re: Subject: Re: Re: [gmx-users] Gromacs 4 bug?
> > > >
> > > > Hi Berk,
> > > > I did a test on gromacs-4.0.2 under Fedora 10 (with fftw-3.0.1 and
> > > > lam-7.1.4), using a slightly upgraded version of gcc compared to my
> > > > previous post (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7)) on the same
> > > > hardware, but it still hangs (so both FC9 and FC10 give the same problem,
> > > > while FC8 does not). I could finally test mdrun_mpi in the debugger, and
> > > > here are the results of my tests. You were right, it seems that mdrun
> > > > hangs at an MPI call. Here are the outputs of each xterm:
> > > >
> > > > XTERM1
> > > > ===================================================================
> > > > GNU gdb Fedora (6.8-29.fc10)
> > > > Copyright (C) 2008 Free Software Foundation, Inc.
> > > > License GPLv3+: GNU GPL version 3 or later
> > > > <http://gnu.org/licenses/gpl.html>
> > > > This is free software: you are free to change and redistribute it.
> > > > There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> > > > and "show warranty" for details.
> > > > This GDB was configured as "x86_64-redhat-linux-gnu"...
> > > > (gdb) run
> > > > Starting program: /usr/local/gromacs-4.0.2/bin/mdrun_mpi
> > > > [Thread debugging using libthread_db enabled]
> > > > [New Thread 0x12df30 (LWP 8285)]
> > > > NNODES=4, MYRANK=0, HOSTNAME=cumin.dsimb.inserm.fr
> > > > NODEID=0 argc=1
> > > > :-) G R O M A C S (-:
> > > >
> > > > Giant Rising Ordinary Mutants for A Clerical Setup
> > > >
> > > > :-) VERSION 4.0.2 (-:
> > > >
> > > > [snip]
> > > >
> > > > starting mdrun 'Pure DLPC bilayer with 128 lipids and 3655 SPC water'
> > > > 5000000 steps, 10000.0 ps.
> > > > ^C
> > > > Program received signal SIGINT, Interrupt.
> > > > 0x0000003b978cc087 in sched_yield () from /lib64/libc.so.6
> > > > Missing separate debuginfos, use: debuginfo-install
> > > > e2fsprogs-libs-1.41.3-2.fc10.x86_64 glibc-2.9-3.x86_64
> > > > libICE-1.0.4-4.fc10.x86_64 libSM-1.1.0-2.fc10.x86_64
> > > > libX11-1.1.4-6.fc10.x86_64 libXau-1.0.4-1.fc10.x86_64
> > > > libXdmcp-1.0.2-6.fc10.x86_64 libxcb-1.1.91-5.fc10.x86_64
> > > > (gdb) where
> > > > #0 0x0000003b978cc087 in sched_yield () from /lib64/libc.so.6
> > > > #1 0x0000000000770c83 in lam_ssi_rpi_usysv_proc_read_env ()
> > > > #2 0x0000000000784a39 in lam_ssi_rpi_usysv_advance_common ()
> > > > #3 0x000000000074a1e0 in _mpi_req_advance ()
> > > > #4 0x000000000073ced0 in lam_send ()
> > > > #5 0x000000000075328e in MPI_Send ()
> > > > #6 0x000000000074d7ec in MPI_Sendrecv ()
> > > > #7 0x00000000004aebfd in gmx_sum_qgrid_dd ()
> > > > #8 0x00000000004b40bb in gmx_pme_do ()
> > > > #9 0x0000000000479a58 in do_force_lowlevel ()
> > > > #10 0x00000000004d1d32 in do_force ()
> > > > #11 0x00000000004214d2 in do_md ()
> > > > #12 0x000000000041bea0 in mdrunner ()
> > > > #13 0x0000000000422b94 in main ()
> > > > (gdb)
> > > > ===================================================================
> > > >
> > > >
> > > > XTERM2
> > > > ===================================================================
> > > > GNU gdb Fedora (6.8-29.fc10)
> > > > Copyright (C) 2008 Free Software Foundation, Inc.
> > > > License GPLv3+: GNU GPL version 3 or later
> > > > <http://gnu.org/licenses/gpl.html>
> > > > This is free software: you are free to change and redistribute it.
> > > > There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> > > > and "show warranty" for details.
> > > > This GDB was configured as "x86_64-redhat-linux-gnu"...
> > > > (gdb) run
> > > > Starting program: /usr/local/gromacs-4.0.2/bin/mdrun_mpi
> > > > [Thread debugging using libthread_db enabled]
> > > > [New Thread 0x12df30 (LWP 8294)]
> > > > NNODES=4, MYRANK=1, HOSTNAME=cumin.dsimb.inserm.fr
> > > > NODEID=1 argc=1
> > > > ^C
> > > > Program received signal SIGINT, Interrupt.
> > > > 0x0000003b978cc087 in sched_yield () from /lib64/libc.so.6
> > > > Missing separate debuginfos, use: debuginfo-install
> > > > e2fsprogs-libs-1.41.3-2.fc10.x86_64 glibc-2.9-3.x86_64
> > > > libICE-1.0.4-4.fc10.x86_64 libSM-1.1.0-2.fc10.x86_64
> > > > libX11-1.1.4-6.fc10.x86_64 libXau-1.0.4-1.fc10.x86_64
> > > > libXdmcp-1.0.2-6.fc10.x86_64 libxcb-1.1.91-5.fc10.x86_64
> > > > (gdb) where
> > > > #0 0x0000003b978cc087 in sched_yield () from /lib64/libc.so.6
> > > > #1 0x0000000000770c83 in lam_ssi_rpi_usysv_proc_read_env ()
> > > > #2 0x0000000000784a39 in lam_ssi_rpi_usysv_advance_common ()
> > > > #3 0x000000000074a1e0 in _mpi_req_advance ()
> > > > #4 0x000000000073ea90 in MPI_Wait ()
> > > > #5 0x000000000074d800 in MPI_Sendrecv ()
> > > > #6 0x00000000004aed44 in gmx_sum_qgrid_dd ()
> > > > #7 0x00000000004b40bb in gmx_pme_do ()
> > > > #8 0x0000000000479a58 in do_force_lowlevel ()
> > > > #9 0x00000000004d1d32 in do_force ()
> > > > #10 0x00000000004214d2 in do_md ()
> > > > #11 0x000000000041bea0 in mdrunner ()
> > > > #12 0x0000000000422b94 in main ()
> > > > (gdb)
> > > > ===================================================================
> > > >
> > > >
> > > > XTERM3
> > > > ===================================================================
> > > > GNU gdb Fedora (6.8-29.fc10)
> > > > Copyright (C) 2008 Free Software Foundation, Inc.
> > > > License GPLv3+: GNU GPL version 3 or later
> > > > <http://gnu.org/licenses/gpl.html>
> > > > This is free software: you are free to change and redistribute it.
> > > > There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> > > > and "show warranty" for details.
> > > > This GDB was configured as "x86_64-redhat-linux-gnu"...
> > > > (gdb) run
> > > > Starting program: /usr/local/gromacs-4.0.2/bin/mdrun_mpi
> > > > [Thread debugging using libthread_db enabled]
> > > > [New Thread 0x12df30 (LWP 8276)]
> > > > NNODES=4, MYRANK=2, HOSTNAME=cumin.dsimb.inserm.fr
> > > > NODEID=2 argc=1
> > > > ^C
> > > > Program received signal SIGINT, Interrupt.
> > > > 0x0000000000770c70 in lam_ssi_rpi_usysv_proc_read_env ()
> > > > Missing separate debuginfos, use: debuginfo-install
> > > > e2fsprogs-libs-1.41.3-2.fc10.x86_64 glibc-2.9-3.x86_64
> > > > libICE-1.0.4-4.fc10.x86_64 libSM-1.1.0-2.fc10.x86_64
> > > > libX11-1.1.4-6.fc10.x86_64 libXau-1.0.4-1.fc10.x86_64
> > > > libXdmcp-1.0.2-6.fc10.x86_64 libxcb-1.1.91-5.fc10.x86_64
> > > > (gdb) where
> > > > #0 0x0000000000770c70 in lam_ssi_rpi_usysv_proc_read_env ()
> > > > #1 0x0000000000784a39 in lam_ssi_rpi_usysv_advance_common ()
> > > > #2 0x000000000074a1e0 in _mpi_req_advance ()
> > > > #3 0x000000000073ced0 in lam_send ()
> > > > #4 0x000000000075328e in MPI_Send ()
> > > > #5 0x000000000074d7ec in MPI_Sendrecv ()
> > > > #6 0x00000000004aed44 in gmx_sum_qgrid_dd ()
> > > > #7 0x00000000004b40bb in gmx_pme_do ()
> > > > #8 0x0000000000479a58 in do_force_lowlevel ()
> > > > #9 0x00000000004d1d32 in do_force ()
> > > > #10 0x00000000004214d2 in do_md ()
> > > > #11 0x000000000041bea0 in mdrunner ()
> > > > #12 0x0000000000422b94 in main ()
> > > > (gdb)
> > > > ===================================================================
> > > >
> > > >
> > > > XTERM4
> > > > ===================================================================
> > > > GNU gdb Fedora (6.8-29.fc10)
> > > > Copyright (C) 2008 Free Software Foundation, Inc.
> > > > License GPLv3+: GNU GPL version 3 or later
> > > > <http://gnu.org/licenses/gpl.html>
> > > > This is free software: you are free to change and redistribute it.
> > > > There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> > > > and "show warranty" for details.
> > > > This GDB was configured as "x86_64-redhat-linux-gnu"...
> > > > (gdb) run
> > > > Starting program: /usr/local/gromacs-4.0.2/bin/mdrun_mpi
> > > > [Thread debugging using libthread_db enabled]
> > > > [New Thread 0x12df30 (LWP 8267)]
> > > > NNODES=4, MYRANK=3, HOSTNAME=cumin.dsimb.inserm.fr
> > > > NODEID=3 argc=1
> > > > ^C
> > > > Program received signal SIGINT, Interrupt.
> > > > 0x0000000000770c70 in lam_ssi_rpi_usysv_proc_read_env ()
> > > > Missing separate debuginfos, use: debuginfo-install
> > > > e2fsprogs-libs-1.41.3-2.fc10.x86_64 glibc-2.9-3.x86_64
> > > > libICE-1.0.4-4.fc10.x86_64 libSM-1.1.0-2.fc10.x86_64
> > > > libX11-1.1.4-6.fc10.x86_64 libXau-1.0.4-1.fc10.x86_64
> > > > libXdmcp-1.0.2-6.fc10.x86_64 libxcb-1.1.91-5.fc10.x86_64
> > > > (gdb) where
> > > > #0 0x0000000000770c70 in lam_ssi_rpi_usysv_proc_read_env ()
> > > > #1 0x0000000000784a39 in lam_ssi_rpi_usysv_advance_common ()
> > > > #2 0x000000000074a1e0 in _mpi_req_advance ()
> > > > #3 0x000000000073ea90 in MPI_Wait ()
> > > > #4 0x000000000074d800 in MPI_Sendrecv ()
> > > > #5 0x00000000004aebfd in gmx_sum_qgrid_dd ()
> > > > #6 0x00000000004b40bb in gmx_pme_do ()
> > > > #7 0x0000000000479a58 in do_force_lowlevel ()
> > > > #8 0x00000000004d1d32 in do_force ()
> > > > #9 0x00000000004214d2 in do_md ()
> > > > #10 0x000000000041bea0 in mdrunner ()
> > > > #11 0x0000000000422b94 in main ()
> > > > (gdb)
> > > > ===================================================================
> > > >
> > > >
> > > > Cheers,
> > > >
> > > > Patrick
> > > >
> > >
> > >
> > >
> > >
> > > _______________________________________________
> > > gmx-users mailing list gmx-users at gromacs.org
> > > http://www.gromacs.org/mailman/listinfo/gmx-users
> > > Please search the archive at http://www.gromacs.org/search before posting!
> > > Please don't post (un)subscribe requests to the list. Use the
> > > www interface or send it to gmx-users-request at gromacs.org.
> > > Can't post? Read http://www.gromacs.org/mailing_lists/users.php
> >
> > --
> > _________________________________________________________________
> > !!!! new E-mail address: patrick.fuchs at univ-paris-diderot.fr !!!!
> > !!!! new postal address !!!
> > Patrick FUCHS
> > Equipe de Bioinformatique Genomique et Moleculaire
> > INTS, INSERM UMR-S726, Université Paris Diderot,
> > 6 rue Alexandre Cabanel, 75015 Paris
> > Tel : +33 (0)1-44-49-30-57 - Fax : +33 (0)1-47-34-74-31
> > Web Site: http://www.dsimb.inserm.fr/~fuchs
>
--
_________________________________________________________________
!!!! new E-mail address: patrick.fuchs at univ-paris-diderot.fr !!!!
!!!! new postal address !!!
Patrick FUCHS
Equipe de Bioinformatique Genomique et Moleculaire
INTS, INSERM UMR-S726, Université Paris Diderot,
6 rue Alexandre Cabanel, 75015 Paris
Tel : +33 (0)1-44-49-30-57 - Fax : +33 (0)1-47-34-74-31
Web Site: http://www.dsimb.inserm.fr/~fuchs