[gmx-users] mdrun crash when -np 8, not when -np 4

David van der Spoel spoel at xray.bmc.uu.se
Fri Aug 22 09:08:01 CEST 2003


On Fri, 2003-08-22 at 04:35, Malcolm Gillies wrote:
> I have an mdrun job which crashes when I attempt to run it over 8
> processors, but which appears to run fine with 4 processors. Any
> suggestions?
Could it be that you have no water left on processor 0 (which is where I
presume that the problem occurs)?

This is a known bug, which has not been resolved completely yet.
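
One rough way to test this hypothesis before rerunning is to estimate which processes would end up holding no water atoms at a given processor count. The sketch below is only an illustration and is not the GROMACS partitioning code. It assumes atoms are handed out as contiguous, near-equal blocks in topology order, which may not match what mdrun actually does. The function names, the water name "SOL" and the example system are all hypothetical.

```python
# Illustrative sketch only: this is NOT the GROMACS partitioning code.
# Assumption: atoms are dealt out to processes as contiguous, near-equal
# blocks in topology order; the real split used by mdrun may differ.

def atom_blocks(n_atoms, n_procs):
    """Return half-open (start, end) atom ranges, one per process."""
    base, extra = divmod(n_atoms, n_procs)
    blocks, start = [], 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        blocks.append((start, start + size))
        start += size
    return blocks

def ranks_without_water(molecules, n_procs, water_name="SOL"):
    """List the ranks whose atom block contains no water atoms.

    molecules: (name, n_molecules, atoms_per_molecule) tuples in
    topology order. water_name is an assumed molecule name.
    """
    # Collect the half-open atom ranges occupied by water.
    water_ranges, offset = [], 0
    for name, count, natoms in molecules:
        span = count * natoms
        if name == water_name:
            water_ranges.append((offset, offset + span))
        offset += span

    empty = []
    for rank, (lo, hi) in enumerate(atom_blocks(offset, n_procs)):
        # Two half-open ranges overlap when each starts before the other ends.
        if not any(lo < w_hi and w_lo < hi for w_lo, w_hi in water_ranges):
            empty.append(rank)
    return empty

# Hypothetical system: a 3000-atom solute followed by 7000 three-site waters.
system = [("Protein", 1, 3000), ("SOL", 7000, 3)]
for n in (4, 8):
    print(n, ranks_without_water(system, n))
```

For this made-up system the first block still reaches into the water at 4 processes, but at 8 processes it covers only the solute. That matches the pattern described here, where the job runs on 4 processors and fails on 8.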

> 
> I'm running Gromacs 3.1.4 on an Alpha system. The mdp file and log files
> are attached (I'm using PME).
> 
> The stack trace on crash:
> 
> prun: /opt/gromacs-3.1.4nf2/alphaev68-dec-osf5.1/bin/mdrun_mpi (pid 13747743) killed by signal 11 (SIGSEGV)
> prun: generating backtrace for /opt/gromacs-3.1.4nf2/alphaev68-dec-osf5.1/bin/mdrun_mpi /local/core/rms/291775/core.mdrun_mpi.sc89.0
> Welcome to the Ladebug Debugger Version 67 (built Mar 10 2002 for Compaq Tru64 UNIX)
> ------------------
> object file name: /opt/gromacs-3.1.4nf2/alphaev68-dec-osf5.1/bin/mdrun_mpi
> core file name: /local/core/rms/291775/core.mdrun_mpi.sc89.0
> Reading symbolic information ...done
> Core file produced from executable 'mdrun_mpi'
> Thread 8 terminated at PC 0x12010c120 by signal SEGV
> Stack trace for thread 8
> >0  0x12010c120 in angles(...) in /opt/gromacs-3.1.4nf2/alphaev68-dec-osf5.1/bin/mdrun_mpi
> #1  0x12010ab70 in calc_bonds(...) in /opt/gromacs-3.1.4nf2/alphaev68-dec-osf5.1/bin/mdrun_mpi
> #2  0x1200915d0 in force(...) in /opt/gromacs-3.1.4nf2/alphaev68-dec-osf5.1/bin/mdrun_mpi
> #3  0x120081240 in do_force(...) in /opt/gromacs-3.1.4nf2/alphaev68-dec-osf5.1/bin/mdrun_mpi
> #4  0x12007af54 in do_md(...) in /opt/gromacs-3.1.4nf2/alphaev68-dec-osf5.1/bin/mdrun_mpi
> #5  0x120079b9c in mdrunner(...) in /opt/gromacs-3.1.4nf2/alphaev68-dec-osf5.1/bin/mdrun_mpi
> #6  0x12007cad0 in main(...) in /opt/gromacs-3.1.4nf2/alphaev68-dec-osf5.1/bin/mdrun_mpi
> #7  0x12006aed8 in __start(...) in /opt/gromacs-3.1.4nf2/alphaev68-dec-osf5.1/bin/mdrun_mpi
> 
> Stack trace for thread 7
> #0  0x3ff801374e8 in __syscall(...) in /usr/shlib/libc.so
> #1  0x300010195c0 in elan3_syscall_lwp(ctx=Info: no allocation applies for symbol ctx at the current PC
> <no value>) "syscall_dunix.c":201
> #2  0x30001007ff8 in elan3_lwp(arg=0x140022800) "elanlib.c":85
> 
> prun: dumping elan exception state for /opt/gromacs-3.1.4nf2/alphaev68-dec-osf5.1/bin/mdrun_mpi /local/core/rms/291775/core.mdrun_mpi.sc89.0
> edb: found exception list at 4102bde0
> edb: exceptions from '/opt/gromacs-3.1.4nf2/alphaev68-dec-osf5.1/bin/mdrun_mpi'
> prun:
> 
> cheers,
> 
> Malcolm
> --
> Malcolm Gillies <Malcolm.B.Gillies at anu.edu.au>
> Postdoctoral Fellow, Computational Proteomics and Therapy Design Group,
> John Curtin School of Medical Research, Australian National University
-- 
Groeten, David.
________________________________________________________________________
Dr. David van der Spoel, 	Dept. of Cell & Mol. Biology
Husargatan 3, Box 596,  	75124 Uppsala, Sweden
phone:	46 18 471 4205		fax: 46 18 511 755
spoel at xray.bmc.uu.se	spoel at gromacs.org   http://xray.bmc.uu.se/~spoel
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++



