[gmx-users] Hard lock up

David spoel at xray.bmc.uu.se
Fri Oct 8 21:56:23 CEST 2004

On Fri, 2004-10-08 at 21:37, Bill (William) Triest wrote:
> On Fri, 2004-10-08 at 15:17, David wrote:
> > On Fri, 2004-10-08 at 21:06, Bill (William) Triest wrote:
> > > I'm an undergrad student worker, and running gromacs under linux is
> > > locking up one of our systems.  Version 3.1 used to run fine, until 3.2
> > > was installed.  3.2 started locking up the system (and I mean LOCKING it
> > > up: you can't ssh into the box, you can't ctrl-c to kill it, etc.).  They
> > > tried reverting to 3.1, but it's still causing problems.  It only happens
> > > on large jobs, but we have a nearly identical box (running 3.1) that can
> > > run the jobs fine.  Since the lockups only happen while running gromacs,
> > > and the machine does see some other loads (vmware and custom written
> > > software), I think it's related to gromacs.  The box is currently running
> > > Red Hat 9, and is an SMP machine (and yes I did try the MPI version,
> > > and I did ensure that the installed version of LAM was the same major
> > > version).  I tried googling for the problem, so I'm just hoping for
> > > pointers as to where to start RTFMing.
> > Does this happen to be an Athlon box?
> > In that case you may want to upgrade to 3.2.1 in which a workaround for
> > a bug in the Athlon was introduced. On the other hand, the bug was in
> > 3.1 also.
> Yes, it's an Athlon box, but I double checked and we are attempting to run
> 3.2.1 (sorry about not listing the .1, I wasn't aware of it at the
> time).  The program runs fine on a single cpu athlon box w/ only a gig of
> memory, but it crashes on a dual processor athlon mp box w/ 2 gigs of
> memory.
How about bios settings? Maybe you need a bios upgrade? Or the MP
settings in your bios? Is your user running single or dual processor
> > 
> > Otherwise gromacs stresses the CPU really hard. Could it be heating
> > problems? Do you have temperature sensors on the chips? Could be a
> > broken fan or a rotten memory chip too...
> We did have a bad memory module last spring (it was still under warranty
> and got replaced) and the first thing I did when I heard the box started
> hard-locking up again was run memtest86 on it.  As for overheating,
> it runs fine EXCEPT when gromacs runs.  Since they run custom written
> apps that take over a week to run and that I know stress the CPU, I'm
> guessing it's not that.  (Though I'm going to double check, just in case)
There is a CPU stress test program on our website somewhere (can't find
it now). It runs quite a few degrees warmer than the AMD testing

David van der Spoel, PhD, Assoc. Prof., Molecular Biophysics group,
Dept. of Cell and Molecular Biology, Uppsala University.
Husargatan 3, Box 596,  	75124 Uppsala, Sweden
phone:	46 18 471 4205		fax: 46 18 511 755
spoel at xray.bmc.uu.se	spoel at gromacs.org   http://xray.bmc.uu.se/~spoel
