[gmx-users] Hard lock up
Bill (William) Triest
wtriest at chemistry.ohio-state.edu
Fri Oct 8 21:37:41 CEST 2004
On Fri, 2004-10-08 at 15:17, David wrote:
> On Fri, 2004-10-08 at 21:06, Bill (William) Triest wrote:
> > I'm an undergrad student worker, and running gromacs under linux is
> > locking up one of our systems. Version 3.1 used to run fine, until 3.2
> > was installed. 3.2 started locking up the system (and I mean LOCKING it
> > up: you can't ssh into the box, you can't ctrl-c to kill it, etc.). They
> > tried reverting to 3.1, but it's still causing problems. It only happens
> > on large jobs, but we have a nearly identical box (running 3.1) that can
> > run the jobs fine. Since the lockups only happen while running gromacs,
> > and the machine does see some other loads (vmware and custom written
> > software), I think it's related to gromacs. The box is currently running
> > Red Hat 9, and is an SMP machine (and yes, I did try the MPI version,
> > and I did ensure that the installed version of LAM was the same major
> > version). I tried googling for the problem without luck, so I'm just
> > hoping for pointers as to where to start RTFMing.
> Does this happen to be an Athlon box?
> In that case you may want to upgrade to 3.2.1 in which a workaround for
> a bug in the Athlon was introduced. On the other hand, the bug was in
> 3.1 also.
Yes, it's an Athlon box, but I double-checked and we are attempting to run
3.2.1 (sorry about not listing the .1; I wasn't aware of it at the
time). The program runs fine on a single-CPU Athlon box w/ only a gig of
memory, but it crashes on a dual-processor Athlon MP box w/ 2 gigs of
memory.
>
> Otherwise gromacs stresses the CPU really hard. Could it be heating
> problems? Do you have temperature sensors on the chips? Could be a
> broken fan or a rotten memory chip too...
We did have a bad memory module last spring (it was still under warranty
and got replaced), and the first thing I did when I heard the box had
started hard-locking again was run memtest86 on it. As for overheating,
it runs fine EXCEPT when gromacs runs. Since they run custom-written
apps that take over a week to run and that I know stress the CPU, I'm
guessing it's not that. (Though I'm going to double-check, just in case.)
>
> If you are "sure" the hardware is not to blame, give us some more info
> on the kind of jobs that crash the machine.
You know, I have no idea. I know where he's running gromacs from, and I
can see the log files, but that's about it. I'm an undergrad CS major
who works for computer support. If you can tell me what I should look
for I will happily get you the info.
Thanks for the quick reply, and sorry because I feel like I'm not being
helpful here.
Thanks,
William Triest
Student Worker - Linux