[gmx-users] More Athlon talk

Alan Wilter Sousa da Silva alan at biof.ufrj.br
Fri Feb 21 14:42:11 CET 2003


Hi Lynne,

	I have observed this kind of problem with settle, and in my
experience it usually happens when I have many jobs running on the
cluster, using almost all of the available memory (swap included).

	I also suspect, since GMX allocates memory dynamically, that these
crashes happen when the OS (using SMP or mosix) has to move a job from one
processor to another (within the same box or across the cluster).

	When I run jobs compiled with F77 (fixed memory allocation, like
Charmm), I have never seen a crash.  If there is not enough memory
available, such programs simply do not start; with GMX under the same
conditions, I will certainly get Segmentation Faults.

	I wonder whether anyone else has seen this 'settle problem' on a
system other than x86.

	Perhaps it's about time I got GMX running on our SGI Origin2000
to see what happens.

	I have never done a systematic test, running as many GMX jobs as
possible at once, to reproduce the 'settle problem' with certainty, but I
would suggest it as a stress test.
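
	A minimal sketch of such a stress test, in Python, assuming a POSIX
system.  The helper names (launch_jobs, crashed_jobs) and the placeholder
command are illustrative only and are not part of GMX; substitute the real
simulation command line.  It starts several copies of a command at once and
reports which ones were killed by a signal, such as a segmentation fault.

```python
import signal
import subprocess
import sys


def launch_jobs(command, n_jobs):
    """Start n_jobs copies of command at once and return their exit codes."""
    procs = [subprocess.Popen(command) for _ in range(n_jobs)]
    return [p.wait() for p in procs]


def crashed_jobs(exit_codes):
    """Return (job index, signal name) for every job killed by a signal.

    On POSIX, subprocess reports death by signal N as exit code -N.
    """
    crashes = []
    for index, code in enumerate(exit_codes):
        if code < 0:
            try:
                name = signal.Signals(-code).name
            except ValueError:
                name = f"signal {-code}"
            crashes.append((index, name))
    return crashes


if __name__ == "__main__":
    # Placeholder command (an assumption): replace with the real job.
    placeholder = [sys.executable, "-c", "pass"]
    codes = launch_jobs(placeholder, n_jobs=4)
    print(codes, crashed_jobs(codes))
```

	Increasing n_jobs until memory and swap are nearly exhausted would
show whether the crashes correlate with memory pressure.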

Cheers,

On Fri, 21 Feb 2003, Lynne E. Bilston wrote:

> I haven't had much time to test single cpu jobs, so I can't answer this one
> other than to say I haven't had a problem with them. I also have ECC
> memory. However, I do still have occasional crashes in gromacs which show
> up as settle errors in the logs. There are no odd things happening as far
> as I can tell from the simulation data. If I restart from just before the
> crash, they don't occur again. This is often several nanoseconds into a
> run. If I look at the coordinate files from the step before the crash, some
> of the bits of the protein are missing. I'm not sure if any of this is
> helpful or relevant.

-----------------------
Alan Wilter S. da Silva
-----------------------
 Laboratório de Física Biológica
  Instituto de Biofísica Carlos Chagas Filho
   Universidade do Brasil/UFRJ
    Rio de Janeiro, Brasil




