[gmx-users] More Athlon talk

Lynne E. Bilston l.bilston at unsw.edu.au
Thu Feb 20 23:32:53 CET 2003


>
>I've tested our dual Athlons rather carefully, and we have never, ever,
>had a problem with single-cpu jobs (but we use ECC memory).
>
>The CPU test-heat-program is also running great, so I think we can
>safely rule out the possibility of a bug in Gromacs, since Pentiums
>execute exactly the same code without any problems.
>
>That leaves the SMP stuff; perhaps we can try to debug it together :-)
>
>First, is anybody having RANDOM (i.e. not deterministic) problems when
>not running in parallel? In that case we should probably have a look at
>the linux kernel mailing list and see if people are aware of it, or if
>there is a bugfix. Justin's problem sounds like it's overheating or
>something, I'm rather thinking of minor quirks like getting slightly
>different trajectories when you run a simulation twice (not complete
>crashes).
>
>If/When those things work well we can start bugging the LAM authors or
>compare LAM with MPICH.
>
>I'm sitting at Stanford right now (AMD headquarters is 20 min away), so
>if we really find out a problem that is more or less reproducible I
>could try to get a technical contact with AMD.

I haven't had much time to test single-CPU jobs, so I can't answer this one 
other than to say I haven't had a problem with them. I also have ECC 
memory. However, I do still have occasional crashes in Gromacs, which show 
up as SETTLE errors in the logs. As far as I can tell from the simulation 
data, nothing odd happens beforehand. If I restart from just before the 
crash, the errors don't occur again. The crashes often come several 
nanoseconds into a run. If I look at the coordinate files from the step 
before the crash, parts of the protein are missing. I'm not sure if any of 
this is helpful or relevant.

I ran Erik's stress-test program without any problems. My temperatures are 
also much lower than Justin's, so perhaps he is primarily having heat problems.

-Lynne

______________________________________________________________________________
Lynne E. Bilston, PhD
Senior Scientist, Prince of Wales Medical Research Institute, and
Conjoint Associate Professor, Faculty of Medicine, University of New South 
Wales
POWMRI, Barker St, Randwick, NSW 2031 Australia
Tel: +61-2-9382-7924	Fax: +61-2-9382-2643
http://www.powmri.edu.au
______________________________________________________________________________



