[gmx-users] Gromacs job locks up computer (reproducibly)

Justin MacCallum jlmaccal at ucalgary.ca
Thu Aug 28 22:25:01 CEST 2003


Hi,

I'm very interested to hear this. We have had nothing but problems with
our dual Athlon 2200 MPs. They typically freeze once every week or two
while running Gromacs. I'm not sure whether they freeze at a reproducible
time step, but they become completely unresponsive and require a
reboot. We've improved the cooling, added more fans, and even moved them
to another machine room. lm_sensors reports temperatures in the low to
mid 50s (Celsius) at full load. They all use registered DDR memory,
as they're supposed to. There also seems to be no pattern to which
machines fail: they freeze at roughly random intervals, and almost
every machine (out of 16) has done it.

In short, I've been getting very frustrated with these machines, but I'm
happy to hear that there may actually be a fix. Any idea when the
environment variable check will appear? I'm going to recompile with the
suggested hacks to the CPU checks and I'll report back later.
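
As I understand the suggested change, the idea is that SSE gets skipped
on the Athlon so the 3DNow code is used instead, and that an
environment variable will eventually control this. Below is a minimal
sketch of that selection logic, in Python only for brevity (the real
check lives in src/gmxlib/detectcpu.c). The variable name
SKETCH_DISABLE_SSE and the function choose_kernel are hypothetical,
made up for illustration, and limiting the override to AMD CPUs is my
assumption, not something the Gromacs source confirms.

```python
import os

# Hypothetical variable name, used only in this sketch. The real
# name had not been announced when this was written.
NO_SSE_ENV_VAR = "SKETCH_DISABLE_SSE"


def choose_kernel(vendor, flags, environ=None):
    """Return "sse", "3dnow" or "generic" for a CPU description.

    vendor  -- vendor_id string as shown in /proc/cpuinfo
    flags   -- set of CPU flag names, e.g. {"sse", "3dnow"}
    environ -- mapping of environment variables (defaults to os.environ)
    """
    if environ is None:
        environ = os.environ

    # Assumption: the override only applies to AMD CPUs, since the
    # lockups have only been reported on Athlons.
    is_amd = vendor == "AuthenticAMD"
    sse_disabled = is_amd and NO_SSE_ENV_VAR in environ

    if "sse" in flags and not sse_disabled:
        return "sse"
    if "3dnow" in flags:
        return "3dnow"
    return "generic"
```

With the variable unset, an Athlon that reports both flags would still
pick SSE. With it set, the same CPU falls back to 3DNow, and an Intel
CPU is unaffected.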

Justin

> If it's an AMD,
> 
> 1. Check if you can repeat it when compiling the standard gromacs 
> source (necessary for me to debug it, sorry :-)
> 2. cat /proc/cpuinfo and send it to me.
> 3. Try to find a file that crashes as soon as possible (30 minutes is 
> ok, 24 hours bad)
> 4. Describe it as well as you can.
> 
> 
> Some background:
> 
> This is almost certainly a hardware problem in the AMD SSE 
> implementation. That is probably possible to work around, but the 
> results seem to change between different generations of the Athlon 
> CPUs, so it has been almost impossible for me to debug.
> 
> A fix (which will probably be an environment variable soon) is to edit 
> src/gmxlib/detectcpu.c and disable the SSE checking on your Athlon. 
> That way it will always use 3DNow, which is about 10% slower but rock 
> stable.
> 
> I'm sorry for the problems, but the Athlons and Pentiums are executing 
> *identical* code, and we've never seen a problem on the latter!
> 
> Cheers,
> 
> Erik
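
For step 2 above, with 16 machines it may be easier to compare a short
summary than the full /proc/cpuinfo dumps. Here is a rough sketch that
pulls out the vendor, model name and SIMD-related flags per processor.
The function name summarize_cpuinfo is my own, and the set of flags it
keeps is just the ones that seem relevant here. Please still send Erik
the complete file, since this drops most of it.

```python
# Flags relevant to the SSE vs. 3DNow question; everything else is dropped.
SIMD_FLAGS = {"mmx", "sse", "sse2", "3dnow", "3dnowext"}


def summarize_cpuinfo(text):
    """Return one dict per processor block found in /proc/cpuinfo text."""
    cpus = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line separates processor blocks.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "vendor_id":
            current["vendor"] = value
        elif key == "model name":
            current["model"] = value
        elif key == "flags":
            current["simd"] = sorted(set(value.split()) & SIMD_FLAGS)
    if current:
        # The last block may not be followed by a blank line.
        cpus.append(current)
    return cpus
```

To use it, read /proc/cpuinfo into a string, pass it to
summarize_cpuinfo, and print the resulting list on each node.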

-- 
Justin MacCallum
jlmaccal at ucalgary.ca

PhD Student
Department of Biological Sciences
University of Calgary



