[gmx-users] shortage of shared memory

Chris Neale chris.neale at utoronto.ca
Sat Jul 14 19:33:48 CEST 2007


>So it seems that there is a problem in the shared memory communication 
>layer of openmpi that only shows up sporadically. However, if it is not 
>reproducible it could also be a physical memory problem, i.e. bad DIMMs,
>especially since you have data corruption every once in a while. One
>test that you can do: take a big file (much larger than the amount of
>memory you have) and run md5sum on it a few times. Copy the file to a
>"good" machine and run it there as well. It should always give the same
>result. If you can rule out hardware, then OpenMPI could be the problem.
>You could try the latest LAM or MPICH 2.x (not 1.x!).
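
For anyone wanting to script the repeated-checksum test suggested above, here
is a minimal sketch in Python. It is only an illustration, not something
that was run as part of this thread: the function names (md5_of_file,
repeated_md5, is_consistent) and the default of 5 runs are my own
assumptions, and it uses Python's standard hashlib rather than the md5sum
command itself. The file should still be much larger than RAM, so that each
pass is read from disk instead of the page cache.

```python
import hashlib
import sys


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_md5(path, runs=5):
    """Hash the same file several times and return every digest."""
    return [md5_of_file(path) for _ in range(runs)]


def is_consistent(digests):
    """True only if at least one digest exists and all are identical."""
    return len(set(digests)) == 1


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: python md5_repeat.py <big-file>")
    results = repeated_md5(sys.argv[1])
    for number, value in enumerate(results, start=1):
        print(f"run {number}: {value}")
    if is_consistent(results):
        print("all runs agree")
    else:
        print("MISMATCH: digests differ between runs")
```

Running the same script on a copy of the file on a "good" machine and
comparing the printed digests covers the second half of the suggestion. A
mismatch between runs points at hardware or the I/O path rather than at the
MPI library.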

Our sysadmin has run Memtest-86 v3.3 and found no problem in 4 passes. 
I will look into MPICH 2.x (we found OpenMPI to run 10% faster than
LAM, so we don't really want to go back).

Thanks for the reply,
Chris.



