[gmx-users] Replica Exchange MD on more than 64 processors
gmx3 at hotmail.com
Tue Feb 2 19:11:57 CET 2010
One issue could be MPI memory usage.
I have noticed that many MPI implementations use an amount of memory
per process that is quadratic (!) in the number of processes involved.
This can quickly get out of hand. But 28 GB is a lot of memory.
One thing that might help slightly is to avoid double precision,
which is almost never required. Single precision will also make your
simulations a factor of 1.4 faster.
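
As a rough sanity check on the numbers in this thread, here is a minimal
Python sketch. It reproduces the per-atom estimate from the quoted message
below (5000 molecules, 81 atoms each, 900 bytes per atom, which comes to
about 347.6 when counted in units of 2**20 bytes) and shows what quadratic
growth would mean when going from 64 to 128 processes. The function names
are made up for illustration, and the N**2 scaling is only the assumption
stated above, not a measured property of mvapich.

```python
# Back-of-envelope check of the numbers discussed in this thread.
# Assumption: 900 bytes per atom is the FAQ figure cited in the quoted message.
# Assumption: per-process MPI memory scales as N**2 (a hypothesis, not measured).

BYTES_PER_ATOM = 900
N_MOLECULES = 5000
ATOMS_PER_MOLECULE = 81
N_REPLICAS = 4


def system_bytes(n_molecules, atoms_per_molecule, bytes_per_atom=BYTES_PER_ATOM):
    """Estimated memory for one copy of the system, in bytes."""
    return n_molecules * atoms_per_molecule * bytes_per_atom


def quadratic_growth(n_old, n_new):
    """Factor by which per-process memory grows if it scales as N**2."""
    return (n_new / n_old) ** 2


one_replica = system_bytes(N_MOLECULES, ATOMS_PER_MOLECULE)
all_replicas = N_REPLICAS * one_replica

print(f"one replica  : {one_replica / 2**20:.3f} MiB")
print(f"four replicas: {all_replicas / 2**30:.2f} GiB")
print(f"64 -> 128 processes: x{quadratic_growth(64, 128):.0f} per process")
```

If per-process MPI memory really does grow quadratically, doubling the
process count quadruples it, while the simulation data itself stays well
under 2 GB even with all four replicas on one node.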
> Date: Tue, 2 Feb 2010 18:55:37 +0100
> From: breuerss at uni-koeln.de
> To: gmx-users at gromacs.org
> Subject: [gmx-users] Replica Exchange MD on more than 64 processors
> Dear list
> I recently ran into a problem with a replica exchange simulation. The simulation is run
> with gromacs-mpi version 4.0.7, compiled with the following flags:
> --enable-threads --enable-mpi --with-fft=mkl --enable-double,
> intel compiler version 11.0
> mvapich version 1.1.0
> mkl version 10.1
> The program is working fine in this cluster environment consisting of 32 nodes with 8 processors and
> 32GB each. I've already run several simulations using the MPI feature.
> It seems that I am stuck on a problem similar to one already reported on this list by Bharat V.
> Adkar in December 2009, without an eventual solution:
> I am doing a replica exchange simulation on a simulation box with 5000 molecules (81 atoms each) and
> 4 different temperatures. The simulation runs nicely with 64 processors (8 nodes) but stops with an
> error message on 128 processors (16 nodes).
> Taking the following four points into account:
> 1. every cluster node has at least 28 GB of usable memory available
> 2. the system I am working with should only use
> 5000*81*900 B = 347.614 MiB (according to the FAQ)
> 3. even if all four replicas are run on the same node, the memory usage
> should be less than 2 GB
> 4. the simulation works fine with 64 processors
> it seems to me the following error
> Program mdrun, VERSION 4.0.7
> Source code file: smalloc.c, line: 179
> Fatal error:
> Not enough memory. Failed to realloc 790760 bytes for nlist->jjnr, nlist->jjnr=0xae70b7b0
> (called from file ns.c, line 503)
> has to be caused by something other than a lack of memory.
> I am wondering if there is anyone else who is still facing the same problem or has already found a
> solution for this issue.
> Kind regards
> Sebastian Breuers Tel: +49-221-470-4108
> EMail: breuerss at uni-koeln.de
> Universität zu Köln University of Cologne
> Department für Chemie Department Chemistry
> Organische Chemie university of Cologne
> Greinstr. 4 Greinstr. 4
> D-50939 Köln D-50939 Cologne, Federal Rep. of Germany