[gmx-users] MVAPICH2 mdrun problem for SD and MD, GROMACS 4.0.5

Mark Abraham Mark.Abraham at anu.edu.au
Fri Oct 30 16:21:37 CET 2009


Daniel Adriano Silva M wrote:
> Mark,
> 
> I will test, but please tell me: do you think an MPI linking problem
> could lead to problems with some dynamics and not with others, as
> happens to me? Also note that all my tests were made with mvapich2
> (even the one-core run). Please, Justin, what do you think about this?

For example, whether you observe symptoms from a buffer overrun can be
sensitive to the actual calculation being run, because it depends on how
the memory actually gets laid out and used. So here, that might translate
to the kind of system being simulated, and the number of cores used. Such
an overrun might be present in the code all the time, or only exist after
a linking mismatch, or similar.
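
As a minimal sketch of one way to chase such an overrun (assuming valgrind
is available; the .tpr and output names below are just placeholders), a
short single-process run under a memory checker looks roughly like:

  # Run a short single-core mdrun under valgrind to flag invalid reads/writes.
  # "topol.tpr" and "memtest" are placeholder names; expect a large slowdown,
  # and a debug (unoptimized) GROMACS build gives more readable stack traces.
  valgrind --track-origins=yes --error-limit=no \
      mdrun -s topol.tpr -deffnm memtest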

Mark

> Thanks
> Daniel
> 
> 2009/10/30 Mark Abraham <Mark.Abraham at anu.edu.au>:
>> Daniel Adriano Silva M wrote:
>>> Dear Gromacs users,
>>>
>>> I am experiencing the following problem on an Infiniband cluster (8
>>> Intel cores per node, GROMACS compiled with icc 11.1, everything run
>>> through mvapich2):
>>>
>>> I have a molecule (a 498-residue protein; solvated or in vacuum I get
>>> the same problem with any box shape). When I try to SD-minimize it with
>>> mvapich2-mdrun, it minimizes fine with 1, 2 or 3 cores and reaches
>>> convergence in around 1000 steps; however, any higher core count (4, 5,
>>> ... n cores) makes it stop almost immediately (fewer than 20 steps) with:
>>>
>>> "Steepest Descents converged to machine precision...".
>>>
>>> Furthermore, if I take the 1-core minimized structure and try to run
>>> solvated position-restrained dynamics (2 fs, MD, NPT, etc.), it also
>>> works with 1 processor, but with more cores it immediately starts to
>>> produce LINCS warnings and dies with:
>>>
>>> "Too many LINCS warnings" or "Water molecule starting at atom 16221
>>> can not be settled"
>>>
>>> For a long time I have run other MD simulations on this cluster with
>>> the same mdps and other protein systems, and I only see this behavior
>>> with this particular protein; of course, before sending this mail I
>>> re-tested previously working tprs.
>>> Finally, the most suspicious thing is that I have another very similar
>>> 8-core box (with the same processors) but with a gcc-compiled GROMACS,
>>> and it actually runs the same problematic molecule (even the same tpr)
>>> very well with MPI and 8 cores.
>>> What do you think? Please, if you have some tpr to test something, send
>>> it to me.
>> I'd guess you're having some problem with (dynamic) linking of the MPI
>> library. Perhaps the version of some library has changed recently, etc.
>> I'd suggest compiling two fresh copies of GROMACS, one with icc and one
>> with gcc, on the troublesome machine and seeing what happens with them.
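
A minimal sketch of what those two fresh builds could look like, assuming
the standard GROMACS 4.0.x autoconf procedure and that MVAPICH2 compiler
wrappers backed by gcc and icc are available (wrapper paths, suffixes and
install prefixes below are only placeholders):

  # Roughly the usual GROMACS 4.0.x MPI build; adjust paths for your site.
  tar xzf gromacs-4.0.5.tar.gz
  cd gromacs-4.0.5

  # Build 1: gcc-backed MVAPICH2 wrapper (placeholder path)
  ./configure CC=/opt/mvapich2-gcc/bin/mpicc --enable-mpi \
              --program-suffix=_mpi_gcc --prefix=$HOME/gmx405-gcc
  make mdrun && make install-mdrun

  # Build 2: repeat from a clean source tree with an icc-backed wrapper
  #./configure CC=/opt/mvapich2-icc/bin/mpicc --enable-mpi \
  #            --program-suffix=_mpi_icc --prefix=$HOME/gmx405-icc

Running the same problematic .tpr with each resulting mdrun binary and the
same core counts should show whether the compiler/MPI stack, rather than
the input itself, is at fault.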
>>
>> Mark


