[gmx-users] Re: ci barely out of bounds
chris.neale at utoronto.ca
Mon May 28 23:37:04 CEST 2007
Keeping this type of discussion on the list assists others when they
later encounter a similar problem. This email you have sent to me is
similar to what you have posted on the list as a reply but not the same.
Besides, it's double work for me to read both.
Yes, your ci is *barely* out of bounds.
The solution that I propose is the only solution that I have heard of.
There are 4 steps that I would suggest.
1. Try again without position restraints as was already suggested.
2. Try with a version of mdrun compiled using -DEBUG_PBC
3. Try in serial.
4. Try again after removing a small fraction of your water and
Matteo Guglielmi wrote:
>> ...somebody on this user list assisted me in determining that I should
>> just roll back my gcc version.
> I did recompile everything with gcc 3.4.6 (I was using the intel 9.1
> series before) but it happened again right now!
>> Matteo, I want to be sure that we are on the same page here: your ci
>> is just *barely* out of bounds right?
> Yes! The ci variable is *barely* out of bounds!
> Right now I got (gromacs compiled with gcc 3.4.6):
> step 282180, will finish at Tue May 29 15:57:31 2007
> step 282190, will finish at Tue May 29 15:57:32 2007
> Program mdrun_mpi, VERSION 3.3.1
> Source code file: nsgrid.c, line: 226
> Range checking error:
> Explanation: During neighborsearching, we assign each particle to a grid
> based on its coordinates. If your system contains collisions or parameter
> errors that give particles very high velocities you might end up with some
> coordinates being +-Infinity or NaN (not-a-number). Obviously, we cannot
> put these on a grid, so this is usually where we detect those errors.
> Make sure your system is properly energy-minimized and that the potential
> energy seems reasonable before trying again.
> Variable ci has value 1472. It should have been within [ 0 .. 1440 ]
> Please report this to the mailing list (gmx-users at gromacs.org)
> "BioBeat is Not Available In Regular Shops" (P.J. Meulenhoff)
> Error on node 0, will try to stop all the nodes
> Halting parallel program mdrun_mpi on CPU 0 out of 4
> gcq#161: "BioBeat is Not Available In Regular Shops" (P.J. Meulenhoff)
> *Isn't it barely out of bounds?*
> The Grid size moved from 10 x 12 x 13 cells to 10 x 12 x 12 cells... I
> can see it from the md0.log file.
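
The numbers quoted above can be checked with a little arithmetic. This is only a sketch: it assumes the upper bound in the error message is the product of the per-axis cell counts (which matches 10 x 12 x 12 = 1440), and the helper names are made up for illustration; nothing here is taken from nsgrid.c.

```python
# Cell counts reported in md0.log before and after the grid update.
old_grid = (10, 12, 13)
new_grid = (10, 12, 12)
ci = 1472  # value reported by the range-check error


def n_cells(grid):
    """Total number of grid cells: product of the per-axis counts."""
    nx, ny, nz = grid
    return nx * ny * nz


def fits(index, grid):
    """Assumption: a flat cell index is valid when 0 <= index < n_cells."""
    return 0 <= index < n_cells(grid)


print(n_cells(old_grid), fits(ci, old_grid))  # 1560 True
print(n_cells(new_grid), fits(ci, new_grid))  # 1440 False
```

So ci = 1472 would have been a legal index on the old 10 x 12 x 13 grid but not on the new 10 x 12 x 12 one, which is consistent with an index computed against a grid size that had just changed.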
>> There is one test and one workaround included in my bugzilla post. The
>> test is to recompile gromacs with the -DEBUG_PBC flag and see if the
>> problem still occurs. For me this solved the problem (although gromacs
>> runs much slower so it is not a great workaround). The solution was to
>> remake my system with a few more or a few less waters so that the
>> number of grids wasn't changing as the volume of the box fluctuates
>> (slightly) during constant pressure simulations.
> Is it the only solution at the moment?
>> I here include the text that I added to that bugzilla post:
>> Did you try with a version of mdrun that was compiled with -DEBUG_PBC ?
>> I have some runs that reliably (but stochastically) give errors about an
>> atom being found in a grid just one block outside of the expected boundary,
>> only in parallel runs, and often other nodes have log files that indicate
>> that they have just updated the grid size (constant pressure simulation).
>> This error does not occur when I run with a -DEBUG_PBC version. My
>> assumption here is that there is some non-blocking MPI communication that
>> is not getting through in time. The -DEBUG_PBC version spends a lot of time
>> checking some things and although it never reports having found some
>> problem, I assume that a side-effect of these extra calculations is to slow
>> things down enough at the proper stage so that the MPI message gets
>> through. I have solved my problem by adjusting my cell so that it doesn't
>> fall close to the grid boundaries. Perhaps you are experiencing some
>> analogous problem?
> Thanks Chris!
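
To illustrate what "not falling close to the grid boundaries" means, here is a rough sketch. It is not how GROMACS sizes its neighbour-search grid: the rule used (as many cells as fit along an axis at a minimum cell length), the function names, and all numbers are assumptions chosen only to show the idea.

```python
import math


def cell_count(box_length_nm, min_cell_nm):
    """Hypothetical rule: number of cells of at least min_cell_nm that fit."""
    return max(1, math.floor(box_length_nm / min_cell_nm))


def count_is_stable(box_length_nm, min_cell_nm, rel_fluct):
    """True if the cell count is the same at both ends of the fluctuation."""
    lo = box_length_nm * (1.0 - rel_fluct)
    hi = box_length_nm * (1.0 + rel_fluct)
    return cell_count(lo, min_cell_nm) == cell_count(hi, min_cell_nm)


# Illustrative values only: a 1% box-length fluctuation, 0.5 nm minimum cell.
print(count_is_stable(6.50, 0.5, 0.01))  # False: count flips between 12 and 13
print(count_is_stable(6.25, 0.5, 0.01))  # True: count stays at 12
```

Under these assumptions, a box edge that sits right at a multiple of the cell length flips between two cell counts as the volume fluctuates, while one in the middle of the interval does not. Adding or removing a few waters shifts the average box size away from such a boundary.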