[gmx-users] MPI_BCAST : Message truncated [when restarting from checkpoint]
darth.vasya at gmail.com
Thu Nov 5 06:27:22 CET 2009
I encountered the following error messages when trying out the checkpoint
continuation functionality with MPI:
> Reading file topol.tpr, VERSION 4.0.5 (single precision)
> Reading checkpoint file state.cpt generated: Tue Oct 27 22:37:23 2009
> 8 - MPI_BCAST : Message truncated
>   Aborting Program!
> Abort signaled by rank 8: Aborting program !
> Exit code -3 signaled from [node address]
> Killing remote processes...
From a short Google search, I have gotten the impression that this happens
when a message sent between nodes is larger than the receive buffer (note
my total lack of practical experience with MPI programming). However, if
this is true, I have no idea where it might be happening.
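
To make the suspected failure mode concrete, here is a toy Python model of the size check a broadcast has to pass. It is only a sketch: it is not real MPI and not GROMACS code, and the names toy_bcast and MessageTruncated are made up for illustration. It assumes that every rank posts its own receive-buffer size and that the call fails on the first rank whose buffer is smaller than the root's message.

```python
class MessageTruncated(Exception):
    """Raised when the root's message does not fit a receiver's buffer."""


def toy_bcast(payload, recv_sizes, root=0):
    """Deliver the root's payload to every rank, checking buffer sizes.

    recv_sizes[r] is the buffer size in bytes that rank r posted.
    Hypothetical helper: this models only the size check, nothing else.
    """
    received = []
    for rank, size in enumerate(recv_sizes):
        # The root already holds the data; only receivers can truncate.
        if rank != root and len(payload) > size:
            raise MessageTruncated(
                f"{rank} - toy_bcast: {len(payload)}-byte message does not "
                f"fit a {size}-byte receive buffer"
            )
        received.append(payload)
    return received


if __name__ == "__main__":
    state = bytes(1000)

    # All ranks agree on the size: the broadcast goes through.
    toy_bcast(state, [1000, 1000, 1000, 1000])

    # Rank 2 expects less data than the root sends: the broadcast fails.
    try:
        toy_bcast(state, [1000, 1000, 512, 1000])
    except MessageTruncated as err:
        print(err)
```

If this picture applies, the size of the .cpt file would matter less than whether all ranks agree on how much data they expect. The sketch says nothing about where in mdrun such a disagreement could arise.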
My system has 44091 atoms, and the .cpt file size is 1.1M, which doesn't
seem too large. Restarting seems to work fine with serial mdrun.
Thank you in advance for any clues on this,