[gmx-users] Continue run in Gromacs-4 with check point file
xuji
xuji at home.ipe.ac.cn
Fri Mar 20 03:26:09 CET 2009
Hi all:
I wrote several days ago about continuing a run in Gromacs-4.0 from a checkpoint file, but I still haven't been able to solve the problem.
I ran a simulation with
mpiexec -machinefile ./mf_24 -np 24 mdrun -v -append -cpt 5 -cpi dppc_md_prev.cpt -cpo dppc_md.cpt -s dppc_md.tpr -o dppc_md.trr -c dppc_md.gro -g dppc_md.log -e dppc_md.edr
on 4 nodes. But when I try to continue the simulation with
mpiexec -machinefile ./mf_24 -np 24 mdrun -v -append -cpt 5 -cpi dppc_md.cpt -cpo dppc_md_2.cpt -s dppc_md.tpr -o dppc_md.trr -c dppc_md.gro -g dppc_md.log -e dppc_md.edr
or with
mpiexec -machinefile ./mf_24 -np 24 mdrun -v -append -cpt 5 -cpi dppc_md_prev.cpt -cpo dppc_md_2.cpt -s dppc_md.tpr -o dppc_md.trr -c dppc_md.gro -g dppc_md.log -e dppc_md.edr
Since there are 2 checkpoint files in the simulation directory, I tried both of them.
I always get the following errors:
Reading checkpoint file dppc_md_prev.cpt generated: Fri Mar 20 08:53:47 2009
or
Reading checkpoint file dppc_md.cpt generated: Fri Mar 20 08:58:08 2009
Loaded with Money
Fatal error in MPI_Bcast:
Message truncated, error stack:
MPI_Bcast(1145)...................: MPI_Bcast(buf=0x7fffc33242dc, count=4, MPI_BYTE, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast(229)...................:
MPIDI_CH3U_Receive_data_found(254): Message from rank 0 and tag 2 truncated; 12 bytes received but buffer size is 4
Fatal error in MPI_Bcast:
Message truncated, error stack:
MPI_Bcast(1145)...................: MPI_Bcast(buf=0x7fff6c0da09c, count=4, MPI_BYTE, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast(229)...................:
MPIDI_CH3U_Receive_data_found(254): Message from rank 4 and tag 2 truncated; 12 bytes received but buffer size is 4
Fatal error in MPI_Bcast:
Message truncated, error stack:
MPI_Bcast(1145)...................: MPI_Bcast(buf=0x7fff9ac2ebec, count=4, MPI_BYTE, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast(229)...................:
MPIDI_CH3U_Receive_data_found(254): Message from rank 0 and tag 2 truncated; 12 bytes received but buffer size is 4
rank 16 in job 5 Node115_33001 caused collective abort of all ranks
exit status of rank 16: killed by signal 9
rank 8 in job 5 Node115_33001 caused collective abort of all ranks
exit status of rank 8: killed by signal 9
rank 6 in job 5 Node115_33001 caused collective abort of all ranks
exit status of rank 6: killed by signal 9
Can someone help me with this problem? I appreciate any help in advance!
Best wishes!
2009-03-20
xuji