[gmx-developers] mdrun_mpi not able to reach "rank"

1004753465 1004753465 at qq.com
Wed Jun 5 07:54:42 CEST 2019


Hi everyone,


I am currently trying to run two GROMACS 2018 processes in parallel using


mpirun -np 2 ...(some path)/mdrun_mpi -v -multidir sim[01]


During the simulation, I need to gather some information on the two master nodes, similar to what the function "dd_gather" does. To do that, I need to access cr->dd on each rank. However, whenever I try to print "cr->dd->rank", "cr->dd->nnodes", or something similar, it just shows


[c15:31936] *** Process received signal ***
[c15:31936] Signal: Segmentation fault (11)
[c15:31936] Signal code: Address not mapped (1)
[c15:31936] Failing at address: 0x30
[c15:31936] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7f7f9e374340]
[c15:31936] [ 1] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x468cfb]
[c15:31936] [ 2] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x40dd65]
[c15:31936] [ 3] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x42ca93]
[c15:31936] [ 4] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x416f7d]
[c15:31936] [ 5] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x41792c]
[c15:31936] [ 6] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x438756]
[c15:31936] [ 7] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x438b3e]
[c15:31936] [ 8] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x439a97]
[c15:31936] [ 9] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f7f9d591ec5]
[c15:31936] [10] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x40b93e]
[c15:31936] *** End of error message ***
step 0[c15:31935] *** Process received signal ***
[c15:31935] Signal: Segmentation fault (11)
[c15:31935] Signal code: Address not mapped (1)
[c15:31935] Failing at address: 0x30
[c15:31935] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7fb64892e340]
[c15:31935] [ 1] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x468cfb]
[c15:31935] [ 2] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x40dd65]
[c15:31935] [ 3] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x42ca93]
[c15:31935] [ 4] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x416f7d]
[c15:31935] [ 5] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x41792c]
[c15:31935] [ 6] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x438756]
[c15:31935] [ 7] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x438b3e]
[c15:31935] [ 8] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x439a97]
[c15:31935] [ 9] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fb647b4bec5]
[c15:31935] [10] /home/hudan/wow/ngromacs-2018/gromacs-2018/build/bin/mdrun_mpi() [0x40b93e]
[c15:31935] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 31935 on node c15.dynstar exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------



However, if I build the package without the flag -DGMX_MPI=on, the single program (mdrun) runs smoothly, and all the domain decomposition ranks can be printed out and used conveniently.


It seems pretty weird to me that, with mdrun_mpi, although domain decomposition can be done, the ranks can neither be printed out nor accessed through the struct cr->dd. I wonder whether they are stored in some other form, but I do not know what it is.
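
One possible explanation, which is only an assumption and not a confirmed diagnosis: with "-np 2" and two -multidir directories, each simulation may receive a single rank, in which case mdrun may not set up domain decomposition at all and cr->dd may stay null. The failing address 0x30 would be consistent with reading a field at a small offset from a null pointer. Below is a minimal sketch of guarding the access. MockCommrec, MockDomdec and describeDomdec are hypothetical stand-ins written for illustration; they are not the real GROMACS types or functions.

```cpp
#include <string>

// Hypothetical stand-in for the domain decomposition struct; the real
// GROMACS struct has many more fields.
struct MockDomdec
{
    int nnodes; // number of domain decomposition ranks
    int rank;   // index of this rank within the decomposition
};

// Hypothetical stand-in for the communication record (cr).
struct MockCommrec
{
    int         nnodes; // number of ranks in this simulation
    MockDomdec* dd;     // assumed to be nullptr when no decomposition exists
};

// Builds a description of the decomposition state and never dereferences
// cr.dd when it is null.
std::string describeDomdec(const MockCommrec& cr)
{
    if (cr.dd == nullptr)
    {
        return "no domain decomposition (cr->dd is null)";
    }
    return "dd rank " + std::to_string(cr.dd->rank) + " of "
           + std::to_string(cr.dd->nnodes);
}
```

Printing the result of such a check on every rank, before touching cr->dd->rank, would at least show whether the pointer is set in the mdrun_mpi run.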


I will appreciate it if someone can help. Thank you very much!!!
Best,
Huan