Re: [gmx-users] “Fatal error in PMPI_Bcast: Other MPI error, …..” occurs when using the ‘particle decomposition’ option.
xhomes at sohu.com
Tue Jun 1 13:57:25 CEST 2010
Hi Mark,

Thanks for the reply!

It seems I got something messed up. At the beginning, I used ‘constraints = all-bonds’ and ‘domain decomposition’. When the simulation is scaled to more than 2 processes, an error like the one below occurs:
####################
Fatal error: There is no domain decomposition for 6 nodes that is
compatible with the given box and a minimum cell size of 2.06375 nm
Change the number of nodes or mdrun option -rcon or -dds or your
LINCS settings
Look in the log file for details on the domain decomposition
####################
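
(As an aside, the numbers in that message already show why 2 processes work but 6 do not: the box edge is 5.386 nm and the minimum cell size is 2.06375 nm, so at most two domain-decomposition cells fit along each box dimension, while any 6-process grid needs at least three cells along one axis. The message also points at the mdrun options; a rough sketch of that kind of command, with purely illustrative values, would be

nohup mpiexec -np 6 mdrun_dmpi -dd 3 2 1 -rcon 1.7 -s 11_Trun.tpr -g 12_NTPmd.log -o 12_NTPmd.trr -c 12_NTPmd.pdb -e 12_NTPmd_ener.edr -cpo 12_NTPstate.cpt &

where -dd requests an explicit 3x2x1 grid and -rcon lowers the constraint communication range below mdrun's own estimate, which, as far as I understand it, is where the 2.06375 nm limit comes from. Whether that is actually safe for the constraints I cannot say.)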

I referred to the manual and found no answer. Then I switched to ‘particle decomposition’ and tried all kinds of things, including changing mpich to lammpi, changing Gromacs from V4.05 to V4.07, and adjusting the mdp file (e.g. ‘constraints = hbonds’ or no PME), but none of these took effect! I thought I had already tried ‘constraints = hbonds’ with ‘domain decomposition’, at least with lammpi.

However, when I tried ‘constraints = hbonds’ and ‘domain decomposition’ under mpich today, it scaled to more than 2 processes fine! And now it also scales well under lammpi with ‘constraints = hbonds’ and ‘domain decomposition’!

So it seems the key for ‘domain decomposition’ is ‘constraints = hbonds’.
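
Spelled out, the constraint part of the .mdp that now works with ‘domain decomposition’ is just this fragment (assuming the rest of the file stays as in the .mdp quoted in the original message below):

constraints = hbonds   ; only bonds involving hydrogen are constrained
lincs_order = 10

With ‘constraints = all-bonds’ every bond goes through LINCS, which, if I read the error message above correctly, is what pushes the minimum cell size up to 2.06375 nm and limits the run to 2 processes.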

Of course, the simulation still crashes when using ‘particle decomposition’ with ‘constraints = hbonds’ or ‘all-bonds’, and I don’t know why.

I use the double-precision version and the NTP ensemble to perform a PCA!

----- Original Message -----
From: xhomes at sohu.com
Date: Tuesday, June 1, 2010 11:53
Subject: [gmx-users] “Fatal error in PMPI_Bcast: Other MPI error, …..” occurs when using the ‘particle decomposition’ option.
To: gmx-users <gmx-users at gromacs.org>

> Hi, everyone of gmx-users,
>
> I met a problem when I use the ‘particle decomposition’ option
> in a NTP MD simulation of Engrailed Homeodomain (En) in CL-
> neutralized water box. It just crashed with an error “Fatal
> error in PMPI_Bcast: Other MPI error, error stack: …..”.
> However, I’ve tried the ‘domain decomposition’ and everything is
> ok! I use the Gromacs 4.05 and 4.07, the MPI lib is mpich2-1.2.1p1.
> The system box size is 5.386(nm)3. The MDP file list as below:
> ########################################################
> title = En
> ;cpp = /lib/cpp
> ;include = -I../top
> define =
> integrator = md
> dt = 0.002
> nsteps = 3000000
> nstxout = 500
> nstvout = 500
> nstlog = 250
> nstenergy = 250
> nstxtcout = 500
> comm-mode = Linear
> nstcomm = 1
>
> ;xtc_grps = Protein
> energygrps = protein non-protein
>
> nstlist = 10
> ns_type = grid
> pbc = xyz ;default xyz
> ;periodic_molecules = yes ;default no
> rlist = 1.0
>
> coulombtype = PME
> rcoulomb = 1.0
> vdwtype = Cut-off
> rvdw = 1.4
> fourierspacing = 0.12
> fourier_nx = 0
> fourier_ny = 0
> fourier_nz = 0
> pme_order = 4
> ewald_rtol = 1e-5
> optimize_fft = yes
>
> tcoupl = v-rescale
> tc_grps = protein non-protein
> tau_t = 0.1 0.1
> ref_t = 298 298
> Pcoupl = Parrinello-Rahman
> pcoupltype = isotropic
> tau_p = 0.5
> compressibility = 4.5e-5
> ref_p = 1.0
>
> gen_vel = yes
> gen_temp = 298
> gen_seed = 173529
>
> constraints = hbonds
> lincs_order = 10
> ########################################################
>
> When I conduct MD using “nohup mpiexec -np 2 mdrun_dmpi -s
> 11_Trun.tpr -g 12_NTPmd.log -o 12_NTPmd.trr -c 12_NTPmd.pdb -e
> 12_NTPmd_ener.edr -cpo 12_NTPstate.cpt &”, everything is OK.
>
> Since the system doesn’t support more than 2 processes under
> ‘domain decomposition’ option, it took me about 30 days to
> calculate a 6ns trajectory. Then I decide to use the ‘particle

Why no more than 2? What GROMACS version? Why are you using double
precision with temperature coupling?

MPICH has known issues. Use OpenMPI.

> decomposition’ option. The command line is “nohup mpiexec -np 6
> mdrun_dmpi -pd -s 11_Trun.tpr -g 12_NTPmd.log -o 12_NTPmd.trr -c
> 12_NTPmd.pdb -e 12_NTPmd_ener.edr -cpo 12_NTPstate.cpt &”. And I
> got the crash in the nohup file like below:
> ####################
> Fatal error in PMPI_Bcast: Other MPI error, error stack:
> PMPI_Bcast(1302)......................: MPI_Bcast(buf=0x8fedeb0,
> count=60720, MPI_BYTE, root=0, MPI_COMM_WORLD) failed
> MPIR_Bcast(998).......................:
> MPIR_Bcast_scatter_ring_allgather(842):
> MPIR_Bcast_binomial(187)..............:
> MPIC_Send(41).........................:
> MPIC_Wait(513)........................:
> MPIDI_CH3I_Progress(150)..............:
> MPID_nem_mpich2_blocking_recv(948)....:
> MPID_nem_tcp_connpoll(1720)...........:
> state_commrdy_handler(1561)...........:
> MPID_nem_tcp_send_queued(127).........: writev to socket failed -
> Bad address
> rank 0 in job 25 cluster.cn_52655 caused
> collective abort of all ranks
> exit status of rank 0: killed by signal 9
> ####################
>
> And the ends of the log file list as below:
> ####################
> ……..
> ……..
> ……..
> ……..
> bQMMM = FALSE
> QMconstraints = 0
> QMMMscheme = 0
> scalefactor = 1
> qm_opts:
> ngQM = 0
> ####################
>
> I’ve search the gmx-users mail list and tried to adjust the md
> parameters, and no solution was found. The “mpiexec -np x”
> option doesn’t work except when x=1. I did found that when the
> whole En protein is constrained using position restraints
> (define = -DPOSRES), the ‘particle decomposition’ option works.
> However this is not the kind of MD I want to conduct.
>
> Could anyone help me about this problem? And I also want to know
> how can I accelerate this kind of MD (long time simulation of
> small system) using Gromacs? Thinks a lot!
>
> (Further information about the simulated system: The system has
> one En protein (54 residues, 629 atoms), total 4848 spce waters,
> and 7 Cl- used to neutralize the system. The system has been
> minimized first. A 20ps MD is also performed for the waters and
> ions before EM.)

This should be bread-and-butter with either decomposition up to at
least 16 processors, for a correctly compiled GROMACS with a useful
MPI library.

Mark
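
(One more note on Mark’s suggestion above to use OpenMPI: assuming mdrun_dmpi were rebuilt against OpenMPI, the launch would look essentially the same, only with OpenMPI’s mpirun, e.g.

nohup mpirun -np 6 mdrun_dmpi -s 11_Trun.tpr -g 12_NTPmd.log -o 12_NTPmd.trr -c 12_NTPmd.pdb -e 12_NTPmd_ener.edr -cpo 12_NTPstate.cpt &

This is only a sketch of the command line, not a tested recipe.)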