[gmx-users] multiple processor running of gromacs-4.0.4

Jianhui Tian jianhuitian at gmail.com
Tue Oct 20 16:49:02 CEST 2009


Hi Mark,

You are right that treating electrostatics and vdw without a cutoff and then
using multiple processors for speed doesn't make sense.
With cutoffs for electrostatics and vdw but without pbc and pme, the system
can't run in parallel (the -pd flag doesn't work either), though it runs fine
on a single node.
With cutoffs for electrostatics and vdw plus pbc and pme, the system does run
in parallel.
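For anyone who hits the same thing, the two setups look roughly like this in
the .mdp (a sketch only; the 1.0 nm cut-offs are placeholders, not the values
I actually used):

  ; vacuum, no cut-offs - runs on a single processor only
  pbc          = no
  nstlist      = 0
  rlist        = 0
  coulombtype  = Cut-off
  rcoulomb     = 0
  vdwtype      = Cut-off
  rvdw         = 0

  ; periodic, PME with cut-offs - runs in parallel
  pbc          = xyz
  nstlist      = 10
  rlist        = 1.0
  coulombtype  = PME
  rcoulomb     = 1.0
  vdwtype      = Cut-off
  rvdw         = 1.0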
Thanks for the clarifications.

Jianhui

Jianhui Tian wrote:
> Hi,
>
> I have a small system of a fullerene ball with waters. I want to
> simulate the system without pbc and pme, treating the coulomb and vdw
> without cutoff. The system runs on a single processor, but when I
> try to run it on multiple processors, it can't proceed. I am including
> the error message at the end of this mail.

Those messages are generic failure messages from the MPI system. The
stack trace therein suggests you should read the stderr and/or log file
to get some diagnostic information from GROMACS. Possibly your system is
blowing up because you've started from an unsuitable configuration.
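If nothing jumps out, something like this is usually enough to see whether
GROMACS printed a fatal error before MPI reported the crash (assuming the
default md.log name - substitute whatever you gave -g or -deffnm):

  tail -n 60 md.log
  grep -i -B1 -A3 "fatal error" md.log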

> Here are some of my guesses about the possible reason for the error:
> 1. Do we have to use pbc and pme to use multiple processors for
> simulation?

Wouldn't think so. However, because you have no cut-offs there is no data
locality, so domain decomposition (DD) gives no advantage - every processor
needs the position of every atom. mdrun -pd may work. It strikes me as
possible that this scenario simply doesn't work in parallel in GROMACS.
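If you try particle decomposition, the invocation would look roughly like
this (binary and .tpr names taken from your log below; the -np count is just
an example):

  mpirun -np 2 /usr/local/gromacs-4.0.4/bin/mdrun_sm -pd -s trp_full_run1.tpr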

> 2. Can we use restraints when using multiple processors?

Yes.

> 3. If a molecule is "divided" into two parts in domain decomposition,
> will this be a problem for simulation?

No, that's routine.

Mark

> Thanks for any suggestions about this error message.
>
> logfile: --------------------------------------------------------------------------
> [0,1,1]: OpenIB on host compute-1-12.local was unable to find any HCAs.
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> [0,1,0]: OpenIB on host compute-1-12.local was unable to find any HCAs.
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> NNODES=2, MYRANK=0, HOSTNAME=compute-1-12.local
> NODEID=0 argc=13
> NNODES=2, MYRANK=1, HOSTNAME=compute-1-12.local
> NODEID=1 argc=13
> .
> .
> .
> .
> Reading file
> /data/disk04/tianj/trp_remd_091012/equilibrium/test/trp_full_run1.tpr,
> VERSION 4.0 (single precision)
> [compute-1-12:08033] *** Process received signal ***
> [compute-1-12:08033] Signal: Segmentation fault (11)
> [compute-1-12:08033] Signal code: Address not mapped (1)
> [compute-1-12:08033] Failing at address: 0xc0
> [compute-1-12:08033] [ 0] /lib64/libpthread.so.0 [0x378fe0de80]
> [compute-1-12:08033] [ 1] /lib64/libc.so.6(_IO_vfprintf+0x39) [0x378f642309]
> [compute-1-12:08033] [ 2] /lib64/libc.so.6(_IO_fprintf+0x88) [0x378f64cf68]
> [compute-1-12:08033] [ 3]
> /usr/local/gromacs-4.0.4/bin/mdrun_sm(mk_mshift+0x315) [0x516593]
> [compute-1-12:08033] [ 4] /usr/local/gromacs-4.0.4/bin/mdrun_sm [0x45fa97]
> [compute-1-12:08033] [ 5]
> /usr/local/gromacs-4.0.4/bin/mdrun_sm(dd_bonded_cg_distance+0x36c)
> [0x45ff53]
> [compute-1-12:08033] [ 6]
> /usr/local/gromacs-4.0.4/bin/mdrun_sm(init_domain_decomposition+0x780)
> [0x44d10c]
> [compute-1-12:08033] [ 7]
> /usr/local/gromacs-4.0.4/bin/mdrun_sm(mdrunner+0x89c) [0x429f6e]
> [compute-1-12:08033] [ 8]
> /usr/local/gromacs-4.0.4/bin/mdrun_sm(main+0x7ba) [0x4306b6]
> [compute-1-12:08033] [ 9] /lib64/libc.so.6(__libc_start_main+0xf4)
> [0x378f61d8b4]
> [compute-1-12:08033] [10] /usr/local/gromacs-4.0.4/bin/mdrun_sm [0x4199a9]
> [compute-1-12:08033] *** End of error message ***
> mpirun noticed that job rank 0 with PID 8033 on node compute-1-12.local
> exited on signal 11 (Segmentation fault).
> 1 additional process aborted (not shown)

