[gmx-users] Multi-node Replica Exchange Segfault
jkrieger at mrc-lmb.cam.ac.uk
Fri Oct 30 09:59:10 CET 2015
You could try using a mixture of openmpi and thread-mpi. I have found, when
linking replicas in multi-walker metadynamics, that it only works if the
replicas have the same allocation on the cluster. In Sun Grid Engine, I'd
have the following in my submit scripts:
#$ -pe openmpi 2
#$ -l dedicated=20
mpirun -np $NSLOTS mdrun_mpi -deffnm metad -cpi metad -plumed plumed.dat
It's probably slightly different with PBS, but you could try the equivalent
without the -plumed option first and then with -replex.
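
For what it's worth, a rough PBS version of that two-step test might look like
the sketch below. The "#PBS -l nodes=2:ppn=20" line assumes Torque-style
resource syntax, and the "remd" file prefix and the 1000-step exchange interval
are placeholders, not values from this thread. The script only prints the two
mdrun command lines so they can be checked before going into a real job script.

```shell
#!/bin/bash
#PBS -l nodes=2:ppn=20   # assumption: Torque-style request for 2 nodes, 20 cores each

NNODES=2                 # nodes in the failing test case
RANKS_PER_NODE=20        # MPI ranks per node, as described in the original post
NREPLICAS=2              # one replica per node
NP=$((NNODES * RANKS_PER_NODE))

# Step 1: plain multi-simulation, no exchanges.
# "remd" is a hypothetical prefix; with -multi, mdrun should look for
# remd0.tpr and remd1.tpr.
STEP1="mpirun -np $NP mdrun_mpi -multi $NREPLICAS -deffnm remd"

# Step 2: the same run with exchange attempts every 1000 steps
# (the interval is a placeholder).
STEP2="$STEP1 -replex 1000"

# Printed for inspection only; nothing is launched here.
echo "$STEP1"
echo "$STEP2"
```

If step 1 runs cleanly across both nodes and only step 2 crashes, that would
point at the exchange itself instead of the multi-node setup.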
> I've never heard of such. You could try a multisim without -replex, to
> On Fri, 30 Oct 2015 03:33 Barnett, James W <jbarnet4 at tulane.edu> wrote:
>> Good evening here,
>> I get a segmentation fault with my GROMACS 5.1 install only for replica
>> simulations right at the first successful exchange on a multi-node run.
>> simulations across multiple nodes work fine, and replica exchange
>> simulations on one node work fine.
>> I've reproduced the problem with just 2 replicas on 2 nodes with GPUs
>> (-nb cpu). Each node has 20 CPUs, so I'm using 20 MPI ranks on each
>> I get a segfault right when the first exchange is successful.
>> The only other error I sometimes get is that the InfiniBand connection
>> timed out retrying the communication between nodes at the exact same
>> moment as the
>> segfault, but I don't get that every time, and it's usually with all
>> going (my goal is to do 30 replicas on 120 cpus). No other error logs,
>> mdrun's log does not indicate an error.
>> PBS log: http://bit.ly/1P8Vs49
>> mdrun log: http://bit.ly/1RD0ViQ
>> I'm currently troubleshooting this some with the sysadmin, but I wanted
>> to see if anyone has had a similar issue or any further steps to
>> I've also searched the mailing list and used my Google-fu, but it has
>> failed me so far.
>> Thanks for your help.
>> James "Wes" Barnett, Ph.D. Candidate
>> Louisiana Board of Regents Fellow
>> Chemical and Biomolecular Engineering
>> Tulane University
>> 341-B Lindy Boggs Center for Energy and Biotechnology
>> 6823 St. Charles Ave
>> New Orleans, Louisiana 70118-5674
>> jbarnet4 at tulane.edu
>> Gromacs Users mailing list
>> * Please search the archive at
>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>> * For (un)subscribe requests visit
>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
>> send a mail to gmx-users-request at gromacs.org.