[gmx-users] Multi-node Replica Exchange Segfault

Barnett, James W jbarnet4 at tulane.edu
Fri Oct 30 14:08:29 CET 2015


Hey Mark,

On Fri, 2015-10-30 at 08:14 +0000, Mark Abraham wrote:
> Hi,
> 
> I've never heard of such. You could try a multisim without -replex, to help
> diagnose.


A multidir simulation runs without issue when -replex is omitted.
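
For anyone reproducing this comparison, the two kinds of run differ only in the -replex option. Below is a minimal Python sketch that builds both command lines for 2 replicas with 20 MPI ranks each and GPUs disabled. The directory names (sim0, sim1), the gmx_mpi binary name, and the exchange interval of 1000 steps are assumed placeholders, not values taken from the actual runs.

```python
import shlex


def mdrun_command(replica_dirs, ranks_per_replica, replex_interval=None,
                  launcher="mpirun", gmx="gmx_mpi"):
    """Build a launcher command line for a GROMACS multi-directory run.

    replica_dirs, launcher and gmx are illustrative placeholders.
    If replex_interval is None, -replex is omitted (plain multisim).
    """
    # One group of MPI ranks per replica directory.
    total_ranks = ranks_per_replica * len(replica_dirs)
    cmd = [launcher, "-np", str(total_ranks),
           gmx, "mdrun", "-nb", "cpu",
           "-multidir", *replica_dirs]
    if replex_interval is not None:
        # Attempt a replica exchange every replex_interval steps.
        cmd += ["-replex", str(replex_interval)]
    return cmd


dirs = ["sim0", "sim1"]  # hypothetical replica directories
without_exchange = mdrun_command(dirs, ranks_per_replica=20)
with_exchange = mdrun_command(dirs, ranks_per_replica=20,
                              replex_interval=1000)  # assumed interval

print(shlex.join(without_exchange))
print(shlex.join(with_exchange))
```

Running both with identical inputs and comparing the results helps separate a problem in the exchange step from a problem in the multi-node setup itself.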

> 
> On Fri, 30 Oct 2015 03:33 Barnett, James W <jbarnet4 at tulane.edu> wrote:
> 
> > Good evening here,
> > 
> > I get a segmentation fault with my GROMACS 5.1 install only for replica
> > exchange simulations, right at the first successful exchange on a
> > multi-node run. Normal simulations across multiple nodes work fine, and
> > replica exchange simulations on one node work fine.
> > 
> > I've reproduced the problem with just 2 replicas on 2 nodes with GPUs
> > disabled (-nb cpu). Each node has 20 CPUs, so I'm using 20 MPI ranks on
> > each (OpenMPI).
> > 
> > I get a segfault right when the first exchange is successful.
> > 
> > The only other error I sometimes get is that the InfiniBand connection
> > timed out retrying the communication between nodes, at the exact same
> > moment as the segfault. I don't get that every time, and it usually
> > happens with all replicas going (my goal is to run 30 replicas on 120
> > CPUs). There are no other error logs, and mdrun's log does not indicate
> > an error.
> > 
> > PBS log: http://bit.ly/1P8Vs49
> > mdrun log: http://bit.ly/1RD0ViQ
> > 
> > I'm currently troubleshooting this with the sysadmin, but I wanted to
> > check whether anyone has had a similar issue or can suggest further
> > troubleshooting steps. I've also searched the mailing list and used my
> > Google-fu, but it has failed me so far.
> > 
> > Thanks for your help.
> > 

-- 
James "Wes" Barnett, Ph.D. Candidate
Louisiana Board of Regents Fellow

Chemical and Biomolecular Engineering
Tulane University
341-B Lindy Boggs Center for Energy and Biotechnology
6823 St. Charles Ave
New Orleans, Louisiana 70118-5674
jbarnet4 at tulane.edu

