[gmx-users] gromacs on glacier

Payman Pirzadeh ppirzade at ucalgary.ca
Mon Jun 8 18:57:27 CEST 2009


Hi,

I had the chance to run GROMACS 4.0.4 on another cluster, and the same problem
persists. The job runs fine on a single node with 2 CPUs, but as soon as the
number of nodes is increased to 2, 3, ... it crashes. Below are the launch
command and the last lines reported in the different files:
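
The job is launched with something along these lines (a rough reconstruction,
since the submission script is not shown; "mdrun_mpi" and the "machines"
hostfile are placeholder names):

  mpirun -np 4 -machinefile machines mdrun_mpi -s npttest.tpr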

"In the log file of the code":

 

There are: 1611 Atoms

There are: 1611 VSites

Charge group distribution at step 0: 101 147 137 152

Grid: 5 x 3 x 3 cells

 

"in the output file reported by cluster":

 

pwd= /home/ppirzade/GROMACS/mytests/small-box-of-water

Got 4 slots.

compute-1-34

compute-1-34

compute-2-20

compute-2-20

Starting run at: Mon Jun  8 10:27:52 MDT 2009

p2_22627:  p4_error: Timeout in establishing connection to remote process: 0

rm_l_2_22748: (301.332031) net_send: could not write to fd=5, errno = 32

p2_22627: (301.332031) net_send: could not write to fd=5, errno = 32

p0_21851: (302.351562) net_recv failed for fd = 6

p0_21851:  p4_error: net_recv read, errno = : 104

p0_21851: (306.359375) net_send: could not write to fd=4, errno = 32

 Ending run at: Mon Jun  8 10:32:59 MDT 2009

 

"in the error file reported by cluster":

 

Reading file npttest.tpr, VERSION 4.0.4 (single precision)

Making 1D domain decomposition 4 x 1 x 1

Killed by signal 2.^M

Killed by signal 2.^M

Killed by signal 2.^M

 

To me it seems that the code cannot communicate across more than one node. I
suspect I did something wrong during the installation! I tried the wiki, but I
can no longer find the documents that used to be there, and I really do not
know at which step I might have gone wrong.
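
To check whether plain MPI can talk across two nodes at all, independent of
GROMACS, a minimal test could look like this (a sketch assuming the same
mpicc/mpirun used to build and run GROMACS; hello.c and the machines file are
made-up names):

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, size, len;
      char host[MPI_MAX_PROCESSOR_NAME];

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      MPI_Get_processor_name(host, &len);

      /* every rank prints its host; a hang or p4_error here points
         at the MPI installation rather than at GROMACS */
      printf("rank %d of %d on %s\n", rank, size, host);

      MPI_Barrier(MPI_COMM_WORLD);
      MPI_Finalize();
      return 0;
  }

compiled and run over the same four slots:

  mpicc hello.c -o hello
  mpirun -np 4 -machinefile machines ./hello

If this also times out as soon as ranks land on the second node, the problem
is in the MPI setup (for example, passwordless rsh/ssh between the nodes)
rather than in the GROMACS build.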

 

Payman