[gmx-users] mdrun_mpi issue with CHARMM36 FF

Mark Abraham Mark.Abraham at anu.edu.au
Mon May 14 08:05:26 CEST 2012


On 14/05/2012 3:52 PM, Anirban wrote:
> Hi ALL,
>
> I am trying to simulate a membrane protein system using the CHARMM36 FF on 
> GROMACS 4.5.5 on a parallel cluster running MPI. The system consists 
> of around 117,000 atoms. The job runs fine on 5 nodes (5x12=120 
> cores) using mpirun and gives proper output. But whenever I try to 
> submit it on more than 5 nodes, the job gets killed with the following 
> error:

That's most likely an issue with the configuration of your MPI 
system, or your hardware, or both. Do check your .log file for evidence 
of an unsuitable DD partition, though the fact that it reports "Turning 
on dynamic load balancing" suggests the DD partitioning worked OK.

Mark

>
> -------------------------------------------------------------------
>
> starting mdrun 'Protein'
> 50000000 steps, 100000.0 ps.
>
> NOTE: Turning on dynamic load balancing
>
> Fatal error in MPI_Sendrecv: Other MPI error
> Fatal error in MPI_Sendrecv: Other MPI error
> Fatal error in MPI_Sendrecv: Other MPI error
>
> =====================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   EXIT CODE: 256
> =   CLEANING UP REMAINING PROCESSES
> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> =====================================================================================
> [proxy:0:0 at cn034] HYD_pmcd_pmip_control_cmd_cb 
> (./pm/pmiserv/pmip_cb.c:906): assert (!closed) failed
> [proxy:0:0 at cn034] HYDT_dmxu_poll_wait_for_event 
> (./tools/demux/demux_poll.c:77): callback returned error status
> [proxy:0:0 at cn034] main (./pm/pmiserv/pmip.c:214): demux engine error 
> waiting for event
> .
> .
> .
> -------------------------------------------------------------------
>
> Why is this happening? Is it related to DD and PME? How can I solve it? 
> Any suggestion is welcome.
> Sorry for re-posting.
>
>
> Thanks and regards,
>
> Anirban
>
