[gmx-users] the mdrun choked or maybe braked.

Justin A. Lemkul jalemkul at vt.edu
Fri Aug 13 14:40:59 CEST 2010



#ZHAO LINA# wrote:
> ________________________________________
> From: gmx-users-bounces at gromacs.org [gmx-users-bounces at gromacs.org] on behalf of Justin A. Lemkul [jalemkul at vt.edu]
> Sent: Friday, August 13, 2010 7:06 PM
> To: Discussion list for GROMACS users
> Subject: Re: [gmx-users] the mdrun choked or maybe braked.
> 
> #ZHAO LINA# wrote:
>>  Hi,
>>
>> The problem looks like this, in md.log:
>> \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
>> There are: 130889 Atoms
>> splitting topology...
>> Walking down the molecule graph to make constraint-blocks
>> There are 43925 charge group borders and 43271 shake borders
>> There are 43271 total borders
>> Division over nodes in atoms:
>>     9348    9348    9348    9351    9348    9351    9348    9351    9348    9351    9348    9351    9348    9350
>> Walking down the molecule graph to make constraint-blocks
>> CPU=  0, lastcg= 3245, targetcg=25208, myshift=    8
>> CPU=  1, lastcg= 6361, targetcg=28324, myshift=    8
>> CPU=  2, lastcg= 9477, targetcg=31440, myshift=    8
>> CPU=  3, lastcg=12594, targetcg=34557, myshift=    8
>> CPU=  4, lastcg=15710, targetcg=37673, myshift=    8
>> CPU=  5, lastcg=18827, targetcg=40790, myshift=    8
>> CPU=  6, lastcg=21943, targetcg=43906, myshift=    7
>> CPU=  7, lastcg=25060, targetcg= 3098, myshift=    7
>> CPU=  8, lastcg=28176, targetcg= 6214, myshift=    7
>> CPU=  9, lastcg=31293, targetcg= 9331, myshift=    7
>> CPU= 10, lastcg=34409, targetcg=12447, myshift=    7
>> CPU= 11, lastcg=37526, targetcg=15564, myshift=    7
>> CPU= 12, lastcg=40642, targetcg=18680, myshift=    7
>> CPU= 13, lastcg=43924, targetcg=21962, myshift=    8
>> pd->shift =   8, pd->bshift=  0
>> Division of bonded forces over processors
>> CPU              0     1     2     3     4     5     6     7     8     9    10    11    12    13
>> G96ANGLES     2376     0     0     0     0     0     0     0     0     0     0     0     0     0
>> PDIHS          774     0     0     0     0     0     0     0     0     0     0     0     0     0
>> IDIHS          852     0     0     0     0     0     0     0     0     0     0     0     0     0
>> LJ14          2436     0     0     0     0     0     0     0     0     0     0     0     0     0
>> CONSTR        1614     0     0     0     0     0     0     0     0     0     0     0     0     0
>> SETTLE        2586  3116  3116  3117  3116  3117  3116  3117  3116  3117  3116  3117  3116  3034
>> Workload division
>> nnodes:          14
>> pd->shift:        8
>> pd->bshift:       0
>> Nodeid   atom0   #atom     cg0       #cg
>>      0       0    9348       0      3246
>>      1    9348    9348    3246      3116
>>      2   18696    9348    6362      3116
>>      3   28044    9351    9478      3117
>>      4   37395    9348   12595      3116
>>      5   46743    9351   15711      3117
>>      6   56094    9348   18828      3116
>>      7   65442    9351   21944      3117
>>      8   74793    9348   25061      3116
>>      9   84141    9351   28177      3117
>>     10   93492    9348   31294      3116
>>     11  102840    9351   34410      3117
>>     12  112191    9348   37527      3116
>>     13  121539    9350   40643      3282
>>
>> Max number of connections per atom is 27
>> Total number of connections is 32076
>> Max number of graph edges per atom is 4
>> Total number of graph edges is 13572
>> Initial temperature: 303.113 K
>>
>> Started mdrun on node 0 Fri Aug 13 14:38:46 2010
>>
>>            Step           Time         Lambda
>>               0        0.00000        0.00000
>>
>> Grid: 10 x 45 x 10 cells
>>    Energies (kJ/mol)
>>        G96Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
>>     2.96784e+03    1.16372e+03    9.89996e+02   -1.07478e+01    2.63019e+04
>>         LJ (SR)        LJ (LR)   Coulomb (SR)   Coulomb (LR)       RF excl.
>>     3.11913e+05   -8.40005e+03   -2.19295e+06   -1.39523e+04   -3.36479e+04
>>       Potential    Kinetic En.   Total Energy    Temperature Pressure (bar)
>>    -1.90563e+06    3.31599e+05   -1.57403e+06    3.04444e+02    1.07070e+01
>>   Cons. rmsd ()
>>     2.63140e-05
>> \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
>>
>> There is nothing more at the end of the log.
>> I changed from 16 cores to 14 cores and still have the same problem. And when
>> I checked the status, it showed me that it had finished.
>>
> 
> Please provide more information.  If mdrun stopped, it must have had a reason.
> Please provide your .mdp file, description of the system, and what you did as
> far as energy minimization and/or equilibration prior to whatever this run was
> doing.  If mdrun fails, there's always a reason, and it would be very odd if
> Gromacs simply stopped without providing some kind of indicative output.
> 
> -Justin
> 
> 
> 
> Can you tell me the possible reasons? Mainly, which part do I need to check?
> 

Based on what you posted before, there is no way to tell you what's wrong, 
which is why I asked for more information.  Check all output: stdout, stderr, 
log files, and whatever other files your queuing system (if applicable) might 
give you.  There are numerous reasons for mdrun to stop, but it is very odd 
that no error messages would be produced.  Without more information, it's just 
idle guesswork.
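
The output check described above can be roughed out in a few lines of Python. 
This is only a sketch under assumptions: the file names (md.log, job.out, 
job.err) and the keyword list are hypothetical placeholders, not anything 
taken from your run, and the exact wording of error messages depends on your 
GROMACS version, MPI library, and queuing system.

```python
from pathlib import Path

# Assumed keywords, chosen for illustration only; adjust them to whatever
# your GROMACS build, MPI library, and queuing system actually print.
DEFAULT_KEYWORDS = ("fatal error", "warning", "segmentation fault", "killed")


def scan_for_errors(paths, keywords=DEFAULT_KEYWORDS):
    """Return (file name, line number, text) for each line containing a keyword."""
    hits = []
    for path in map(Path, paths):
        if not path.is_file():
            continue  # skip outputs that were never written
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                lowered = line.lower()  # case-insensitive match
                if any(word in lowered for word in keywords):
                    hits.append((path.name, number, line.rstrip()))
    return hits


if __name__ == "__main__":
    # Hypothetical file names: the mdrun log plus the stdout/stderr files
    # that a queuing system might write for the job.
    for name, number, text in scan_for_errors(["md.log", "job.out", "job.err"]):
        print(f"{name}:{number}: {text}")
```

An empty result does not mean the run was fine; it only means none of the 
assumed keywords appeared, so the files still need to be read by eye.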

-Justin

-- 
========================================

Justin A. Lemkul
Ph.D. Candidate
ICTAS Doctoral Scholar
MILES-IGERT Trainee
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin

========================================


