[gmx-users] Segmentation fault while diagonalizing covariance matrix using g_covar

Mark Abraham mark.j.abraham at gmail.com
Wed Oct 22 05:30:26 CEST 2014


On Tue, Oct 21, 2014 at 11:46 PM, Mendez Giraldez, Raul <
rmendez at email.unc.edu> wrote:

> Dear gmx users,
>
> I am computing PCA on a system with 15476 atoms (a matrix of 46428x46428),
> using g_covar. The covariance matrix seems to be built; however, it crashes
> a few hours later, after diagonalization starts:
>
> Calculating the average structure ...
> trn version: GMX_trn_file (single precision)
> Reading frame       5 time 60250.000
> Reading frame      12 time 60650.000
> Readin
> Reading frame      70 time 63000.000
> Reading frame     130 time 66500.000
> Reading frame     190 time 69500.000
> Reading frame     800 time 10000.000
> Last frame        800 time 100000.000
>
> Back Off! I just backed up average.pdb to ./#average.pdb.2#
> Constructing covariance matrix (46428x46428) ...
> Reading frame       5 time 60250.000
> Reading frame      12 time 60650.000
> Readin
> Reading frame      70 time 63000.000
> Reading frame     130 time 66500.000
> Reading frame     190 time 69500.000
> Reading frame     800 time 10000.000
> Last frame        800 time 100000.000
> Read 801 frames
>
> Trace of the covariance matrix: 1.87421e+06 (u nm^2)
>
> Diagonalizing ...
> /nas02/home/r/m/rmendez/Work/Scripts/Essential_Dynamics_protein_ions.sh:
> line 8:  1013 Segmentation fault      (core dumped)
>  g_covar -s ${id}_${tag_gro}.gro -f ${id}_${tag_trr}.trr -v
> ${id}_protein_ions_eigenvector.trr -mwa -o ${id}_protein_ions_ED
>  -n ${id}_${tag_trr}.protein_ions.ndx <
> ~/Work/Data/RyR/g_covar_proteins_ions
>
>
> You would probably say it is a memory problem; however, I am requesting
> 800 GB of RAM, and when I check the memory usage, it does not seem to go
> beyond 16 GB:
>

I'd say that's a suspiciously round number that might be the limit on the
actual virtual memory available, but you'll have to ask your cluster admins
about that.
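
One place to look before asking the admins is the per-process address-space
limit seen from inside the job. The following is a minimal sketch using
Python's standard resource module; the helper names describe_limit and
address_space_limits are made up for illustration. It only reports the
rlimit of the process that runs it, so it assumes it is executed inside the
same batch job as g_covar. A scheduler may also enforce memory limits by
other means that do not show up here.

```python
import resource


def describe_limit(value):
    """Render an rlimit value in GiB, or 'unlimited' for RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 1024**3:.2f} GiB"


def address_space_limits():
    """Return the (soft, hard) RLIMIT_AS of the current process as strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    return describe_limit(soft), describe_limit(hard)


if __name__ == "__main__":
    soft, hard = address_space_limits()
    print(f"address space limit: soft={soft} hard={hard}")
```

If the soft limit printed from inside the job is far below what was
requested, that would be worth raising with the cluster admins.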


> Accounting information about jobs that are:
>   - submitted by all users.
>   - accounted on all projects.
>   - completed normally or exited
>   - executed on all hosts.
>   - submitted to all queues.
>   - accounted on all service classes.
>
> ------------------------------------------------------------------------------
>
> Job <704235>, User <rmendez>, Project <g4934v_protein_ions_PCA>,
>   Application <default>, Status <EXIT>, Queue <bigmem>,
>   Command </nas02/home/r/m/rmendez/Work/Scripts/Essential_Dynamics_protein_ions.sh
>   g4934v 100ns.last_40000 helical_restraints.extended>,
>   Share group charged </rmendez>
> Mon Oct 20 13:16:17: Submitted from host <killdevil-login2>,
>   CWD </netscr/rmendez/Work/g4934v>,
>   Output File <g4934v_protein_ions_PCA.4.out>,
>   Error File <g4934v_protein_ions_PCA.4.err>;
> Mon Oct 20 13:16:22: Dispatched 1 Task(s) on Host(s) <c-187-02>,
>   Allocated 1 Slot(s) on Host(s) <c-187-02>,
>   Effective RES_REQ <select[ ((( hca_ready) && type == any))] order[mem]
>   rusage[mem=800.00] same[model] >;
> Tue Oct 21 04:35:44: Completed <exit>.
>
> Accounting information about this job:
>      Share group charged </rmendez>
>      CPU_T     WAIT     TURNAROUND   STATUS     HOG_FACTOR    MEM    SWAP
>   55002.44        5          55167     exit         0.9970    16G     16G
>
> ------------------------------------------------------------------------------
>
> SUMMARY:      ( time unit: second )
>  Total number of done jobs:       0      Total number of exited jobs:     1
>  Total CPU time consumed:   55002.4      Average CPU time consumed: 55002.4
>  Maximum CPU time of a job: 55002.4      Minimum CPU time of a job: 55002.4
>  Total wait time in queues:     5.0
>  Average wait time in queue:    5.0
>  Maximum wait time in queue:    5.0      Minimum wait time in queue:    5.0
>  Average turnaround time:     55167 (seconds/job)
>  Maximum turnaround time:     55167      Minimum turnaround time:     55167
>  Average hog factor of a job:  1.00 ( cpu time / turnaround time )
>  Maximum hog factor of a job:  1.00      Minimum hog factor of a job:  1.00
>  Total Run time consumed:     55162      Average Run time consumed:   55162
>  Maximum Run time of a job:   55162      Minimum Run time of a job:   55162
>
> Does anyone know why g_covar crashes and/or what the expected memory
> usage is for a 46428x46428 matrix?
>

Most likely it is not able to get enough memory. As for the expected memory
usage, I don't know offhand. You should think seriously about whether you
can do a sensible analysis on all those degrees of freedom. You can
construct a memory estimate by observing the requirements for g_covar on
various subsets of your system (e.g. C-alpha atoms, then backbone atoms).
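
As a rough illustration of such an estimate, here is a minimal Python
sketch. It rests on assumptions that are not confirmed in this thread: that
the dominant cost is one or more dense (3N x 3N) matrices, and that peak
memory grows roughly as a + b * (3N)^2. The function names are made up for
illustration, and the (n_atoms, peak_bytes) pairs passed to fit_memory_model
are meant to come from your own measurements on subsets; none are supplied
here.

```python
def dense_matrix_bytes(n_atoms, bytes_per_element):
    """Bytes needed for one dense (3N x 3N) matrix."""
    dim = 3 * n_atoms
    return dim * dim * bytes_per_element


def fit_memory_model(measurements):
    """Least-squares fit of peak_bytes = a + b * dim**2.

    measurements: list of (n_atoms, peak_bytes) pairs observed on subsets.
    Returns the tuple (a, b).
    """
    xs = [(3 * n) ** 2 for n, _ in measurements]
    ys = [float(m) for _, m in measurements]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("need at least two different subset sizes")
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    a = mean_y - b * mean_x
    return a, b


def predict_bytes(model, n_atoms):
    """Extrapolate the fitted model to a system with n_atoms atoms."""
    a, b = model
    return a + b * (3 * n_atoms) ** 2


if __name__ == "__main__":
    n_atoms = 15476  # the selection size from the original question
    for label, size in (("single", 4), ("double", 8)):
        gib = dense_matrix_bytes(n_atoms, size) / 1024**3
        print(f"{label} precision: {gib:.2f} GiB per matrix copy")
```

For 15476 atoms this gives about 8.03 GiB per matrix copy in single
precision and 16.06 GiB in double precision; how many copies and which
precision g_covar actually uses during diagonalization is not established
here. The element count, 46428^2 = 2155559184, is also larger than 2^31 - 1,
which would matter only if some part of the code indexes the matrix with
32-bit signed integers; that too is unverified.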

Mark


> Thank you so much in advance,
>
> Raul
>
> Raul Mendez Giraldez, PhD
> Dokholyan Lab
> Dept. Biophysics & Biochemistry
> Genetic Medicine
> University of North Carolina
> 120 Mason Farm Road
> Chapel Hill, NC 27599
> US