[gmx-users] Segmentation fault while diagonalizing covariance matrix using g_covar
Mark Abraham
mark.j.abraham at gmail.com
Wed Oct 22 05:30:26 CEST 2014
On Tue, Oct 21, 2014 at 11:46 PM, Mendez Giraldez, Raul <
rmendez at email.unc.edu> wrote:
> Dear gmx users,
>
> I am computing PCA on a system with 15476 atoms (a matrix of 46428x46428),
> using g_covar. The covariance matrix seems to be built, however it crashes
> few hours later after diagonalization takes place:
>
> Calculating the average structure ...
> trn version: GMX_trn_file (single precision)
> Reading frame 5 time 60250.000
> Reading frame 12 time 60650.000
> Readin
> Reading frame 70 time 63000.000
> Reading frame 130 time 66500.000
> Reading frame 190 time 69500.000
> Reading frame 800 time 10000.000
> Last frame 800 time 100000.000
>
> Back Off! I just backed up average.pdb to ./#average.pdb.2#
> Constructing covariance matrix (46428x46428) ...
> Reading frame 5 time 60250.000
> Reading frame 12 time 60650.000
> Readin
> Reading frame 70 time 63000.000
> Reading frame 130 time 66500.000
> Reading frame 190 time 69500.000
> Reading frame 800 time 10000.000
> Last frame 800 time 100000.000
> Read 801 frames
>
> Trace of the covariance matrix: 1.87421e+06 (u nm^2)
>
> Diagonalizing ...
> /nas02/home/r/m/rmendez/Work/Scripts/Essential_Dynamics_protein_ions.sh:
> line 8: 1013 Segmentation fault (core dumped)
> g_covar -s ${id}_${tag_gro}.gro -f ${id}_${tag_trr}.trr -v
> ${id}_protein_ions_eigenvector.trr -mwa -o ${id}_protein_ions_ED
> -n ${id}_${tag_trr}.protein_ions.ndx <
> ~/Work/Data/RyR/g_covar_proteins_ions
>
>
> You would probably say it is a memory problem; however, I am requesting 800 GB
> of RAM, and when I check the memory usage, it does not seem to exceed
> 16 GB:
>
I'd say that's a suspiciously round number that might be the limit on the
actual virtual memory available, but you'll have to ask your cluster admins
about that.
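One way to check whether such a cap is in place is to query the per-process address-space limit from inside a job. This is a minimal sketch using Python's standard `resource` module (not something from the thread; the 16 GiB figure used in the comment is only the number observed in the accounting record below):

```python
import resource

# Query the soft/hard limits on total virtual address space (RLIMIT_AS).
# A hard cap near 16 GiB here would explain a crash despite booking 800 GB.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)

def fmt(limit):
    """Render a byte limit as GiB, or 'unlimited' if no cap is set."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit / 1024**3:.1f} GiB"

print("RLIMIT_AS soft:", fmt(soft))
print("RLIMIT_AS hard:", fmt(hard))
```

Running this inside the batch job (rather than on the login node) shows the limits the scheduler actually imposed on the process.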
> Accounting information about jobs that are:
> - submitted by all users.
> - accounted on all projects.
> - completed normally or exited
> - executed on all hosts.
> - submitted to all queues.
> - accounted on all service classes.
>
> ------------------------------------------------------------------------------
>
> Job <704235>, User <rmendez>, Project <g4934v_protein_ions_PCA>,
> Application <default>, Status <EXIT>, Queue <bigmem>,
> Command </nas02/home/r/m/rmendez/Work/Scripts/Essential_Dynamics_protein_ions.sh g4934v 100ns.last_40000 helical_restraints.extended>,
> Share group charged </rmendez>
> Mon Oct 20 13:16:17: Submitted from host <killdevil-login2>,
> CWD </netscr/rmendez/Work/g4934v>,
> Output File <g4934v_protein_ions_PCA.4.out>,
> Error File <g4934v_protein_ions_PCA.4.err>;
> Mon Oct 20 13:16:22: Dispatched 1 Task(s) on Host(s) <c-187-02>,
> Allocated 1 Slot(s) on Host(s) <c-187-02>,
> Effective RES_REQ <select[ ((( hca_ready) && type == any))] order[mem] rusage[mem=800.00] same[model] >;
> Tue Oct 21 04:35:44: Completed <exit>.
> Tue Oct 21 04:35:44: Completed <exit>.
>
> Accounting information about this job:
> Share group charged </rmendez>
> CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
> 55002.44 5 55167 exit 0.9970 16G 16G
>
> ------------------------------------------------------------------------------
>
> SUMMARY: ( time unit: second )
> Total number of done jobs: 0 Total number of exited jobs: 1
> Total CPU time consumed: 55002.4 Average CPU time consumed: 55002.4
> Maximum CPU time of a job: 55002.4 Minimum CPU time of a job: 55002.4
> Total wait time in queues: 5.0
> Average wait time in queue: 5.0
> Maximum wait time in queue: 5.0 Minimum wait time in queue: 5.0
> Average turnaround time: 55167 (seconds/job)
> Maximum turnaround time: 55167 Minimum turnaround time: 55167
> Average hog factor of a job: 1.00 ( cpu time / turnaround time )
> Maximum hog factor of a job: 1.00 Minimum hog factor of a job: 1.00
> Total Run time consumed: 55162 Average Run time consumed: 55162
> Maximum Run time of a job: 55162 Minimum Run time of a job: 55162
>
> Does anyone know why g_covar crashes and/or what is the expected memory
> usage for a 46428x46428 matrix ?
>
Likely, it's not able to get enough memory. I don't know the expected usage
offhand, but you should be thinking seriously about whether you can do a
sensible analysis on all those degrees of freedom. You can construct a memory
estimate empirically by observing the requirements of g_covar on various
subsets of your system (e.g. alpha carbons only, then backbone atoms).
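As a rough back-of-the-envelope (my arithmetic, not from the thread): a 46428x46428 matrix of single-precision reals is about 8 GiB per copy, and a diagonalization routine that works on a double-precision copy would need about twice that, which is suggestively close to the 16G reported in the accounting record above:

```python
# Rough memory estimate for the covariance matrix in this thread.
# Assumption: 4-byte single-precision reals, 8-byte doubles; the actual
# workspace used by g_covar's eigensolver may differ.
n_atoms = 15476
n = 3 * n_atoms                 # 46428 degrees of freedom -> 46428x46428 matrix
entries = n * n                 # ~2.16e9 matrix elements

single_gib = entries * 4 / 1024**3
double_gib = entries * 8 / 1024**3
print(f"one single-precision copy: {single_gib:.1f} GiB")   # ~8.0 GiB
print(f"one double-precision copy: {double_gib:.1f} GiB")   # ~16.1 GiB
```

Repeating the same arithmetic for an alpha-carbon-only or backbone-only index group shows how quickly the requirement drops: memory scales with the square of the number of atoms analyzed.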
Mark
> Thank you so much in advance,
>
> Raul
>
> Raul Mendez Giraldez, PhD
> Dokholyan Lab
> Dept. Biophysics & Biochemistry
> Genetic Medicine
> University of North Carolina
> 120 Mason Farm Road
> Chapel Hill, NC 27599
> US
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-request at gromacs.org.
>