[gmx-users] mdrun_mpi seg fault if N_atoms/cpu > 4096 ?
David
spoel at xray.bmc.uu.se
Wed Nov 9 20:31:46 CET 2005
On Wed, 2005-11-09 at 18:46 +0200, Atte Sillanpää wrote:
> Hi,
>
> we have a system with 128 DPPC molecules and a layer of water. Everything goes
> well with version 3.2.1 as long as the number of atoms per CPU is below 4096;
> above that we get a seg fault right at the start, before any real MD is done:
This could be the old bug that occurs when there is no water on the first
processor. It has been fixed in 3.3, but you can also use the shuffle option
as a workaround (it will also give you better performance).
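For example, something along these lines should do it (a sketch from memory of
the 3.2.x command-line options, with placeholder file names md.mdp, conf.gro
and topol.top, so check grompp -h and mdrun -h on your installation):

  # preprocess for 4 nodes and shuffle the molecules over the processors
  grompp -np 4 -shuffle -f md.mdp -c conf.gro -p topol.top -o topol.tpr
  # run on the same number of nodes
  mpirun -np 4 mdrun_mpi -np 4 -s topol.tpr

Shuffling spreads the water and DPPC molecules over all nodes instead of
handing them out in topology order, which avoids the no-water-on-node-0 case
above and usually balances the load better too.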
> Parallelized PME sum used.
> Using the FFTW library (Fastest Fourier Transform in the West)
> PARALLEL FFT DATA:
> local_nx: 16 local_x_start: 0
> local_ny_after_transpose: 16 local_y_start_after_transpose 0
> total_local_size: 67584
> Center of mass motion removal mode is Linear
> We have the following groups for center of mass motion removal:
> 0: rest, initial mass: 159751
> There are: 4341 Atom
> Removing pbc first time
> Done rmpbc
> Started mdrun on node 0 Wed Nov 9 17:27:13 2005
> Initial temperature: 320.01 K
> Step Time Lambda
> 0 0.00000 0.00000
>
> However, with 8 CPUs there is no problem. We see this on an Opteron
> cluster running Rocks 3.2.0, on a Power4, and on a Sun Fire 25k (it crashes
> with 16384 atoms, but not with 16381). There were also no problems with
> Gromacs version 3.0.3.
>
> There should not be anything special in the *.mdp file, and its parameters
> did not seem to influence the behaviour. A hasty analysis of the Sun Fire
> 25k core gives the following:
>
> dbx -f mdrun_mpi core
>
> For information about new features see `help changes'
> To remove this message, put `dbxenv suppress_startup_message 7.3'
> in your .dbxrc
> Reading mdrun_mpi
> dbx: internal warning: writable memory segment 0xbcc00000[21331968] of
> size 0 in core
> core file header read successfully
> Reading ld.so.1
> Reading libmpi.so.1
> ...
> Reading libdoor.so.1
> Reading tcppm.so.2
> t at 1 (l at 1) program terminated by signal SEGV (no mapping at the fault
> address)
> 0x000aa43c: pbc_rvec_sub+0x001c: ld [%o0 + 8], %f8
>
> Any ideas on how to proceed? Surely people have run bigger systems per CPU
> with gmx?
>
> Cheers,
>
> Atte
--
David.
________________________________________________________________________
David van der Spoel, PhD, Assoc. Prof., Molecular Biophysics group,
Dept. of Cell and Molecular Biology, Uppsala University.
Husargatan 3, Box 596, 75124 Uppsala, Sweden
phone: 46 18 471 4205 fax: 46 18 511 755
spoel at xray.bmc.uu.se spoel at gromacs.org http://xray.bmc.uu.se/~spoel
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++