[gmx-developers] some benchs on cray xt4 and xt5
andrea spitaleri
spitaleri.andrea at hsr.it
Wed Dec 10 15:27:58 CET 2008
Dear Berk,
The highlight of these tests is the large difference between xt4 and xt5 using the same options
and input. My mdp is:
rlist = 1.0
rcoulomb = 1.0
fourierspacing = 0.15
pme_order = 4
rvdw = 1.0
Using 128 CPUs on xt4 I get:
Computing:          Nodes    Number      G-Cycles     Seconds      %
Domain decomp.         64    450001    283541.219    123285.0    1.7
Send X to PME          64   4500001     16471.763      7162.0    0.1
Comm. coord.           64   4500001    143325.633     62318.6    0.9
Neighbor search        64    450001    469227.820    204022.3    2.9
Force                  64   4500001   4813100.223   2092757.0   29.3
Wait + Comm. F         64   4500001    770041.675    334817.5    4.7
PME mesh               64   4500001   6165375.373   2680732.2   37.6
Wait + Comm. X/F       64   4500001   2041930.805    887840.5   12.4
Wait + Recv. PME F     64   4500001    468701.376    203793.4    2.9
Write traj.            64      4553      1180.231       513.2    0.0
Update                 64   4500001    250814.758    109055.4    1.5
Constraints            64   4500001    539024.234    234370.1    3.3
Comm. energies         64   4500001    317755.221    138161.4    1.9
Rest                   64              134137.403     58323.5    0.8
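
To make the xt4 numbers easier to read, here is a minimal Python sketch that adds up the percentage column. The grouping of rows into a "communication" bucket (COMM_ROWS) and the helper name summarize are my own assumptions for illustration; they are not GROMACS conventions.

```python
# Percentages copied from the last column of the xt4 cycle accounting above.
percent = {
    "Domain decomp.": 1.7,
    "Send X to PME": 0.1,
    "Comm. coord.": 0.9,
    "Neighbor search": 2.9,
    "Force": 29.3,
    "Wait + Comm. F": 4.7,
    "PME mesh": 37.6,
    "Wait + Comm. X/F": 12.4,
    "Wait + Recv. PME F": 2.9,
    "Write traj.": 0.0,
    "Update": 1.5,
    "Constraints": 3.3,
    "Comm. energies": 1.9,
    "Rest": 0.8,
}

# Assumption: these rows are treated as communication/wait time.
# The choice is illustrative only.
COMM_ROWS = {
    "Send X to PME",
    "Comm. coord.",
    "Wait + Comm. F",
    "Wait + Comm. X/F",
    "Wait + Recv. PME F",
    "Comm. energies",
}


def summarize(rows):
    """Return a few aggregate figures from a {row name: percent} mapping."""
    comm = sum(value for name, value in rows.items() if name in COMM_ROWS)
    return {
        "total": round(sum(rows.values()), 1),
        "pme_mesh": rows["PME mesh"],
        "force": rows["Force"],
        "communication": round(comm, 1),
        "energy_summation": rows["Comm. energies"],
        "pme_to_force": round(rows["PME mesh"] / rows["Force"], 2),
    }


summary = summarize(percent)
for key, value in summary.items():
    print(f"{key:>18}: {value}")
```

On these xt4 numbers PME mesh takes more time than Force (ratio about 1.28), and the energy communication row accounts for only 1.9 % of the total.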
I do not have the statistics for xt5, since I ended up killing the job.
Should the -nosum option improve the performance only on xt5, or on both machines?
thanks in advance
andrea
Berk Hess wrote:
> Hi,
>
> Have you looked at the cycle counts at the end of the log files?
> I expect that most of the time is consumed by the energy summation
> when using that many CPUs.
>
> Try running with the option -nosum
>
> Also, if you are using PME, you need a relatively long cut-off and a
> coarse PME grid
> for optimal performance, otherwise PME takes relatively too much time.
> I would use something like: cut-off = 1.2, grid_spacing=0.16
>
> Berk
>
> andrea spitaleri wrote:
>> Dear all,
>> I am using gromacs-4.0.2 on two systems: cray xt4 and xt5 (csc louhi). Here is a short table
>> with some tests:
>>
>> 9 ns MD simulation of a protein + water system (ca. 200,000 atoms):
>>
>> 128 cpu 64 pme 15h 30min on hector (xt4)
>> 128 cpu 64 pme 15h 20min on louhi (xt4)
>> 128 cpu 64 pme 20h on louhi (xt5)
>>
>> 256 cpu 128 pme 12h on hector (xt4)
>> 256 cpu 128 pme 21h on louhi (xt5)
>>
>> One possible explanation (from one of the administrators):
>>
>> "One possibility for this is that Gromacs is message intensive, and is
>> therefore slower on xt5 because of the xt5 architecture. (Basically 2
>> nodes (8 cores) share the same Hypertransport, whereas on xt4 each node
>> (4 cores) has that of its own, see eg.
>> http://www.csc.fi/english/pages/louhi_guide/hardware/computenodes/index_html )"
>>
>> What do you think about it?
>>
>> thanks in advance
>>
>> Regards
>>
>> andrea
>>
>>
>>
>>
>
--
-------------------------------
Andrea Spitaleri PhD
Dulbecco Telethon Institute
c/o DIBIT Scientific Institute
Biomolecular NMR, 1B4
Via Olgettina 58
20132 Milano (Italy)
http://biomolecularnmr.ihsr.dom/
Tel: 0039-0226434348/5622/3497/4922
Fax: 0039-0226434153
-------------------------------