[gmx-developers] src/tests/benchmark sets storage solutions

Szilárd Páll szilard.pall at cbr.su.se
Tue Jan 8 00:09:53 CET 2013


On Mon, Jan 7, 2013 at 11:53 PM, Roland Schulz <roland at utk.edu> wrote:

>
>
>
> On Mon, Jan 7, 2013 at 5:06 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:
>
>>  Hi,
>>
>>  Putting the code there might work, but can we really host there:
>> - a 100-150 MB tgz with the regression test suite;
>>
> yes. We already do.
>

If it works well/fast/easily enough, that's fine. However, if they cap the
bandwidth, I'd still prefer having torrent alternatives, as those will be
super-fast in the long run.


>
>
>>  - benchmark suite, probably gigabytes in size;
>>
> That seems rather big. Will we have a small and a large one? Can't we reduce
> the size significantly by (automatically) using genbox to generate big
> boxes by multiplying small boxes (this would also allow arbitrary weak
> scaling)?
>

No, as that will lead to synthetic benchmarks, which, based on the
discussions we've had so far, are not what many want to have
as representative GROMACS benchmarks. However, I would be in favor of
*also* including such functionality.
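
To make the "multiplying small boxes" idea concrete, here is a minimal
Python sketch. It is not GROMACS code: it assumes a rectangular box and
plain coordinate tuples, and the function name replicate_box is purely
illustrative.

```python
from itertools import product


def replicate_box(coords, box, nbox):
    """Tile a rectangular box nx * ny * nz times.

    coords: list of (x, y, z) positions inside the small box
    box:    (lx, ly, lz) edge lengths of the small box
    nbox:   (nx, ny, nz) number of copies along each axis
    Returns the replicated coordinates and the new box dimensions.
    """
    lx, ly, lz = box
    nx, ny, nz = nbox
    replicated = []
    for i, j, k in product(range(nx), range(ny), range(nz)):
        # Shift every particle by a whole number of box lengths.
        for x, y, z in coords:
            replicated.append((x + i * lx, y + j * ly, z + k * lz))
    return replicated, (lx * nx, ly * ny, lz * nz)


# Hypothetical one-particle "system", doubled along each axis.
big_coords, big_box = replicate_box([(0.1, 0.2, 0.3)], (1.0, 1.0, 1.0), (2, 2, 2))
print(len(big_coords), big_box)
```

The particle count grows by nx * ny * nz, which is what would allow matching
the system size to the core count for weak scaling; the result is still a
periodic copy of the small system, i.e. a synthetic benchmark.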


>>  These, in fact already the first one, can easily cause terabytes of
>> traffic a day. Don't they have some limit?
>>
> For that (and also binaries) github is not ideal. They recommend
> S3/CloudFront but that of course costs money. Google code might allow it.
> Depends on the specific size. Do we already know the final size? The
> default file limit is 200M but it can be raised. SF has a 5GB limit so this
> would hopefully work (also if it would be bigger than that we probably
> anyhow would want to split it into parts). But SF has ads.
>

We've had a look at S3, but Sander calculated ~$30 for 1 TB of traffic, which
is a lot considering that we need only a few thousand downloads of the
source + regressiontests tarballs.
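
For a rough feel of the numbers, a back-of-the-envelope sketch in Python
follows. The 150 MB tarball size and the ~$30/TB figure come from this
thread; the download count of 3000 is only an assumed stand-in for "a few
thousand", and a flat per-TB rate is a simplification of real pricing.

```python
def total_traffic_tb(size_mb, downloads):
    """Total traffic in TB, using decimal units (1 TB = 1,000,000 MB)."""
    return size_mb * downloads / 1_000_000


def transfer_cost_usd(traffic_tb, usd_per_tb=30.0):
    """Cost at a flat per-TB rate (assumed; ~30 USD/TB is the estimate above)."""
    return traffic_tb * usd_per_tb


size_mb = 150      # source + regression tests tarball, from this thread
downloads = 3000   # ASSUMPTION: stand-in for "a few thousand"

traffic = total_traffic_tb(size_mb, downloads)
print(f"{traffic:.2f} TB of traffic, about {transfer_cost_usd(traffic):.2f} USD")
```

Swapping in a gigabyte-sized benchmark package for size_mb shows how quickly
the traffic grows once the larger downloads are included.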

I am not a big fan of SF exactly because I can't just wget it but have to
go to their page and click.

All in all, I still favor torrents as the encouraged means of downloading,
plus our own file server with a capped uplink for protection (which will
give the torrent tons of seeders and MB/s download speeds within a few
days, if not hours).
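
A small Python sketch of why a capped uplink alone is slow and why the
torrent swarm matters: it computes the minimum time the server needs to
push every copy by itself. The 10 MB/s cap and the 3000 downloads are
assumed values for illustration, not decisions made in this thread.

```python
def hours_to_serve(size_mb, downloads, cap_mb_per_s):
    """Lower bound on hours for the server alone to upload all copies."""
    total_mb = size_mb * downloads
    return total_mb / cap_mb_per_s / 3600


# ASSUMPTIONS: 150 MB tarball, 3000 downloads, uplink capped at 10 MB/s.
print(f"{hours_to_serve(150, 3000, 10):.1f} hours at the capped rate")
```

Any upload capacity contributed by torrent peers comes on top of the cap,
so the real time to serve everyone shrinks as the number of seeders grows.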

--
Szilárd



> Roland
>
>
>>
>>  Cheers,
>>  --
>> Szilárd
>>
>>
>> On Mon, Jan 7, 2013 at 10:46 PM, Roland Schulz <roland at utk.edu> wrote:
>>
>>> Hi,
>>>
>>>  why don't we put all downloads on github (or any other OpenSource site
>>> (sourceforge, googlecode, ....)) and link to there from the download page?
>>> That would solve the problem very easily.
>>>
>>>  Roland
>>>
>>>
>>> On Mon, Jan 7, 2013 at 4:27 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:
>>>
>>>>  Hi,
>>>>
>>>>  In the future we plan to provide source code, regression tests,
>>>> benchmark data set, and possibly some binary packages as downloads for
>>>> GROMACS users. For reasons discussed earlier, with the 4.6 release the
>>>> regression test suite and validation of GROMACS build/installation is
>>>> becoming very important. A standardized benchmark set and pre-compiled
>>>> binaries would also be a great benefit for many.
>>>>
>>>>  However, there is one important aspect we need to consider. The
>>>> source + regression tests suite is already close to 150 MB, which is
>>>> non-negligible from the point of view of server load. The benchmark set
>>>> will probably be split up into several packages, some of them gigabytes in
>>>> size; binaries for the 6-8 most important platforms will also
>>>> potentially contribute to the server traffic.
>>>>
>>>>  A deluge of large downloads hitting and possibly crippling our
>>>> servers after the next major/minor release(s) is a very realistic scenario
>>>> for which we need to be prepared. After discussing with some
>>>> developers (thanks Berk & Sander for the tips), this is the suggestion we
>>>> came up with:
>>>>
>>>>  * We should use a separate, dedicated machine/VM to host downloads
>>>> (are we keeping the current ftp server for this purpose?).
>>>> * The number of connections and downwind bandwidth for
>>>> the various GROMACS downloads should be capped to protect the server.
>>>> * We should provide and *encourage* the use of torrents for downloads,
>>>> especially the large ones like the benchmark test suite.
>>>> + We could consider having mirrors for some (larger) or all downloads.
>>>> However, for this we would need someone to help out with (read: "donate")
>>>> storage space + bandwidth!
>>>>  + We could consider commercial storage solutions *if* there are
>>>> suggestions for relatively cheap and efficient ones.
>>>>
>>>>  Legend:
>>>> *: needs immediate decision/action;
>>>> +: would be good to get feedback before 4.6.0.
>>>>
>>>>  Cheers,
>>>> --
>>>> Szilárd
>>>>
>>>
>>>
>>>
>>>   --
>>> ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
>>> 865-241-1537, ORNL PO BOX 2008 MS6309
>>>
>>> --
>>> gmx-developers mailing list
>>> gmx-developers at gromacs.org
>>> http://lists.gromacs.org/mailman/listinfo/gmx-developers
>>> Please don't post (un)subscribe requests to the list. Use the
>>> www interface or send it to gmx-developers-request at gromacs.org.
>>>
>>
>>
>
>