[gmx-developers] src/tests/benchmark sets storage solutions
roland at utk.edu
Mon Jan 7 23:53:08 CET 2013
On Mon, Jan 7, 2013 at 5:06 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:
> Putting the code there might work, but can we really host there:
> - a 100-150 MB tgz with the regression test suite;
yes. We already do.
> - benchmark suite, probably gigabytes in size;
That seems rather big. Will we have a small and a large one? Can't we reduce
the size significantly by using genbox to (automatically) generate big
boxes by multiplying small boxes (would also allow doing arbitrary weak
> These, in fact already the first one, can easily cause terabytes of traffic
> a day. Don't they have some limit?
For that (and also for binaries) GitHub is not ideal. They recommend
S3/CloudFront, but that of course costs money. Google Code might allow it,
depending on the specific size. Do we already know the final size? The
default file limit is 200 MB but it can be raised. SourceForge has a 5 GB
limit, so this would hopefully work (and if it were bigger than that, we
would probably want to split it into parts anyway). But SourceForge has ads.
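
As a rough check on the traffic concern quoted above, here is a minimal back-of-envelope sketch. The 150 MB figure is the upper end of the regression test size mentioned in the thread; the 2 GB benchmark size is only an assumption (the thread says "gigabytes"), and the helper names are made up for illustration.

```python
# Back-of-envelope estimate of daily download traffic.
# The 150 MB size comes from the thread; the 2 GB size is an assumption.

MB = 10**6   # decimal megabyte, in bytes
GB = 10**9
TB = 10**12


def daily_traffic_bytes(package_bytes, downloads_per_day):
    """Total bytes served per day for one package."""
    return package_bytes * downloads_per_day


def downloads_to_reach(target_bytes, package_bytes):
    """Smallest number of downloads whose total size is at least target_bytes."""
    return -(-target_bytes // package_bytes)  # ceiling division on integers


if __name__ == "__main__":
    regression_tests = 150 * MB  # upper end of the 100-150 MB estimate
    benchmark_suite = 2 * GB     # assumed; the thread only says "gigabytes"

    for name, size in [("regression tests", regression_tests),
                       ("benchmark suite", benchmark_suite)]:
        n = downloads_to_reach(TB, size)
        print(f"{name}: {n} downloads/day reach 1 TB")
```

Under these assumptions, several thousand downloads a day of the regression test tarball alone would be enough to reach a terabyte, and a few hundred downloads of a multi-gigabyte benchmark package would do the same.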
> On Mon, Jan 7, 2013 at 10:46 PM, Roland Schulz <roland at utk.edu> wrote:
>> why don't we put all downloads on github (or any other OpenSource site
>> (sourceforge, googlecode, ....)) and link to there from the download page?
>> That would solve the problem very easily.
>> On Mon, Jan 7, 2013 at 4:27 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:
>>> In the future we plan to provide source code, regression tests,
>>> benchmark data set, and possibly some binary packages as downloads for
>>> GROMACS users. For reasons discussed earlier, with the 4.6 release the
>>> regression test suite and validation of GROMACS build/installation is
>>> becoming very important. A standardized benchmark set and pre-compiled
>>> binaries would also be a great benefit for many.
>>> However, there is one important aspect we need to consider. The source
>>> + regression test suite is already close to 150 MB, which is non-negligible
>>> from the point of view of server load. The benchmark set will probably be
>>> split up in several packages, some of them gigabytes in size; binaries for
>>> the 6-8 most important platforms will also potentially contribute to
>>> the server traffic.
>>> A deluge of large downloads hitting and possibly crippling our servers
>>> after the next major/minor release(s) is a very realistic scenario for
>>> which we need to be prepared. After discussing with some developers
>>> (thanks Berk & Sander for the tips) this is the suggestion we came up with:
>>> * We should use a separate, dedicated machine/VM to host downloads
>>> (are we keeping the current ftp server for this purpose?).
>>> * The number of connections and download bandwidth for
>>> the various GROMACS downloads should be capped to protect the server.
>>> * We should provide and *encourage* the use of torrents for downloads,
>>> especially the large ones like the benchmark test suite.
>>> + We could consider having mirrors for some (larger) or all downloads.
>>> However, for this we would need someone to help out with (read: "donate")
>>> storage space + bandwidth!
>>> + We could consider commercial storage solutions *if* there are
>>> suggestions for relatively cheap and efficient ones.
>>> *: needs immediate decision/action;
>>> +: would be good to get feedback before 4.6.0.
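
One common way to implement the kind of bandwidth cap suggested in the list above is a token bucket. The sketch below is only an illustration of that idea and not something proposed in the thread: the class name, parameters, and numbers are hypothetical, and in practice such a cap would more likely be set in the download server's own configuration.

```python
import time


class TokenBucket:
    """Allow `rate` bytes per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # refill speed in bytes per second
        self.capacity = float(capacity)  # maximum burst size in bytes
        self.clock = clock               # injectable clock, useful for testing
        self.tokens = float(capacity)    # the bucket starts full
        self.last = clock()

    def _refill(self):
        """Add tokens for the time elapsed since the last call, up to capacity."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)

    def try_consume(self, nbytes):
        """Return True and spend tokens if nbytes may be sent now, else False."""
        self._refill()
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False
```

A server loop would call `try_consume(len(chunk))` before sending each chunk and wait briefly when it returns False. Capping the number of connections is a separate limit that this sketch does not cover.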
>> ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
>> 865-241-1537, ORNL PO BOX 2008 MS6309