reproduce-devel

[task #15686] Removing original software URLs from Maneage?


From: Mohammad Akhlaghi
Subject: [task #15686] Removing original software URLs from Maneage?
Date: Wed, 10 Jun 2020 19:11:14 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0

URL:
  <https://savannah.nongnu.org/task/?15686>

                 Summary: Removing original software URLs from Maneage?
                 Project: Reproducible paper template
            Submitted by: makhlaghi
            Submitted on: Thu 11 Jun 2020 12:11:12 AM BST
         Should Start On: Wed 10 Jun 2020 12:00:00 AM BST
   Should be Finished on: Wed 10 Jun 2020 12:00:00 AM BST
                Category: Software
                Priority: 5 - Normal
                  Status: In Progress
                 Privacy: Public
        Percent Complete: 60%
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
                  Effort: 0.00

    _______________________________________________________

Details:

IMPORTANT QUESTION, PLEASE READ AND SHARE YOUR THOUGHTS.

Currently, when downloading the software tarballs, Maneage first tries the
software's own webpage. If that fails, it falls back to a set of backup servers
<http://git.maneage.org/project.git/tree/reproduce/software/config/servers-backup.conf>
(currently Gitlab, maneage.org and akhlaghi.org, in this order).
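
To make the download order above concrete, here is a minimal sketch in
Python. It is only an illustration and not Maneage's actual implementation
(which lives in its shell and Make files): the function names, the injected
`fetch` callable and the assumption that each backup server hosts the
tarball directly under its file name are all hypothetical.

```python
import urllib.request
from typing import Callable, Iterable, Optional, Tuple


def fetch_url(url: str, timeout: float = 30.0) -> bytes:
    """Download one URL with the standard library (raises OSError on failure)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_with_fallback(
    tarball: str,
    primary_url: Optional[str],
    backup_servers: Iterable[str],
    fetch: Callable[[str], bytes] = fetch_url,
) -> Tuple[bytes, str]:
    """Try the primary URL first, then every backup server in the given order.

    Returns the downloaded bytes and the URL that finally worked.
    """
    candidates = []
    if primary_url:
        candidates.append(primary_url)
    # ASSUMPTION: a backup server serves the file as <server>/<tarball>.
    candidates.extend(
        f"{server.rstrip('/')}/{tarball}" for server in backup_servers
    )

    errors = []
    for url in candidates:
        try:
            return fetch(url), url
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all download attempts failed:\n" + "\n".join(errors))
```

Passing `primary_url=None` skips the software's own server entirely, which
is roughly what removing the software-specific URLs would amount to in this
sketch.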

But maintaining a separate primary download URL for every software package is
VERY ANNOYING and sometimes problematic, because some projects use unusual
servers that may depend on Javascript. It also makes the list of software
downloads very long, hard to read and prone to bugs.

The main reason we preferred the original software URL until now was that
downloading a few hundred megabytes of data from Gitlab did not seem ethical
(I didn't feel comfortable consuming their traffic limits with such large
downloads, and thus slowing down other developers). Our own servers,
maneage.org and akhlaghi.org, are privately funded, and traffic plays an
important role in server costs (I currently pay for both myself).

Recently a good alternative occurred to me: Zenodo! It is designed precisely
for this type of job (space and traffic), and it also gives a reliable
identifier (DOI). So I uploaded all our current tarballs to zenodo.3883409
<https://doi.org/10.5281/zenodo.3883409>. This DOI will not change even
after we add new tarballs (newer versions of existing software, or entirely
new software): it will always point to the most recent version
<https://help.zenodo.org/#versioning>.

The only complication was that Zenodo doesn't have a default way of extracting
the most recent version's identifier, so after consulting the Zenodo
developers, I had to use a hack
<https://gitlab.com/maneage/project-dev/-/blob/from-zenodo/reproduce/software/shell/configure.sh#L1215>
(implemented in the from-zenodo
<https://gitlab.com/maneage/project-dev/-/tree/from-zenodo> branch).
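
For readers who want a feel for the problem, the sketch below shows one
possible way to find the most recent version's record number in Python. It
is not necessarily what the hack linked above does. It assumes that
resolving the DOI ends, after redirects, on a landing page whose URL looks
like `https://zenodo.org/record/<number>` or
`https://zenodo.org/records/<number>`; that URL shape and both function
names are assumptions made only for this illustration.

```python
import re
import urllib.request
from typing import Optional

# The DOI quoted in this message.
CONCEPT_DOI_URL = "https://doi.org/10.5281/zenodo.3883409"

# ASSUMPTION: the landing page URL has the form zenodo.org/record(s)/<number>.
_RECORD_RE = re.compile(r"^https?://zenodo\.org/records?/(\d+)")


def record_id_from_landing_url(url: str) -> Optional[str]:
    """Return the numeric record ID in a landing page URL, or None."""
    match = _RECORD_RE.match(url)
    return match.group(1) if match else None


def latest_record_id(
    doi_url: str = CONCEPT_DOI_URL, timeout: float = 30.0
) -> Optional[str]:
    """Follow the DOI's redirects and parse the URL we finally land on."""
    with urllib.request.urlopen(doi_url, timeout=timeout) as response:
        # geturl() gives the URL after all redirects were followed.
        return record_id_from_landing_url(response.geturl())
```

Calling `latest_record_id()` needs network access. It returns None when the
final URL does not match the assumed pattern, so the caller can move on to
the other backup servers instead of failing.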

With this feature implemented, if the download from a software's own server
fails, Maneage will automatically use the most recently uploaded version on
Zenodo. Note that we will never remove a file from Zenodo; we will only ever
add to it.

Now that we have a reliable server for downloading software source code
without any trouble, I propose removing all the software-specific URLs.

What do you think?




    _______________________________________________________

Reply to this item at:

  <https://savannah.nongnu.org/task/?15686>

_______________________________________________
  Message sent via Savannah
  https://savannah.nongnu.org/



