guix-devel

Re: Git-LFS or Git Annex?


From: Philip McGrath
Subject: Re: Git-LFS or Git Annex?
Date: Fri, 26 Jan 2024 23:31:24 -0500
User-agent: Mozilla Thunderbird

Hi,

On 1/24/24 10:22, Ludovic Courtès wrote:

> The question boils down to: Git-LFS or Git Annex?
>
> [...]
>
> What’s your experience?  What would you suggest?


I have a few times had a problem for which I thought Git LFS might be a solution, and each time I have ended up ripping out Git LFS in frustration before long.

I have not used Git Annex. I have looked into it a few times, but each time I decided it was too complex or not quite suitable for my use-case in some way. On the other hand, I have heard good things about it from people who have used it: in particular, I believe Morgan Lemmer-Webber (CC'ed) used it to manage a large set of art history images.

The main thing in this context that still isn't clear to me from my reading so far is how sharing lists of remotes works with Git Annex. In plain Git, remotes are part of the local state of a particular clone, not distributed as part of the repository. For the objectives here, though, a lot of the benefit would seem to be having many copies in synchronized, possibly "special" remotes so that anyone trying to get the videos would have plenty of ways to get them. I'm not sure to what extent Git Annex does that out of the box.

I did see that Git Annex can use Git LFS as a "special remote".

There are also two other approaches I think would be worth at least considering:

1. Just use Git

While the limitations of Git for storing large media files are well known, I have found it to be good enough for several use-cases, and it has the strong advantage of not requiring additional tools. My impression is that a significant factor in people using Git LFS, in particular, is the limit on repository size imposed by the popular hosting providers. There are strategies within Git to avoid having to download unwanted artifacts, including creating branches with unrelated histories, shallow clones (e.g. --depth=1 --single-branch), partial clones [1][2][3] (e.g. --filter=blob:none), and sparse checkouts [4][5], with the latter two being fairly new features.

[1]: https://git-scm.com/docs/partial-clone
[2]: https://git-scm.com/docs/git-clone#Documentation/git-clone.txt---filterltfilter-specgt
[3]: https://git-scm.com/docs/git-rev-list#Documentation/git-rev-list.txt---filterltfilter-specgt
[4]: https://git-scm.com/docs/git-sparse-checkout
[5]: https://git-scm.com/docs/git-clone#Documentation/git-clone.txt---sparse
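
To make the clone strategies above concrete, here is a rough sketch that just assembles the corresponding command lines without running anything. The repository URL, the "videos" checkout directory, and the "talks/2024" subdirectory are made-up placeholders, and the helper name clone_commands is mine; only the Git flags themselves come from the documentation linked above.

```python
import shlex

# Hypothetical repository URL, used only for illustration.
REPO_URL = "https://example.org/videos.git"


def clone_commands(url, checkout_dir="videos", sparse_dir="talks/2024"):
    """Return, per strategy, the list of git command lines (as argv lists).

    The directory names are assumptions for the sake of the example.
    """
    return {
        # Shallow clone: only the tip commit of a single branch.
        "shallow": [
            ["git", "clone", "--depth=1", "--single-branch", url],
        ],
        # Partial clone: full history, but file contents (blobs) are
        # fetched lazily, when something actually needs them.
        "partial": [
            ["git", "clone", "--filter=blob:none", url],
        ],
        # Partial clone combined with a sparse checkout: only the chosen
        # directory is populated in the working tree.
        "sparse": [
            ["git", "clone", "--filter=blob:none", "--sparse", url, checkout_dir],
            ["git", "-C", checkout_dir, "sparse-checkout", "set", sparse_dir],
        ],
    }


# Print the commands so they can be copied into a shell.
for name, commands in clone_commands(REPO_URL).items():
    print(f"# {name}")
    for argv in commands:
        print(shlex.join(argv))
```

Whether a partial clone works also depends on the server allowing filters, so this is something to try against the actual hosting setup rather than take for granted.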

2. Mirror URLs

Another approach would be just to make each video available at a few URLs and have Guix origins with the list. If one of the available URLs were the Internet Archive, it would have a high degree of assurance of long-term preservation. I think the biggest downside is that this might not help much with managing the collection of videos.
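
As a sketch of the idea (not of how Guix itself implements origins), the following tries a list of mirror URLs in order and accepts the first download whose SHA-256 matches a known value. The function names and the use of a hex digest are my assumptions for illustration; any URLs passed in would be whatever mirrors actually exist.

```python
import hashlib
import urllib.request


def download(url, timeout=30):
    """Fetch the full contents of a URL into memory."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_from_mirrors(urls, expected_sha256, fetch=download):
    """Try each URL in order; return (url, data) for the first download
    whose SHA-256 hex digest equals expected_sha256.

    The fetch parameter exists so the download step can be swapped out,
    e.g. for testing without network access.
    """
    failures = []
    for url in urls:
        try:
            data = fetch(url)
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            failures.append((url, str(exc)))
            continue
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return url, data
        failures.append((url, "hash mismatch"))
    raise RuntimeError(f"no mirror produced the expected content: {failures}")
```

Because the content hash is fixed up front, it does not matter which mirror ends up serving the file; an unreachable or tampered copy is simply skipped in favor of the next URL.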

Philip


