gwl-devel


Re: Managing data files in workflows


From: Ricardo Wurmus
Subject: Re: Managing data files in workflows
Date: Wed, 07 Apr 2021 13:38:37 +0200
User-agent: mu4e 1.4.15; emacs 27.2

Konrad Hinsen <konrad.hinsen@fastmail.net> writes:

> Looking at the source code in (gwl cache), restoring means symlinking
> the target file to the cached file, which can't work given that the
> cache is already a symlink to the target file.
>
> So... I don't understand how the cache is supposed to work. If it stores
> symlinks, there is no need to restore anything. If it is supposed to
> store copies, then that's not what it does.

Right, that’s really the heart of the problem here.  Originally, I used
hardlinks exclusively, but they don’t work everywhere.  So I added
symlinks, but obviously they have different semantics.

We can fix the problem with symlinks by restoring the target of the link
instead of the link itself, but I feel that we need to take a step back
and consider what this cache is really meant to be used for.
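
A minimal sketch of "restoring the target of the link" could look like
the following.  It is written in Python for illustration only (GWL
itself is Guile Scheme), and the helper name `restore_from_cache` and
its return convention are hypothetical, not part of (gwl cache).

```python
import os
import shutil


def restore_from_cache(cache_entry, target):
    """Restore `target` from `cache_entry`, following symlinks.

    Hypothetical helper: the cache entry is resolved to the real file
    first, so the target is never linked back to itself.
    Returns True if `target` is usable afterwards, False otherwise.
    """
    source = os.path.realpath(cache_entry)
    if not os.path.isfile(source):
        # Dangling link: the file the cache pointed at no longer exists.
        return False
    if os.path.exists(target) and os.path.samefile(source, target):
        # The cache entry already resolves to the target: nothing to do.
        return True
    if os.path.lexists(target):
        os.remove(target)
    # Copy the resolved file rather than creating another link.
    shutil.copy2(source, target)
    return True
```

Note the dangling-link case: if the cache only holds a symlink to the
target and the target has been deleted, there is nothing left to
restore from.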

The cache again assumes that files are immutable, when in fact nothing
guarantees that they are.  Neither symlinks nor hardlinks give us any
such guarantee.
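
The lack of guarantees can be demonstrated with a small Python sketch.
The function name `link_shares_content` is hypothetical; it assumes a
POSIX-like system where both `os.link` and `os.symlink` are permitted.

```python
import os
import tempfile


def link_shares_content(make_link):
    """Return True if an in-place rewrite of the original file is
    visible through a link created by `make_link(src, dst)`."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "output.txt")
        cached = os.path.join(d, "cached")
        with open(original, "w") as f:
            f.write("v1")
        make_link(original, cached)
        # Truncate and rewrite the same file in place.
        with open(original, "w") as f:
            f.write("v2")
        with open(cached) as f:
            return f.read() == "v2"
```

With an in-place rewrite like this, both `link_shares_content(os.link)`
and `link_shares_content(os.symlink)` report that the "cached" file
changed along with the original.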

I would really like to have independent copies of input and output
files, but I also don’t want to copy files around needlessly or use
more space than absolutely necessary.  We could punt on the problem of
optimal space consumption and simply copy files to the cache.
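
As a rough Python sketch of the "simply copy" option (the names
`cache_store` and `cache_restore`, and the idea of a caller-supplied
`key`, are assumptions for illustration, not the GWL API):

```python
import os
import shutil


def cache_store(path, cache_dir, key):
    """Copy `path` into `cache_dir` under `key`; return the entry path.

    The entry is an independent copy, so later changes to `path`
    do not affect it.
    """
    os.makedirs(cache_dir, exist_ok=True)
    entry = os.path.join(cache_dir, key)
    shutil.copy2(path, entry)
    return entry


def cache_restore(entry, target):
    """Replace `target` with an independent copy of the cache entry."""
    if os.path.lexists(target):
        os.remove(target)
    shutil.copy2(entry, target)
    return target
```

This trades disk space and copy time for independence: every cached
file exists twice, but modifying one copy never changes the other.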

What do you think?

-- 
Ricardo
