emacs-orgmode

Re: tangle option to not write a file with same contents?


From: Max Nikulin
Subject: Re: tangle option to not write a file with same contents?
Date: Sat, 30 Oct 2021 22:13:20 +0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0

On 30/10/2021 00:58, Greg Minshall wrote:

>> Some hash-based build systems are mentioned in that thread. Since that
>> time more similar tools have appeared, e.g. buck,
>> reimplementations of DJB's redo https://cr.yp.to/redo.html
>
> i think different people will settle on different build tools.

Greg, I see your point, and I am often reluctant to change an established workflow as well, partly because doing so takes effort. This particular issue should be handled in Org code. (Unfortunately, that takes effort as well.) On the other hand, it could be treated in a more general way by an external hash-and-cache build tool.

Actually, I have no suggestion for a particular build system. E.g. buck is too heavy (Python + Java), and my experience with it has not been purely positive.

>> It seems `compare-buffer-substrings` has more logic than just byte to
>> byte comparison. Is it to handle alternative representations of
>> Unicode characters? For tangled buffer it should be size that is
>> checked first...
>
> you are right, it definitely makes sense to look first at size.  (which
> is what, e.g., rsync(1) does.)  also, probably i needn't have mentioned
> `compare-buffer-substrings` -- i was really just trying to suggest
> "simple" (which maybe i anti-did?).

I think `compare-buffer-substrings` is a good starting point. It is ready to use, and I am not aware of any obvious problems with it. (Could a change of file encoding be discarded because the buffers compare equal?) I was just curious whether the function performs a size check. It does, but the comparison is not a pure identity test, so the check comes at the end of the function.

Meanwhile, I realized that the check for modification by the user should be performed *before* tangling, and a hash is appropriate for detecting such changes. I think keeping a copy of the tangled file just to detect modification would cause some friction with users.
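To make the idea concrete, here is a minimal sketch in Python (chosen only for illustration, it is not Org code). It assumes the digest written by the previous tangle is stored somewhere and passed in as `recorded_digest`; that storage, the choice of SHA-256, and both function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Optional


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def modified_since_last_tangle(target: Path, recorded_digest: Optional[str]) -> bool:
    """Report whether `target` differs from what the previous tangle wrote.

    `recorded_digest` is assumed to be the digest saved after the previous
    tangle (None if there was none). Only the digest is kept, so no copy of
    the tangled file is needed.
    """
    if recorded_digest is None or not target.exists():
        # Nothing to compare against: treat as "not modified by the user".
        return False
    return file_digest(target) != recorded_digest
```

A caller would run this check before tangling and, if it returns true, ask the user before overwriting the file.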

Comparison of the earlier and current tangle results should be done at the end, so the two implementations should be independent. There is no point in using a hash here: size plus byte-by-byte comparison is fast and reliable.
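A rough sketch of that comparison, again in Python purely for illustration (in Org itself something like `compare-buffer-substrings` would play this role). The function names are hypothetical, and the new tangle result is assumed to be available in memory as bytes.

```python
from pathlib import Path


def same_content(target: Path, new_content: bytes) -> bool:
    """Compare sizes first; read and compare bytes only if the sizes match."""
    try:
        if target.stat().st_size != len(new_content):
            return False
    except FileNotFoundError:
        # No earlier tangle result exists, so the content counts as different.
        return False
    return target.read_bytes() == new_content


def write_if_changed(target: Path, new_content: bytes) -> bool:
    """Write `new_content` only when it differs; return True if written."""
    if same_content(target, new_content):
        return False  # file left untouched, so its timestamp is preserved
    target.write_bytes(new_content)
    return True
```

Leaving an unchanged file untouched keeps its modification time, so timestamp-based tools such as make would not rebuild needlessly.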

A subtle point, partially discussed earlier, is overwriting the content of an existing file vs. tangling to a temporary file and atomically replacing the target. In most cases the latter is preferred. However, if the target file is open in an editor for debugging, the content should be written to the existing file (preserving the inode). This allows unsaved changes to be preserved if the editor notifies the user that the file has been modified.
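The two strategies can be contrasted with a short sketch, in Python for illustration only and assuming a POSIX file system. Both function names are hypothetical, and this simplified version does not carry over the permissions of the original file.

```python
import os
import tempfile
from pathlib import Path


def replace_atomically(target: Path, content: bytes) -> None:
    """Write to a temporary file in the same directory, then rename it over
    `target`. Readers never see a half-written file, but `target` ends up
    with a new inode."""
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=target.name + ".")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(content)
        os.replace(tmp_name, target)  # atomic rename on POSIX
    except BaseException:
        os.unlink(tmp_name)  # do not leave the temporary file behind
        raise


def overwrite_in_place(target: Path, content: bytes) -> None:
    """Truncate and rewrite the existing file, keeping its inode. The write
    is not atomic, but the file is still the same one an editor has open."""
    with open(target, "wb") as f:
        f.write(content)
```

Which function to call would depend on whether the target is known to be open elsewhere; how to detect that is left open here.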



