gnu-arch-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Gnu-arch-users] Re: semantics of restricted mkpatch


From: Denys Duchier
Subject: [Gnu-arch-users] Re: semantics of restricted mkpatch
Date: Tue, 07 Oct 2003 20:18:37 +0200
User-agent: Gnus/5.1003 (Gnus v5.10.3) Emacs/21.3 (gnu/linux)

Tom Lord <address@hidden> writes:

>   And ideally: it would be nice to begin to have a mathematical model of
>   how we think limits should work -- a model of what the user
>   documentation would say; a model that describes how users are supposed
>   to think about LIMITS.  Then, it would be nice to have a proof that
>   any proposed algorithm actually implements that model.

My idea of LIMITS is that they select what you are interested to put
into a changeset.  It is then the job of mkpatch to pull-in whatever
else is necessary in order to produce an integrity-preserving
changeset.

Indeed, I'd like to prove that my algorithm provides such a guarantee
(and fix it if it doesn't).

I apologize for the lengthy quoting below, but I think the examples
need to be preserved in my answer.

> * case 1 -- ignoring renames
>
>   Let's suppose that we have 
>
>       ORIG TREE:                      tag     modified
>
>               ./dir-A                 T1      
>                   ./dir-A/file-A      T2
>                   ./dir-A/file-b      T3
>
>       MOD TREE:
>
>               ./dir-B                 T1
>                   ./dir-B/file-A      T2      yes
>                   ./dir-B/file-b      T3      yes
>
>
>   and ask to compute a changeset with "LIMITS == { dir-B/file-A }"
>
>   The directory with tag T1 is _outside_ the scope of LIMITS and so
>   we'd expect the rename of that directory to be ignored.   The file
>   with tag T3 is outside of the limits -- changes to that file should
>   be ignored.
>
>   The changeset generated should be the same as if we had replaced 
>   MOD with:
>
>       LOGICAL-MOD TREE
>
>               ./dir-A                 T1
>                   ./dir-A/file-A      T2      yes
>                   ./dir-A/file-b      T3
>
>   and computed a changeset without any LIMITS.
>
>   If I follow your code correctly, it will correctly record 
>   changes to the T2 file -- but it will also include the 
>   rename of directory T3.

if you mean file T3, then no.  If you mean directory T1, then yes - at
least that was my intent.  I see now how your interpretation is also a
very reasonable option in this case because it is not necessary to
pull the directory rename into the changeset in order to obtain an
integrity-preserving changeset.  In other words, I did not produce a
MINIMAL changeset.

>   In general that seems like a bad 
>   idea.  Consider this variation:
>
>       ORIG TREE:                      tag     modified
>
>               ./dir-A                 T1      
>                   ./dir-A/file-A      T2
>                   ./dir-A/file-b      T3
>                 ./dir-B                       T4
>                   ./dir-B/file-c      T5
>
>       MOD TREE:
>
>               ./dir-B                 T1
>                   ./dir-B/file-A      T2      yes
>                   ./dir-B/file-b      T3      yes
>                 ./dir-C                       T6
>                   ./dir-C/file-c      T5
>
>    If your algorithm produces a changeset that records
>    the rename of directory T1, then it must also record
>    the addition of T6 and rename of T5 -- this seems like
>    a very strange result for "LIMITS = { dir-A/file-A }".

No, that is incorrect.  Here is my interpretation:

   - the limit matches

        dir-A/file-A    (causes dir-A to be "needed")

   - therefore we include in the changeset

        rename dir-A -> dir-B (pulled in because dir-A is "needed")
        modif dir-A/file-A -> dir-B/file-A

   anything else is neither selected not "needed"

> * case 2 -- detecting deletions
>
>   Suppose we had:
>
>
>       ORIG TREE:                      tag     modified
>
>       ./dir-A                         T1      
>           ./dir-A/dir-B                       T2
>             ./dir-A/dir-B/file-A      T3              
>
>       MOD TREE:
>
>       ./dir-C                         T1      
>           ./dir-C/dir-B                       T2
>
>
>    and we compute a changeset with "LIMITS == { dir-C/dir-B }"
>
>    Once again, the rename of the T1 directory should be ignored:
>
>
>       LOGICAL-MOD TREE:
>
>       ./dir-A                         T1      
>           ./dir-A/dir-B                       T2
>
>    so the changeset records the deletion of file T3.

Here is my interpretation:

   - the limit matches:

        dir-A/dir-B             (causes dir-A to be needed)
        dir-A/dir-B/file-A

   - therefore we include in the changeset

        rename dir-A -> dir-C
        delete dir-A/dir-B/file-A

(here, as you, for simplicity, I am leaving out considerations of
metadata files - but these have to be dealt with too)

> * case 3 -- impossible limits?
>
>   A slight variation on the previous example is:
>
>       ORIG TREE:                      tag     modified
>
>       ./dir-A                         T1      
>           ./dir-A/dir-B                       T2
>             ./dir-A/dir-B/file-A      T3              
>
>       MOD TREE:
>
>       ./dir-C                         T1      
>           ./dir-C/dir-B                       T2
>         ./dir-D                               T4
>           ./dir-D/file-A              T3      yes
>
>    with "LIMITS == { dir-C/dir-B }"

My interpretation:

   - the limits match:

        dir-C/dir-B             (causing dir-C to be needed)

   - therefore we include in the changeset:

        rename dir-A -> dir-C

     that's all (unless dir-A/dir-B -> dir-C/dir-B is considered a
     rename, I don't remember if arch considers that to be the case)

>    If you don't consider that example convincing (perhaps you want to
>    say that in this case the resulting changeset should be empty),

using an argument of minimality, yes, I'd say it should be empty.

> how
>    about this one:
>
>       ORIG TREE:                      tag     modified
>
>       ./dir-A                         T1      
>           ./dir-A/dir-B                       T2
>             ./dir-A/dir-B/file-A      T3              
>
>       MOD TREE:
>
>       ./dir-C                         T1      
>         ./dir-D                               T4
>           ./dir-D/file-A              T3      yes
>
>    with "LIMITS == { dir-C }".

My interpretation:

   - the limits match:

        dir-C
        dir-C/dir-D
        dir-C/dir-D/file-A

   - therefore we include in the changeset:

        rename dir-A -> dir-C
        new  dir-C/dir-D
        modif dir-A/dir-B/file-A -> dir-C/dir-D/file-A

   (again, maybe there is rename implied - I forget)

>    Presumably the changeset should show the deletion of dir-B but not
>    the addition of dir-D

it is the opposite it will record the addition of dir-D but not the
deletion of dir-B.

> * tricky to express limits 
>
>    Consider trying to limit mkpatch to a file deletion:
>
>       ORIG TREE:                      tag     modified
>
>       ./dir-A                         T1      
>           ./dir-A/file-A              T2
>
>       MOD TREE:
>
>       ./dir-B                         T1      
>
>    with "LIMIT == { dir-B/file-A }"
>
>    Conceptually, clearly, this should be the same as:
>
>       LOGICAL-MOD TREE:
>
>       ./dir-A                         T1
>
>    but in practice that's quite tricky.   How is mkpatch 
>    supposed to figure out that the file named in the limits 
>    ("dir-B/file-A"), which doesn't exist in the MOD tree, 
>    is the same file that had tag T2 in the ORIG tree?

In my interpretation, LIMITS apply to, and thus select from, both ORIG
and MOD.  You could use a limit dir-A/file-A if you want to include
file-A into the changeset eventhough it no longer exists in MOD.

> 1) It appears that you are pruning both the ORIG and the MOD trees
>    according to how the limits compare to the path names to files in
>    either tree.   I think that the limits specs should refer to paths 
>    in the mod tree only (with a partial exception for limits naming 
>    deleted files).

I know that was your preferred interpretation, but I disagree with
it.  I think we should be able to apply LIMITS to either or both
trees.  Since I did not want to create a complicated LIMITS language,
especially when experimenting, I have opted for a simple language
where LIMITS always apply to both trees.  We could certainly make the
language more expressive, but this is a completely orthogonal issues
to that of proving that the algorithm is integrity-preserving.  For
the purpose of our design, we can simply assume that LIMITS is a pair
of predicates: LIMITS.ORIG and LIMITS.MOD.

> 2) I don't see any code that handles the case of "impossible limits"
>    described above.

as explained, it is not at all impossible, but quite straightforward,
unless I missed the point which is entirely too likely.

> What I _fear_ about this code (simply because I haven't seen any math
> that would prove otherwise) is that it is capable of producing a
> changeset which would not apply cleanly to ORIG (which would be
> disasterous).

Yes, I understand this worry.  With your concern about minimality, the
problem seems to me harder than the one I tried to solve (but perhaps
this is only superficial).  If you forget about minimality, then there
is not much of a problem: you take a full changeset (therefore one
which is guaranteed to apply cleanly) and, (thanks in part to my
notion of a NEEDED directory) only throw out parts affecting entire
subtrees.  This needs to be formalized.  I'll work on it.

Thanks a lot for this thourough analysis.  I think we should proceed
by cataloguing the types of elements in a changeset and formalize
their dependencies.  Hopefully, in this fashion (e.g. using a closure
algorithm) we will be able to compute provably minimal
integrity-preserving changesets.

Cheers,

-- 
Dr. Denys Duchier
Équipe Calligramme
LORIA, Nancy, FRANCE




reply via email to

[Prev in Thread] Current Thread [Next in Thread]