info-cvs

Re: Smoke, FUD (was Re: CVS corrupts binary files ...)


From: Mark D. Baushke
Subject: Re: Smoke, FUD (was Re: CVS corrupts binary files ...)
Date: Tue, 29 Jun 2004 02:37:44 -0700

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Paul Sander <address@hidden> writes:

> >--- Forwarded mail from address@hidden
> 
> Rather than use a hint to expose an
> implementation detail, I suggest recording a
> data type instead. Maybe even a MIME type. Then
> provide a suitable mechanism to map data types
> to tools that are appropriate to the
> environment.

I have no fundamental objection to saving the MIME
type. I suggest that it may need to be inside of a
string to pass the syntax of rcsfile(5). I would
actually suggest that it might be useful to just
borrow both of the MIME media-type and charset
concepts. That might allow for a 

  "media-type text/plain;"
  "charset ks_c_5601-1987;"

on a given file. The defaults should probably
be "text/plain" and either iso-8859-1 or utf-8.
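
For illustration only, here is a minimal Python sketch of how such a
pair of phrases might be written so that it passes the rcsfile(5)
grammar, in which a newphrase is an id followed by words and a closing
';', and a string is delimited by '@' with any embedded '@' doubled.
The phrase names media-type and charset come from the suggestion above;
the function name and the choice to always quote the value as a string
are assumptions, not anything CVS or RCS implements.

```python
def format_newphrase(key: str, value: str) -> str:
    """Render one hypothetical RCS newphrase as 'key<TAB>@value@;'.

    rcsfile(5) strings are @-delimited, and a literal '@' inside the
    string is written as '@@'.
    """
    # Reject keys containing rcsfile(5) special characters or whitespace.
    special = set("$,.:;@")
    if not key or any(c in special or c.isspace() for c in key):
        raise ValueError(f"not usable as an RCS id: {key!r}")
    return f"{key}\t@{value.replace('@', '@@')}@;"


print(format_newphrase("media-type", "text/plain"))
print(format_newphrase("charset", "ks_c_5601-1987"))
```

Quoting the value as a string sidesteps media types that contain
characters such as '.', which the RCS grammar treats as special.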

> BTW, CVS no longer uses rcsmerge; it co's the
> necessary versions and runs diff3 directly. So
> in a CVS context, pushing this capability down
> to RCS isn't really a requirement. However, I
> recognize the usefulness of doing so, and would
> not oppose such a feature. On the other hand,
> doing so will likely be a duplication of effort
> because CVS has client/server concerns that RCS
> does not, and that may necessitate a different
> implementation.

Yes, I am aware that CVS no longer uses rcsmerge.
However, Greg was suggesting that RCS
compatibility would be broken by an extension such
as the one outlined in the thought experiment I
provided, so I felt it reasonable to mention how
RCS itself used diff3 in the past.

> >Given that this would appear to be the desire of
> >at least a few folks out there who might want to
> >make CVS do a better job at merging structured
> >ASCII files such as XML or HTML format. And
> >further, that you seem to have objections to this
> >approach. And while I have known you to bring up
> >points I have overlooked in the past...
> 
> Not just structured ASCII files as you describe,
> but any file containing structured data for
> which a merge tool is available.

Ahh, but I am not really trying to suggest that
"binary files" are suitable in the general case
for CVS control. That is a separate argument.

That said, I suppose that a merge utility that
understands how to merge a file containing lines
in a non-ISO-LATIN character set might also fall
into the category of a diff3 replacement and that
such files might be considered 'binary' by some
programs.

> >This time around I just do not see anything that
> >would preclude such an approach of using an
> >external diff3 hint 'replacement' program for
> >doing a 'cvs update -jtag1 -jtag2' operation.
> 
> >I will stipulate that such a program will likely
> >need to live on the server and furthermore that it
> >would not be interactive. In the absence of
> >finding such a program, CVS would likely resort to
> >using diff3 as a fallback, so its arguments would
> >likely need to match those of the diff3 program
> >itself... at least to the extent that cvs currently
> >uses various arguments to diff3.
> 
> I don't believe that such a program MUST live on
> the server.

The changes needed to allow the client-side to do
a merge are very large. I am not willing to
stipulate an implementation that would allow CVS
to deal with an interactive merge operation for a
random 'cvs update' command. The repository would
have a lock open for too long in that case.

> Merge tools, like editors, have a way of
> becoming religious icons, in situations where
> users have a choice. Under such circumstances,
> it becomes important to have client side
> mappings between data types and merge tools.

Your arguments almost help to make a case in
Greg's favor against allowing a diff3 replacement.

The kind of flexibility you desire is not
something that I think makes sense to bolt into
the 'diff3' slot.

What you propose would potentially best be handled
with an entirely new kind of update paradigm.
Possibly the use of a CVS/Base/file file and a
'patch' that would bring CVS/Base/file up to the
latest version would be 'better' in this case...

> Additionally, I don't believe that merge tools
> necessarily need to be fully automated.

Here we do not agree. Without such automation,
lock contention on directories could get very
intense.

> After the relevant versions have been downloaded
> to the client (and the repository locks have
> been cleared), the merge tools can run
> interactively. However, I believe that CVS
> currently intersperses merges with downloads, and
> that would need to change before interactive
> merges can be supported.

The current CVS operations all occur on the server
side prior to downloading patches to the client.

What you are suggesting is a fairly major overhaul
to the cvs client/server protocol and as such
there is probably a 'better' way to deal with this
than a 'simple' alternative table of diff3-style
programs to do alternative merger algorithms.

> Also, CVS currently relies on diff3-style
> mark-ups to warn the user when merge conflicts
> remain present at commit time.

Yes, I should have stated that a failed merger
will probably still need to leave markers not
unlike the existing conflict markers of the
current diff3 program.

> Though strictly speaking such warnings are not
> necessary, they are incredibly useful. And
> they'll be lost unless merge conflicts are
> recorded another way.

Actually, merge conflicts are already recorded in
CVS/Entries. If the datestamp of the file has not
been touched, the file will still show up as a
'conflict' on a 'cvs status' command.
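
As a rough sketch of that check, assuming the documented CVS/Entries
layout /name/revision/timestamp[+conflict]/options/tagdate, the test
amounts to comparing the recorded conflict stamp with the file's
current modification time. The function name is hypothetical, and the
caller is assumed to supply the modification time already formatted
the same way as the stamps in CVS/Entries.

```python
def conflict_unresolved(entries_line: str, file_mtime_stamp: str) -> bool:
    """Return True if the entry records a conflict and the file is untouched.

    entries_line is one file line from CVS/Entries.
    file_mtime_stamp is the file's current modification time, formatted
    like the stamps in CVS/Entries (an assumption of this sketch).
    """
    fields = entries_line.rstrip("\n").split("/")
    if len(fields) < 6 or fields[0] != "":
        raise ValueError(f"not a file entry: {entries_line!r}")
    timestamp_field = fields[3]
    if "+" not in timestamp_field:
        return False  # no conflict was recorded for this file
    _, conflict_stamp = timestamp_field.split("+", 1)
    # An unchanged modification time means nobody has edited the file
    # since the merge wrote its conflict markers.
    return conflict_stamp == file_mtime_stamp


line = "/foo.c/1.4/Result of merge+Tue Jun 29 09:37:44 2004//"
print(conflict_unresolved(line, "Tue Jun 29 09:37:44 2004"))
```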

> One way is to list conflicts in a file stored
> in the CVS directory. At commit time, skip the
> scan for diff3 mark-ups and instead read the
> conflict list and compare mod times of the
> relevant files. If they have changed, assume the
> conflicts have been resolved.

This is sounding more and more ugly.

> >Let me state the scope of the thought experiment:
> 
> >Goal: Provide a means whereby a cvs administrator
> >may cause a program other than diff3 to be used
> >when doing merge operations as a part of a
> >three-way merge of files in a sandbox. This
> >program might be defined as a keyword used as the
> >value of a 'diff3hint' followed by an 'id' which
> >could be looked up in a table that cvs could keep
> >to determine which executable and any additional
> >arguments above the diff3 form arguments might be
> >required.
> 
> Again, I think that recording a data type is a
> more straightforward (or at least more easily
> understood) implementation.

Sure, that makes sense.

> >Assertion: The diff3 replacement must handle
> >all of the args that cvs normally passes to diff3.
> 
> Yes.
> 
> >Assertion: The diff3 replacement must not be
> >interactive in nature for client/server repository
> >uses.
> 
> Well, okay for the first implementation.  :-)

The other requirements you have outlined above
would take a lot more work and have a high
potential to get things wrong.

> >Assertion: The diff3 replacement must be able to
> >run just given the three versions of the file
> >without any other state.
> 
> Yes, but it would be nice to be able to pass in
> the version numbers for column headings or the
> like, if the tool permits.

Right. Of course, CVS does pass those arguments to
diff3, so there are no real problems there. My
point was that actually passing the MIME type or
other information into the new program would
probably NOT be possible.
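
A minimal sketch of the selection step under those constraints might
look like the following. The table contents and the program names
xmlmerge3 and htmlmerge3 are made up for illustration, and the `which`
parameter exists only so the lookup can be exercised without the
programs installed. The data type only chooses the program; the
program itself sees nothing but the diff3-form arguments.

```python
import shutil
from typing import Callable, Dict, List, Optional

# Hypothetical administrator-maintained table: data type -> program name.
MERGE_TOOLS: Dict[str, str] = {
    "text/xml": "xmlmerge3",    # made-up program name
    "text/html": "htmlmerge3",  # made-up program name
}


def merge_command(
    media_type: str,
    diff3_args: List[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Build the argv for a three-way merge.

    The replacement receives exactly the diff3-form arguments and
    nothing else. If no program is registered for the type, or the
    registered program cannot be found, fall back to plain diff3.
    """
    program = MERGE_TOOLS.get(media_type)
    if program is None or which(program) is None:
        program = "diff3"
    return [program, *diff3_args]


# Illustrative arguments only, not necessarily what cvs passes.
print(merge_command("text/plain", ["-m", "mine", "older", "yours"]))
```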

> >Assertion: That cvs continue to write new RCS files
> >in adherence to the syntax defined in rcsfile(5), but
> >allowing the introduction of one or more new phrases
> >and associated id word values as allowed for by the
> >RCS format syntax.
> 
> Yes. Should the implementation support changing
> these values after they've been set initially?

Possibly. It may also be 'useful' to have each
version have a MIME-type with the entire file
having a default MIME-type for a newly added
version of the same thing as the predecessor
version. This would allow one branch to have a
file that is in English and another file that is
in Chinese.
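
One way to picture that inheritance, purely as a sketch: a revision
with no type of its own takes the type of its nearest predecessor that
has one, and the file-wide default applies only when no predecessor
does. The dictionaries and the function name below are hypothetical
and do not correspond to anything stored in an RCS file today.

```python
from typing import Dict, Optional


def media_type_for(
    revision: str,
    explicit: Dict[str, str],
    predecessor: Dict[str, Optional[str]],
    file_default: str = "text/plain",
) -> str:
    """Resolve the media type of a revision.

    Walk back through predecessors until a revision with an explicitly
    recorded type is found; use the file-wide default if none is.
    """
    current: Optional[str] = revision
    while current is not None:
        if current in explicit:
            return explicit[current]
        current = predecessor.get(current)
    return file_default


# Trunk stays at the default; the branch records its own type once.
predecessor = {"1.1": None, "1.2": "1.1", "1.2.2.1": "1.2", "1.2.2.2": "1.2.2.1"}
explicit = {"1.2.2.1": "text/html"}
print(media_type_for("1.2.2.2", explicit, predecessor))
print(media_type_for("1.2", explicit, predecessor))
```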

> And are they set initially at the time the RCS
> file is created or at commit time?

I'd guess it would be handled much as the
-k<value> set of switches are handled. You can
use the -k switch on 'cvs import' or 'cvs add'
or 'cvs admin' or 'cvs checkout' or 'cvs update'
and have it do something reasonable.

It is less clear how these attributes would need
to be stored in the CVS/Entries file if they were
ever used on a 'per checkout' basis.

> >It would be left to the extension designer to
> >determine the method whereby such a new RCS
> >phrase would be written into the CVS repository
> >versions of the files.
> 
> It's easier to set it when the file is created.

Possibly.

> CVS already writes RCS files in the proper
> format without using the rcs program to
> initialize them.

True, but it does depend on the -k<subst> values.

> The ci program doesn't permit the insertion of
> newphrases in its present form, so there's no
> good way at present to insert newphrases in the
> delta section of the RCS file at all.

Yes, this is true. It might need to be a 'cvs admin'
option on a per revision basis...

        -- Mark
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (FreeBSD)

iD8DBQFA4Tho3x41pRYZE/gRAhwBAJ917ge/wGEz+VsD13mF7j3zSy00IgCfQmIJ
YwP+aSCJELItbIC/MSo/gAY=
=I0tp
-----END PGP SIGNATURE-----



