Re: [Monotone-devel] Re: line endings with 0.31


From: hendrik
Subject: Re: [Monotone-devel] Re: line endings with 0.31
Date: Tue, 21 Nov 2006 09:48:54 -0500
User-agent: Mutt/1.5.9i

On Tue, Nov 21, 2006 at 03:29:12AM -0800, Larry Hastings wrote:
> 
> Once again, I reflect: if I had a time machine, /after/ I made my 
> millions, I'd go back and cause MS-DOS 1.0 to use just LF for EOL and 
> "-" for program options, freeing up "/" for use as the directory 
> separator under MS-DOS 2.0.  How many hours of our lives have been 
> wasted on these needless differences?  ("And, Tim, while we're on the 
> subject, we need to talk about these 'drive letter' things...")

Just for the historical record, this famous Microsoft "incompatibility"
is actually a point on which they followed the official standard.
ASCII was designed without a newline symbol; instead, it had a
carriage return, which returned the teletype carriage to the
beginning of the current line (permitting it to be overwritten), and
a line feed character, which moved to the next line while leaving the 
carriage alone (thus effectively moving a cursor straight down).

As a convenience to people using keyboards, it was recommended that
input and text-editing software be able to accept a single carriage 
return as an indication to move to the start of the new line, 
generating both a CR and a LF.  But this was a suggestion as to what 
carriage return could do as an editing command, not a part of the 
character code.

It was Unix that broke from this convention, breaking the CR-LF
standard and using LF instead of CR as a newline character.

Some 8-bit versions of ASCII have a newline control character now, and 
Unicode has a code point for it (two bytes in UTF-8).  It seems never to 
be used.

This history lesson doesn't have much of an effect on monotone, but, 
much as I detest many of Microsoft's policies, this is one case where it 
wasn't their fault.

To avoid problems, I have found it useful to have all my programs 
recognise CR and LF as newline characters, and to ignore one CR 
following an end-of-line LF and to ignore one LF following an 
end-of-line CR.  In situations where my code is not the final consumer 
of the text, I preserve whatever line-end coding exists in the 
original, whether it follows one convention or another or is 
completely inconsistent.
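
As a rough illustration, the reading rule above could be sketched in
Python as follows.  The name split_lines is hypothetical and is not
taken from monotone or any other program; the sketch assumes the whole
text is already in memory as a string.

```python
def split_lines(text):
    """Split text into (content, ending) pairs.

    CR and LF each end a line.  One LF directly after an end-of-line
    CR, or one CR directly after an end-of-line LF, is folded into the
    same ending.  The original ending is kept alongside each line, so
    the input can be reproduced exactly.
    """
    lines = []
    start = 0
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch in "\r\n":
            end = i + 1
            # Absorb at most one partner character of the other kind.
            if end < n and text[end] in "\r\n" and text[end] != ch:
                end += 1
            lines.append((text[start:i], text[i:end]))
            start = i = end
        else:
            i += 1
    if start < n:
        # Final line with no line-end character at all.
        lines.append((text[start:], ""))
    return lines


def join_lines(lines):
    """Reassemble the text, preserving every original ending."""
    return "".join(content + ending for content, ending in lines)
```

Pairing is done greedily from left to right, so a run such as LF CR LF
is read as one LF-CR ending followed by a lone LF; that is one possible
reading of an inherently ambiguous sequence.  Because the endings are
carried along unchanged, join_lines(split_lines(text)) gives back the
original text even when the endings are completely inconsistent.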

In monotone, I suggest that a file that has been character-converted on 
checkout have its line-end coding reverted on checkin, on a 
line-by-line basis.  Thus only when the user explicitly edits line ends 
will the end-of-line coding be changed in the repository.  This would 
have the effect that if massive damage is done to a true binary file 
because it is mistakenly line-end-converted, the damage would be mostly 
undone on subsequent checkin.

It is probably not the most convenient load to dump on the diff 
engine, though I can imagine algorithms.
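
One such algorithm might look like the sketch below.  It is only an
illustration under stated assumptions, not monotone's behaviour: the
names restore_endings, checkout_ending and repo_default are
hypothetical, Python's difflib stands in for the diff engine, and a
working-copy ending that differs from what checkout would have written
is taken to be an explicit user edit.

```python
from difflib import SequenceMatcher


def split_lines(text):
    """Split text into (content, ending) pairs; CR, LF, CR-LF and
    LF-CR are all accepted as endings and kept with their line."""
    lines, start, i, n = [], 0, 0, len(text)
    while i < n:
        ch = text[i]
        if ch in "\r\n":
            end = i + 1
            if end < n and text[end] in "\r\n" and text[end] != ch:
                end += 1
            lines.append((text[start:i], text[i:end]))
            start = i = end
        else:
            i += 1
    if start < n:
        lines.append((text[start:], ""))
    return lines


def restore_endings(repo_text, work_text,
                    checkout_ending="\r\n", repo_default="\n"):
    """Return work_text with repository line endings put back.

    repo_text is the version stored in the repository; work_text is
    the edited working copy, whose endings were converted to
    checkout_ending on checkout.
    """
    repo = split_lines(repo_text)
    work = split_lines(work_text)
    matcher = SequenceMatcher(None,
                              [content for content, _ in repo],
                              [content for content, _ in work],
                              autojunk=False)
    # Map each unchanged working-copy line to its repository ending.
    repo_ending_for = {}
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            for k in range(i2 - i1):
                repo_ending_for[j1 + k] = repo[i1 + k][1]
    out = []
    for j, (content, work_ending) in enumerate(work):
        if work_ending != checkout_ending:
            # Not what checkout wrote: treat as a deliberate edit
            # (or a final line with no ending) and keep it as is.
            ending = work_ending
        else:
            # Unchanged line: revert to the stored ending.  New or
            # changed lines fall back to repo_default.
            ending = repo_ending_for.get(j) or repo_default
        out.append(content + ending)
    return "".join(out)
```

With these assumptions, a file whose stored endings were mixed and
which was converted wholesale to CR-LF on checkout comes back
unchanged on checkin if nobody touched it, and only inserted or
modified lines pick up repo_default.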

-- hendrik



