Re: [Monotone-devel] monotone win32 crlf questions


From: Nathaniel Smith
Subject: Re: [Monotone-devel] monotone win32 crlf questions
Date: Fri, 2 Jul 2004 11:45:30 -0700
User-agent: Mutt/1.5.6i

On Wed, Jun 30, 2004 at 10:21:36PM -0600, Derek Scherger wrote:
> Has anyone else noticed that monotone diff seems to double up CRLF's on 
> windows? I haven't really tried to track this down yet, but I've been 
> tracking my current project at work (on winxp) in monotone just to see how 
> things go. Generally it's working great and the only "problem" I've had is 
> that every time I do a diff I seem to have double spaced code.  od -c on 
> the monotone diff output shows two \r\n \r\n sequences where there is one 
> in the source file.
> 
> The files themselves seem to be ok in that they don't appear to have 
> doubled up \r\n sequences.

I have a guess here: the diff algorithm operates on text lines.
I bet it's interpreting both "\r" and "\n" as line ending characters,
so "\r\n" looks to it like the end of one line, and then a blank line.
(And then after it performs the diff on these doubled-up lines, it
prints them out again with \r\n at the end of each, because you're on
windows.)
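
To make that guess concrete, here is a small Python sketch. It is not
Monotone's diff code; split_lines_naive and emit are made-up names that
only model the behavior I am speculating about, where every "\r" and
every "\n" counts as its own line terminator and lines are written back
out with "\r\n".

```python
import re


def split_lines_naive(text):
    """Hypothetical splitter: each '\\r' and each '\\n' ends a line."""
    parts = re.split(r"[\r\n]", text)
    # A trailing terminator leaves one empty string at the end; drop it.
    if parts and parts[-1] == "":
        parts.pop()
    return parts


def emit(lines, eol="\r\n"):
    """Write every line back out with the platform line ending."""
    return "".join(line + eol for line in lines)


source = "a\r\nb\r\n"
lines = split_lines_naive(source)   # each "\r\n" yields an extra empty line
output = emit(lines)                # the empty lines get their own "\r\n"
```

With this model, each "\r\n" in the input comes out as "\r\n\r\n", which
matches the doubled sequences od -c showed.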

I'm not actually sure if this should be considered a bug or what.  You
don't have line-ending conversion enabled (see below), so Monotone has
no idea what line endings you intended to use in the files being
diffed, and "\r", "\n", and "\r\n" are all valid line endings in
different contexts.

This shouldn't affect anything except 'diff', though; in monotone, diff
is only used to produce pretty displays for the user.

> I have not defined the get_linesep_conv or get_charset_conv hooks and I'm 
> using monotone under cygwin on winxp.
> 
> Under linux my files seem to have \r\n line endings as well although I was 
> expecting that they would have only \n having read the following section in 
> the manual.
> 
> "Note that Line ending conversion is always performed on the internal 
> character set, when both character set and line ending conversion are 
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> enabled; this behavior is meant to encourage the use of the monotone's 
  ^^^^^^^
> "normal form" (UTF-8, '\n') as an internal form for your source files, when 
> working with multiple external forms. Also note that line ending conversion 
> only works on character encodings with the specific code bytes described 
> above, such as ASCII, ISO-8859x, and UTF-8. "

By default, these are both _disabled_; if you want Monotone to munge
around with your files, you have to ask it to explicitly, by defining
the get_charset_conv or get_linesep_conv hooks.  Monotone prefers to
store text in LF (= "\n") format, so in this case it sounds like you
might want to add something like this on all your Windows boxen:
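  function get_linesep_conv(name)
    -- Returns {internal, external} line-ending names for the file `name`.
    -- Examples for leaving binary files untouched
    -- ("$" anchors the match to the end of the name):
    if (string.find(name, "%.jpg$")) then return {"LF", "LF"} end
    if (string.find(name, "%.png$")) then return {"LF", "LF"} end
    -- Everything else: LF in the repository, CRLF in the working copy.
    return {"LF", "CRLF"}
  end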

> Thinking about this a bit more though it seems that normalizing line 
> endings in source files would cause some rather serious problems with hash 
> computations. Files with \n line separators on a linux system would hash 
> wildly differently than the same files with \r\n line separators on a 
> windows system.

Yes, Monotone takes care to always hash the normalized version of each
file.  This means that if you're storing files in "\n"-mode in the
repository but have them in "\r\n"-mode in the working copy, then
running sha1sum on the working-copy file will no longer give you the
hash Monotone uses... but Monotone will still get it right.
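
As a rough illustration (again a Python sketch rather than Monotone's
actual code; normalize_to_lf and normalized_sha1 are hypothetical helper
names), hashing the raw bytes of a CRLF working copy gives a different
SHA-1 than hashing the LF form, while normalizing first gives the same
hash on both platforms:

```python
import hashlib


def normalize_to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings to the LF normal form (assumed here)."""
    return data.replace(b"\r\n", b"\n")


def normalized_sha1(data: bytes) -> str:
    """SHA-1 of the normalized content, not of the on-disk bytes."""
    return hashlib.sha1(normalize_to_lf(data)).hexdigest()


lf_copy = b"hello\nworld\n"        # as stored in the repository
crlf_copy = b"hello\r\nworld\r\n"  # as checked out on Windows

raw_lf = hashlib.sha1(lf_copy).hexdigest()
raw_crlf = hashlib.sha1(crlf_copy).hexdigest()  # what sha1sum would see
```

Here raw_lf and raw_crlf differ, but normalized_sha1 returns the same
value for both copies, and that value equals raw_lf.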

-- Nathaniel

-- 
The Universe may  /  Be as large as they say
But it wouldn't be missed  /  If it didn't exist.
  -- Piet Hein



