Re: CVS and unicode

From: Christian Hujer
Subject: Re: CVS and unicode
Date: Sun, 11 Sep 2005 14:12:10 +0200
User-agent: KMail/1.7.1


On Sunday, 11 September 2005 01:53, Pierre Asselin wrote:
> Christian Hujer <address@hidden> wrote:
> > [ ... ]  The CRLF byte sequences are:
> > ASCII: 0x0D 0x0A.
> > UTF-8: 0x0D 0x0A.
> > UTF-16 LE: 0x0D 0x00 0x0A 0x00.
> > UTF-16 BE: 0x00 0x0D 0x00 0x0A.
> >
> > CVS will not interfere with any of these.
> > The UTF-16 LE sequence will be split within the LF char. But since the next
> > line will be split at exactly the same point, this is not a problem for
> > line diffs.
> A UTF-16 file can contain octet sequences like (xx 0D)(0A yy) that
> CVS will mistake for line endings.
Ah okay, true, I didn't think about this.
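
To make the point concrete, here is a small Python sketch that prints the CRLF
byte sequences listed above and then builds UTF-16 text that contains the
octets 0D 0A without containing any line break. The sample characters (U+010D,
U+0A05, U+0D05, U+010A) are chosen only for illustration; any pair of the form
(xx 0D)(0A yy) behaves the same way.

```python
# CRLF as encoded in each of the encodings discussed in the thread.
crlf = "\r\n"
encodings = ["ascii", "utf-8", "utf-16-le", "utf-16-be"]
crlf_bytes = {enc: crlf.encode(enc) for enc in encodings}
for enc, raw in crlf_bytes.items():
    print(enc, raw.hex())

# Illustrative sample text: neither string contains CR or LF characters.
be_text = "\u010d\u0a05"              # UTF-16 BE octets: 01 0D 0A 05
le_text = "\u0d05\u010a"              # UTF-16 LE octets: 05 0D 0A 01
be_raw = be_text.encode("utf-16-be")
le_raw = le_text.encode("utf-16-le")

# A byte-oriented tool still finds a "CRLF" straddling the two characters.
false_crlf_be = b"\r\n" in be_raw
false_crlf_le = b"\r\n" in le_raw
print(be_raw.hex(), false_crlf_be)
print(le_raw.hex(), false_crlf_le)
```

Both checks report a match, although neither string has a line ending in it.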

> It will confuse diff, and if 
> a Windows client strips the "0D" upon commit and a Unix client
> tries to update, the contents will look seriously scrambled...
The diff problem is valid.

The Windows client problem is invalid, since the client should not modify 
the files, whether -kb is set or not (imo).
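
For reference, the scrambling Pierre describes can be reproduced with a short
Python sketch. It assumes a hypothetical client that rewrites every 0D 0A
octet pair to 0A on commit; this is a model of the scenario, not a statement
about what any particular CVS client actually does. The sample text is
illustrative only.

```python
# Illustrative UTF-16 BE text with no line break: 00 41 01 0D 0A 05 00 42
original = "A\u010d\u0a05B"
raw = original.encode("utf-16-be")

# Assumed client behaviour (hypothetical): strip the 0D of each 0D 0A pair.
stripped = raw.replace(b"\r\n", b"\n")

# One octet is gone, so every following 16-bit unit is misaligned.
try:
    stripped.decode("utf-16-be")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Decoding leniently shows different characters from the ones committed.
scrambled = stripped.decode("utf-16-be", errors="replace")
print(len(raw), len(stripped), strict_ok)
print(scrambled != original)
```

Losing a single octet shifts the rest of the file by half a character, which
is why the damage is not limited to the spot where the false CRLF occurred.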

Okay, UTF-16 is very likely to be a problem if not treated as binary. UTF-16 
files should therefore be added with -kb.

