info-cvs

Re: Strange Characters


From: Mark D. Baushke
Subject: Re: Strange Characters
Date: Wed, 24 Mar 2004 23:38:12 -0800

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Gagneet Singh <address@hidden> writes:

> We are using a CVS repository on a GNU/Linux Red Hat 9.0 installed
> distribution. The CVS server is 1.11.2 and the clients are all based on
> Windows platform where the development is being done. These are WinCVS
> version 1.3.10 and 1.3.13.
> 
> Now the problem is that for the characters - "'x'" (without the double
> quotes), when I checkout on a Windows system, they are being checked out
> properly, but when we checked out the same on a Linux system they
> appeared as - "~Qx~R" (without the double quotes).

If you use a hex or octal dump program to look at the actual bytes (on
the Linux box, the 'od' program with the -x switch will do the job), I
believe you will see that your Windows text editor is 'helpfully'
inserting characters other than the APOSTROPHE (') character, which is
hex 0x27 (aka octal 047, aka decimal 39). Instead, it is going into
a word-processor mode and giving you non-standard encodings for your
string.
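
For anyone without 'od' to hand, the same check can be sketched in a
few lines of Python. This is only an illustration: the function name
find_non_ascii and the sample bytes are made up here, and the sample
assumes the editor saved the quote marks as the single bytes 0x91 and
0x92.

```python
def find_non_ascii(data: bytes) -> list:
    """Return (offset, byte value) pairs for every byte above 0x7F."""
    return [(i, b) for i, b in enumerate(data) if b > 0x7F]


# Hypothetical sample: the comment as a CP1252 editor might have saved it.
sample = b"/* The underscore character \x91_\x92 is to be replaced */"

for offset, value in find_non_ascii(sample):
    # A plain APOSTROPHE (0x27) would never be reported here.
    print(f"offset {offset}: 0x{value:02x}")
```

Any byte this reports in a C source file is worth a closer look, since
plain ASCII source never goes above 0x7F.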

The Microsoft Windows system has a non-standard character set that goes
by the name of 'Microsoft Windows Codepage 1252', also known as CP1252.
It uses most of the standard ISO-8859-1 character set encodings, but
assigns printable characters of its own to most of the range 0x80-0x9F,
which ISO-8859-1 leaves to control characters.

My guess is that you are running into the problem of your editor being
'helpful' to you and mixing the concept of a LEFT QUOTE CHARACTER and a
RIGHT QUOTE CHARACTER with the APOSTROPHE character that you really want
for your C program.

See http://www.microsoft.com/typography/unicode/1252.htm for the full
table, but here is an extract:

 1252       Uindex      UISOname
 ...
 27         0027        APOSTROPHE
 ...
 91         2018        LEFT SINGLE QUOTATION MARK
 92         2019        RIGHT SINGLE QUOTATION MARK
 ...
 
The rest of your environment is probably trying to live in the
ISO-8859-1 (aka ISO LATIN 1) character set world.
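
The mismatch can be seen by decoding the same bytes both ways. The
following is a small Python sketch using the standard "cp1252" and
"latin-1" codecs; the three-byte sample is an assumption about what the
editor wrote, not something taken from the actual repository file.

```python
# Assumed bytes for the quoted underscore as saved by a CP1252 editor.
raw = b"\x91_\x92"

as_cp1252 = raw.decode("cp1252")   # how the Windows side reads the bytes
as_latin1 = raw.decode("latin-1")  # how an ISO-8859-1 environment reads them

# Print the code points each side ends up with.
print([f"U+{ord(c):04X}" for c in as_cp1252])
print([f"U+{ord(c):04X}" for c in as_latin1])

# The real APOSTROPHE (0x27) is the same character in both encodings.
assert b"'".decode("cp1252") == b"'".decode("latin-1")
```

Under CP1252 the outer bytes become the LEFT and RIGHT SINGLE QUOTATION
MARK from the table above; under ISO-8859-1 they stay at code points
0x91 and 0x92, which are not printable quote characters.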

There is probably a way to fix your Windows editor to avoid this problem
and use a straight ISO-8859-1 character set APOSTROPHE. However, I am
not a fellow that uses Windows and am not sure of the answer.

> I checked up in the Repository files lying on the server and I got the
> same text - "~Qx~R" (without the double quotes).
> 
> Is this the normal behaviour of the CVS repository files to change the -
> "'x'" (without the double quotes) characters into "~Qx~R" (without the
> double quotes) when getting the files from a Windows system to a
> GNU/Linux System.
> 
> The condition has been observed only for the lines which are inside the
> C block comment lines.
> 
> Actual file on Windows Systems:
> /* The underscore character '_' is to be replaced in the following code
> segment with '-' */
> if ('_' == UNDER_SCORE)
> {
> .
> .
> .
> }
> 
> 
> Actual file as seen in the GNU/Linux CVS Repository:
> /* The underscore character ~Q_~R is to be replaced in the following
> code segment with '-' */
> if ('_' == UNDER_SCORE)
> {
> .
> .
> .
> }
> 
> 
> Is there some way to rectify this problem so that the checkout in both
> the Windows and the GNU/Linux systems is the same??

The Windows world should be able to live with an ISO-8859-1 character
set. I suggest you try to change from CP1252 to ISO-8859-1 when writing
programs. There should be a way to do this; I just don't know what it
is.
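
Until the editor setting is found, one possible stopgap is to
straighten the quotes in the working copy before committing. The Python
sketch below is an assumption-laden illustration, not a CVS feature:
the name straighten_quotes is made up, and it is only safe on files
that really are single-byte CP1252 or ISO-8859-1 text (in a UTF-8 file
the bytes 0x91 and 0x92 can be part of other characters).

```python
# Map the CP1252 single quotation marks (0x91, 0x92) to APOSTROPHE (0x27).
SMART_TO_ASCII = bytes.maketrans(b"\x91\x92", b"''")


def straighten_quotes(data: bytes) -> bytes:
    """Return data with CP1252 single quotes replaced by 0x27."""
    return data.translate(SMART_TO_ASCII)


# Hypothetical sample line, as a CP1252 editor might have saved it.
before = b"/* The underscore character \x91_\x92 is to be replaced */"
after = straighten_quotes(before)
print(after.decode("ascii"))
```

Working on bytes rather than decoded text keeps every other byte in the
file exactly as it was.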

        Good luck,
        -- Mark
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (FreeBSD)

iD8DBQFAYoxk3x41pRYZE/gRAkO3AJ9yFh5MedGzg0NSQpiBsf2cXgcgyACgxKGO
yoasCjt1oUGUvE+XadTLFLI=
=mlIE
-----END PGP SIGNATURE-----



