Bug of cvtenc (Re: ANN: GNUstep base/make Version 1.7.4)
From: Yen-Ju Chen
Subject: Bug of cvtenc (Re: ANN: GNUstep base/make Version 1.7.4)
Date: Sat, 20 Sep 2003 14:19:44 -0400
From: Adam Fedor <fedor@doc.com>
To: Yen-Ju Chen <yjchenx@hotmail.com>
CC: "bug-gnustep@gnu.org" <bug-gnustep@gnu.org>
Subject: Re: ANN: GNUstep base/make Version 1.7.4
Date: 20 Sep 2003 09:27:07 -0600
> On Fri, 2003-09-19 at 22:20, Yen-Ju Chen wrote:
> I don't understand. If eIn=YES, then it does use iEnc. I didn't change
> anything about how the code read the input. If the input is in Unicode,
> then the code correctly determines that, otherwise it assumes it is in
> the default encoding (even though it may just be escaped unicode). In
> that case you need to specify EscapeIn or EscapeOut to tell it which is
> which.
O.K. Take the "-EscapeIn yes" (eIn == YES) case.
First it tests whether the input is Unicode (a leading 0xFFFE or 0xFEFF):
if so, iEnc = Unicode and oEnc = local encoding;
if not, iEnc = local encoding and oEnc = Unicode.
Then it converts the NSData to an NSString using iEnc, which is correct.
Next, it converts every \uXXXX in the NSString into a unichar.
Finally, it converts the NSString back to NSData using oEnc and writes that to the file.
The last step is wrong because it always uses oEnc for writing.
If iEnc = Unicode, the output is written in the local encoding.
If iEnc = local encoding, the output is written as Unicode.
That is exactly the opposite of what the user wants.
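
To make the selection step concrete, here is a rough Python model of it. It is only a sketch of the behaviour described in the steps above, not the cvtenc source: the function name choose_encodings and the choice of "big5" as the local encoding are assumptions made for illustration.

```python
import codecs

# Assumption: "big5" stands in for the user's default (local) encoding.
LOCAL_ENCODING = "big5"

def choose_encodings(data: bytes, local: str = LOCAL_ENCODING):
    """Return (iEnc, oEnc) the way the steps above describe them."""
    # A leading 0xFFFE or 0xFEFF byte pair marks the input as Unicode.
    has_bom = data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
    if has_bom:
        return "utf-16", local   # Unicode in, local encoding out
    return local, "utf-16"       # local encoding in, Unicode out

# An escaped property list stored in the local encoding (no byte-order mark).
i_enc, o_enc = choose_encodings(b'"caf\\u00e9"')
```

For input without a byte-order mark, o_enc comes out as the Unicode encoding, so in this model the two encodings never match and the file always changes encoding on the way out.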
The user only wants to use cvtenc to convert \uXXXX escapes to characters
in their local encoding, so that they can easily edit the property list
in a local-encoding environment.
In that case, cvtenc should read and write in the local encoding
and only convert \uXXXX to the real character.
It's not about whether the file is Unicode or not.
It's that cvtenc should use the same encoding for reading and writing.
\uXXXX is not Unicode. It is the representation of a Unicode character
in the local encoding.
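
The behaviour asked for here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a patch for cvtenc: the names unescape and convert_escape_in are hypothetical, and the sketch handles only plain four-digit \uXXXX escapes (no surrogate pairs, no escaped backslashes).

```python
import re

# Matches a literal backslash, "u", and exactly four hex digits.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")

def unescape(text: str) -> str:
    """Replace each \\uXXXX escape with the character it names."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

def convert_escape_in(data: bytes, encoding: str) -> bytes:
    """Decode, unescape, and re-encode using the SAME encoding."""
    text = data.decode(encoding)
    return unescape(text).encode(encoding)

# ISO 8859-1 input stays ISO 8859-1; only the escape is expanded.
result = convert_escape_in(b'"caf\\u00e9"', "latin-1")
```

The point of the sketch is that the encoding passed in is used for both decoding and encoding, so a local-encoding file comes back as a local-encoding file with real characters in place of the escapes.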
> Do you have a test file that will show me what is wrong?
I can send you a test file in traditional Chinese (Big5 encoding),
but it will probably look like Unicode to you. :)
You can simply take any language file in
SYSTEM/Library/Libraries/Resource/gnustep-base/Languages.
According to the README there,
you can use "cvtenc -EscapeIn yes French > tmpfile" to get a file in the
local encoding (ISO 8859-1?).
In my test, you get a Unicode (UCS-Internal or other UCS) file instead.
Yen-Ju