Re: patch to gnustep-base (Unicode and others)
From: Serg Stoyan
Subject: Re: patch to gnustep-base (Unicode and others)
Date: Mon, 8 Apr 2002 12:13:51 +0300
User-agent: Mutt/1.3.16i
Hello, Richard Frith-Macdonald.
RFM> > RFM> Perhaps you can explain more ... as far as I can see the above is
RFM> > RFM> simply wrong. Certainly initWithCString* methods are not
RFM> > RFM> supposed to convert to unicode (as a general rule), and
RFM> > RFM> OpenStep doesn't say they should - so I'm guessing you have
RFM> > RFM> some meaning in mind that is not immediately obvious to me.
RFM> >
RFM> > Here is the citation from "OpenStep Specification" (c) 1994 NeXT
RFM> > Computer Inc. Class NSString, page 2-127:
RFM> > "- (id)initWithCString:(const char *)byteString
RFM> >
RFM> > Initializes the receiver, a newly allocated NSString, by converting
RFM> > the one-byte characters in byteString into Unicode characters.
RFM> > byteString must be a null-terminated C string in the default C
RFM> > string encoding."
RFM>
RFM> OK ... guess I was wrong about that ... it *does* seem to say strings
RFM> should be converted to unicode ... but that's incorrect/misleading
RFM> documentation.
What documentation do you recommend I use? Apple's FoundationKit
and AppKit framework documentation? It says the same. What else?
Maybe I misunderstand you at some point?
RFM> If you look in the class description documentation, it tells you that -
RFM>
RFM> 'While the actual representation of character strings stored in NSString
RFM> and NSMutableString is independent of any particular implementation,
RFM> you can in general think of the contents of NSString and NSMutableString
RFM> objects as being, canonically, Unicode characters (defined by the unichar
RFM> data type)'
RFM>
RFM> Really, this means that you should not take the method descriptions too
RFM> literally, they are describing an API, not particular internal
RFM> implementation details.
You're right, but... I guess the methods of a class have to interpret
its contents in the same way. For example, if dataUsingEncoding
converts *from* Unicode, then initWithCString has to convert *into*
Unicode. You may replace "Unicode" with "Binary"; it makes no
difference. I guess Unicode is just a portable way to store strings...
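To make the symmetry concrete, here is a minimal sketch in Python rather than Objective-C, so that it runs without GNUstep. The function names `init_with_c_string` and `data_using_encoding` are hypothetical stand-ins for the NSString methods, not their implementation, and KOI8-R is assumed as the default C string encoding only because that is what my environment uses:

```python
# Assumption: the default C string encoding is KOI8-R (cf. NSKOI8RStringEncoding).
DEFAULT_C_STRING_ENCODING = "koi8_r"


def init_with_c_string(byte_string: bytes) -> str:
    """Hypothetical stand-in: bytes in the default encoding -> Unicode text."""
    return byte_string.decode(DEFAULT_C_STRING_ENCODING)


def data_using_encoding(text: str, encoding: str) -> bytes:
    """Hypothetical stand-in: Unicode text -> bytes in the requested encoding."""
    return text.encode(encoding)


# A Russian word written with \u escapes so this source file stays ASCII.
original = "\u041f\u0440\u0438\u0432\u0435\u0442"
koi8_bytes = data_using_encoding(original, "koi8_r")

# Both directions agree on the canonical form, so the round trip is lossless.
assert init_with_c_string(koi8_bytes) == original
assert data_using_encoding(init_with_c_string(koi8_bytes), "koi8_r") == koi8_bytes
```

If the two functions disagreed about the canonical form, the round trip would not return the bytes it started with.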
RFM> > RFM> > - adds 2 languages into Resources/Languages: Russian and
RFM> > RFM> > Ukrainian;
RFM> > RFM>
RFM> > RFM> Thanks, but I can't use them ... as I don't know what encoding
RFM> > RFM> you have created them in. I have added a README file to the
RFM> > RFM> Resources/Languages subdirectory to say what format language
RFM> > RFM> files *should* be in (and corrected some errors in the existing
RFM> > RFM> files).
RFM> >
RFM> > It's ok. I've just updated from CVS and created these files by
RFM> > cvtenc'ing them, just like the README says. But... When I start any app
RFM> > I get this message:
RFM> >
RFM> > File NSDictionary.m: 458. In [GSDictionary -initWithContentsOfFile:]
RFM> > Contents of file
RFM> > '/home/stoyan/GNUstep/System/Libraries/Resources/Languages/Russian'
RFM> > does not contain a dictionary
RFM>
RFM> All I can suggest here is making sure you have the latest code installed.
RFM> I fixed a bug in loading 16-bit unicode property lists a day or two ago.
I have the latest CVS version. I start every day by 'cvs -z3 update' ;)
RFM> > Here are some of my environment vars:
RFM> > [stoyan@localhost]$ echo $GNUSTEP_STRING_ENCODING; echo $LANG
RFM> > NSKOI8RStringEncoding
RFM> > ru_RU.KOI8-R
RFM> > I've attached Russian and UkraineRussian (conforming to Locale.aliases)
RFM> > files as well.
RFM>
RFM> Thanks, I've added them (I converted to ascii with \u escapes for
RFM> consistency with the other files, but that should make no difference).
By the way, you probably forgot to upload these files to CVS.
[some skipped...]
RFM> Property lists should be ascii ... so I prefer to keep an ascii property
RFM> list containing \u escape sequences for non-ascii characters, and create
RFM> the other files temporarily (for editing) using cvtenc
RFM>
RFM> > In this case we use Unicode file, and proplist files remains for
RFM> > editors.
RFM>
RFM> But keeping multiple copies in different formats could let them get out
RFM> of sync with each other if you are not careful.
[some skipped...]
RFM> while unicode files are also portable, I'd still prefer to stick to
RFM> ascii files with \u escape sequences. That is, if we are sticking to
RFM> one portable format for consistency, I'd prefer it to be the ascii.
Completely agree at this point.
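For illustration, the ascii-with-escapes convention can be sketched like this. This is not what cvtenc actually does internally; the helper names and the sample property-list entry are hypothetical, and only 16-bit characters are handled:

```python
import re


def to_ascii_escaped(text: str) -> str:
    """Replace every non-ASCII character with a backslash-u escape (4 hex digits)."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x80:
            out.append(ch)                  # plain ASCII passes through unchanged
        elif code <= 0xFFFF:
            out.append("\\u%04x" % code)    # e.g. U+0420 becomes \u0420
        else:
            raise ValueError("character outside the 16-bit range")
    return "".join(out)


_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def from_ascii_escaped(text: str) -> str:
    """Expand backslash-u escapes back into the characters they name."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# Hypothetical entry; the key name is made up for this example.
entry = 'Name = "\u0420\u0443\u0441\u0441\u043a\u0438\u0439";'
escaped = to_ascii_escaped(entry)

escaped.encode("ascii")                     # raises if anything non-ASCII is left
assert from_ascii_escaped(escaped) == entry
```

The escaped form survives any editor or transport that is safe for ascii, which is the point of keeping it as the single master copy. The sketch ignores the case where the original text already contains a literal backslash followed by `u`.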
RFM> > PS: Another thing I've noticed (and I guess it should be somewhere in
RFM> > the documentation) is the use of non-ascii characters when initializing
RFM> > an NSString variable. I mean that a definition like this:
RFM> >
RFM> > NSString *some_string = @"some non-ascii characters";
RFM> >
RFM> > is deprecated. In this case the string is not converted into Unicode
RFM> > and the result is unpredictable, or something.
RFM>
RFM> Well, OpenStep spec simply tells you not to do it (I'd say that's closer
RFM> to 'illegal' than 'deprecated') in the NSString class description.
You are right.
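A small sketch of why such literals are unsafe, again in Python and only as an analogy (it does not model how the compiler or GNUstep treats `@"..."`): the bytes that an editor saves for a non-ascii literal carry no record of their encoding, so the reader may pick a different one.

```python
# Bytes as an editor set to KOI8-R would save a Russian word (written here
# with \u escapes so that this file itself stays ASCII).
word = "\u0442\u0435\u0441\u0442"
raw = word.encode("koi8_r")

# Every byte is outside the ASCII range, so nothing identifies the encoding.
assert all(b >= 0x80 for b in raw)

as_koi8 = raw.decode("koi8_r")      # reader assumes KOI8-R: original text
as_latin1 = raw.decode("latin_1")   # reader assumes Latin-1: different text

assert as_koi8 == word
assert as_latin1 != word
assert len(as_latin1) == len(word)  # same length, wrong characters
```

Both decodings succeed without any error, which is what makes the problem easy to miss.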
RFM> Where do you think this should be documented in GNUstep ?
Sorry, my mistake. No need to document it, given what you say above.
--
Serg Stoyan