From: | Fred Kiefer |
Subject: | Re: [bug #4658], and also [bug #4624] |
Date: | Thu, 04 Sep 2003 19:35:45 +0200 |
User-agent: | Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021204 |
Hi Pete,

Pete French wrote:
O.K. - can people please test the attached patch? It is for Unicode.m and adds support for characters outside the BMP. These are converted into paired surrogates internally, and then converted back correctly on output. This code behaves exactly the same as OSX does in this respect, and is a fix for bug #4624. Hopefully it should also help with Fred's changes for bug #4658, hence the quoted line at the top. I don't have a system with the new gpbs on it, so I can't test at the moment whether it fixes that problem. NB: It should also *not* break anything which currently works properly! That's the main thing I would like tested! Any weird UTF8 behaviour, please let me know. Of course, adding this does raise quite a lot more issues - namely, how many other routines should be made UTF-16 aware? :-)
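For readers unfamiliar with the surrogate mechanism mentioned above, the following is a minimal sketch of the standard UTF-16 arithmetic for characters outside the BMP. It is not taken from the attached patch; the function names `utf16_encode` and `utf16_decode_pair` are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: encode one code point as UTF-16.
 * Returns the number of 16-bit units written (1 or 2), or 0 if the value
 * is not a valid Unicode scalar value. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;               /* surrogate range is not a character */
        out[0] = (uint16_t)cp;      /* BMP character: one unit */
        return 1;
    }
    if (cp > 0x10FFFF)
        return 0;                   /* beyond the Unicode range */
    cp -= 0x10000;                  /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));   /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low surrogate: bottom 10 bits */
    return 2;
}

/* Hypothetical helper: combine a high/low surrogate pair into a code point.
 * The caller must have checked that hi is in D800..DBFF and lo in DC00..DFFF. */
static uint32_t utf16_decode_pair(uint16_t hi, uint16_t lo)
{
    return 0x10000 + (((uint32_t)(hi - 0xD800)) << 10) + (uint32_t)(lo - 0xDC00);
}
```

For example, U+1D11E (outside the BMP) would be stored as the two units 0xD834 and 0xDD1E, and decoding that pair gives back U+1D11E.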
I fear that things are not as simple as you treat them. Your patch may be fine for the conversion between UNICODE and UTF8, but what about all the other conversions? If we change the internal storage of UNICODE from UCS2 to UTF16, all the other conversions should know about this. No big deal for the iconv conversions, as here we would just change the name used for the UNICODE conversion (something that is missing in your patch). But what should we do for the other conversions? As far as I can tell, these other encodings don't include any characters that are not on the BMP, so we would just need to ignore the extra code unit on lossy encodings. With those resolved, I would see no problem with your patch.
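To illustrate the point about the non-iconv conversions, here is a rough sketch of a lossy UTF-16 to Latin-1 conversion that consumes a surrogate pair as a single unrepresentable character. It is not code from Unicode.m; the function name `utf16_to_latin1_lossy` and the choice of `'?'` as the substitute are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not from Unicode.m: lossy UTF-16 -> Latin-1.
 * dst must have room for at least len bytes. Returns bytes written.
 * Anything Latin-1 cannot represent becomes '?' (an assumed substitute). */
static size_t utf16_to_latin1_lossy(const uint16_t *src, size_t len,
                                    unsigned char *dst)
{
    size_t i = 0, n = 0;

    while (i < len) {
        uint16_t u = src[i++];

        if (u >= 0xD800 && u <= 0xDBFF &&
            i < len && src[i] >= 0xDC00 && src[i] <= 0xDFFF) {
            i++;                 /* skip the low surrogate as well */
            dst[n++] = '?';      /* one substitute for the whole pair */
        } else if (u <= 0xFF) {
            dst[n++] = (unsigned char)u;  /* Latin-1 maps 1:1 to U+0000..U+00FF */
        } else {
            dst[n++] = '?';      /* BMP character outside Latin-1, or lone surrogate */
        }
    }
    return n;
}
```

Without the pair check, a UCS2-style loop would emit two substitutes for one non-BMP character, which is the kind of mismatch the other conversion routines would need to avoid.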
Cheers Fred