gnu-arch-users

From: Andrew Suffield
Subject: [Gnu-arch-users] Re: [semi-OT] Unicode / han unification (was Re: Spaces ...)
Date: Thu, 22 Jan 2004 01:51:00 +0000
User-agent: Mutt/1.5.5.1+cvs20040105i

On Wed, Jan 21, 2004 at 05:20:29PM -0800, Tom Lord wrote:
>     > From: Andrew Suffield <address@hidden>
> 
>     >> Sorry.  I meant all characters that have been electrically encoded
>     >> in a standard character set.  As far as I know unicode does do
>     >> that.
> 
>     > Nope, not at all. See the previous message, 'han unification'.
> 
> Let's be pedantic.   What you two are really disagreeing about is the
> meaning of the word "character".
> 
> My understanding is that there are certain characters (in one sense of
> the word) which are common to Chinese, Japanese, and Korean.   There
> are, broadly speaking, four different styles of rendering these
> characters as glyphs -- two for Chinese (traditional and simplified),
> and one each for Japanese and Korean.   That is to say, there are four
> different ways of drawing these characters.

It's not quite that simple - there are multiple, similar-looking ways
of writing the same character within the same language in some
cases. Usually it doesn't matter, but for some things it does - names
are a good example. For a person's given name, writing the character
differently is akin to spelling it differently in English, and Han
unification is akin to declaring that all people with names like
"Tom", "Thom", or other derivatives of "Thomas" will henceforth be
called "Thom".

The Eastern countries are pretty serious about etiquette, and using
the wrong writing for somebody's name could easily tip the balance
between a contract going to you or to your next competitor.

> A single font can render each these characters in a way such that all
> users will be able to recognize and read them.  Linguists would (so I
> hear) generally agree that, though they may be written in four
> different styles, these are each a single character.
> 
> No single font can render each of these characters in a way that will
> seem "natural" to all users -- a single font can only make them
> legible.  For "natural" rendering, you would want to use one font for
> Japanese text, another for simplified Chinese, and so forth.

If you don't code them as the same character, then having a font that
uses the proper writing for them all is easy. Mozilla under X, for
example, does it pretty well as long as you avoid Unicode and have
enough fonts installed - it'll pick a font that matches the character
set of the web page.

If you use Unicode, there is no way to tell from the text alone which
font is the right one to use. Sometimes the application is going to
pick the wrong one,
and the result is an awful ugly mess. FroM an aEsthEtic pErspEctive, a
docuMEnt whErE soME of thE charactErs usE the ChinEse style and the
rEst usE the JapanEsE is fairly siMilar to a docuMEnt where randoM
charactErs have had their casE flippEd. You can parse it, but you
don't *want* to.
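
To make that concrete, here is a minimal Python sketch of the
difference. It is not how Mozilla or any other real application picks
fonts: the font names and the pick_font helper are made up for
illustration, and U+76F4 is just one commonly cited unified character.
The codec names are the ones in Python's standard library.

```python
# Hypothetical charset-to-font table; the font names are placeholders.
FONT_FOR_CHARSET = {
    "shift_jis": "some-japanese-font",
    "gb2312": "some-simplified-chinese-font",
    "big5": "some-traditional-chinese-font",
    "euc_kr": "some-korean-font",
}


def pick_font(charset):
    """Return a font hint for a legacy charset, or None when the charset
    (UTF-8, for instance) says nothing about the language of the text."""
    return FONT_FOR_CHARSET.get(charset.lower().replace("-", "_"))


# U+76F4 is one unified Han character that Chinese and Japanese
# typography conventionally draw differently.
CHAR = "\u76f4"

# In each legacy charset the character has its own byte sequence...
legacy_bytes = {cs: CHAR.encode(cs) for cs in FONT_FOR_CHARSET}

# ...but decoding any of them yields the same single code point, so
# the language hint is gone once the text is held as Unicode.
decoded = {raw.decode(cs) for cs, raw in legacy_bytes.items()}

if __name__ == "__main__":
    for cs, raw in legacy_bytes.items():
        print(cs, raw.hex(), "->", pick_font(cs))
    print("utf-8", CHAR.encode("utf-8").hex(), "->", pick_font("utf-8"))
    print("distinct code points after decoding:", len(decoded))
```

With a legacy charset the declared encoding doubles as a language
hint, so the lookup has something to go on. With UTF-8 the lookup
comes back empty, and the application has to guess or be told by some
other means.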

The Unicode "solution" to this is for Chinese users to use Chinese
fonts, Japanese users to use Japanese fonts, and neither to interact
with the other, which quite neatly defeats the point of Unicode.

>     > > It isn't perfect and it certainly is not complete when you
>     > > consider all forms of writing humans have ever used, but it is
>     > > maintained, it works at least as well as anything else out there.
> 
>     > Doesn't do that either, if you happen to be Chinese, Japanese, or
>     > Korean.
> 
> As nearly as I can tell, opinions vary about that.  That is to say
> that there are some Chinese, Japanese, Koreans, and certainly plenty
> of others who would disagree with asuffield, here.

I don't think any rational CJK users would agree that Unicode does
everything that the native character sets do. Some are willing to make
this tradeoff and some are not, but accepting the tradeoff is not the
same thing as agreeing that nothing is lost. There is a not
insignificant amount of "I'm willing to put up with this, but there is
*NO WAY* my boss is going to accept it".

-- 
  .''`.  ** Debian GNU/Linux ** | Andrew Suffield
 : :' :  http://www.debian.org/ |
 `. `'                          |
   `-             -><-          |


