Re: gnu.java.io.encode.EncoderUTF8.java


From: Per Bothner
Subject: Re: gnu.java.io.encode.EncoderUTF8.java
Date: Mon, 04 Aug 2003 12:30:48 -0700
User-agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4) Gecko/20030612

David P Grove wrote:

> I've been tracking down a bug using classpath to run JSPs on top of Jikes RVM and I think the root of the problem is that EncoderUTF8.java is strictly following the UTF8 encoding scheme instead of the "pseudo-UTF8" that JVMs actually need. In particular, the character \u0000 is being encoded as the one byte 0 instead of the 2 byte sequence that Java uses.
>
> I'm happy to contribute a bug fix for this. My question is should I change EncoderUTF8 to implement the Java treatment of \u0000,

That would be wrong.  EncoderUTF8 is used to convert 16-bit Unicode
to the *external* UTF8 encoding used for files etc.  Not the Java
pseudo-UTF8.
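
To make the difference concrete, here is a small sketch that uses only standard JDK classes, not the Classpath EncoderUTF8. It assumes a JDK new enough to have java.nio.charset.StandardCharsets, and the class and method names (NulEncodingDemo, standardUtf8, modifiedUtf8) are made up for illustration. DataOutputStream.writeUTF is used only as a convenient way to observe the Java-style encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class NulEncodingDemo {
    // External (standard) UTF-8: U+0000 becomes the single byte 0x00.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Java-style "modified UTF-8" as produced by DataOutputStream.writeUTF:
    // a 2-byte length prefix, then the data, with U+0000 written as 0xC0 0x80.
    static byte[] modifiedUtf8(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        byte[] all = buf.toByteArray();
        // Drop the 2-byte length prefix so only the encoded characters remain.
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(standardUtf8("\u0000"))); // [0]
        System.out.println(Arrays.toString(modifiedUtf8("\u0000"))); // [-64, -128]
    }
}
```

The second line of output is 0xC0 0x80 shown as signed Java bytes.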

I can only think of one reason why you'd want to create the Java
pseudo-UTF8 format:  when writing a Java class file.  Implement
that however you wish, but don't change the behavior of EncoderUTF8.
You could add a flag to EncoderUTF8 to enable "Java-style UTF8",
but it can't be the default.
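
A rough sketch of what such a flag could look like follows. This is not the Classpath EncoderUTF8 code or its API: the class Utf8FlagSketch, the encode method, and the javaStyle parameter are all hypothetical. With the flag off it produces standard UTF-8, including 4-byte sequences for surrogate pairs. With the flag on it writes U+0000 as two bytes and encodes each surrogate separately as 3 bytes, as the Java format does.

```java
import java.io.ByteArrayOutputStream;

public class Utf8FlagSketch {
    // Hypothetical encoder; javaStyle defaults to "off" at every call site
    // simply because callers must opt in by passing true.
    static byte[] encode(String s, boolean javaStyle) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == 0 && javaStyle) {
                // Java-style: U+0000 is the overlong 2-byte form, never a raw 0.
                out.write(0xC0);
                out.write(0x80);
            } else if (c < 0x80) {
                out.write(c);
            } else if (c < 0x800) {
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else if (!javaStyle && Character.isHighSurrogate(c)
                    && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                // Standard UTF-8: combine the pair into one 4-byte sequence.
                int cp = Character.toCodePoint(c, s.charAt(++i));
                out.write(0xF0 | (cp >> 18));
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            } else {
                // 3-byte form. In Java style this also covers each surrogate.
                // Unpaired surrogates in standard mode fall through here too;
                // a real encoder would need to decide how to report them.
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }
}
```

Calling encode(text, false) keeps the external behavior for files, and only a class-file writer would pass true.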
--
        --Per Bothner
address@hidden   http://per.bothner.com/





