chinese encoded in UTF-8 and XML
From: Knackeback
Subject: chinese encoded in UTF-8 and XML
Date: 25 Sep 2003 22:05:24 +0200
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2
Hi, I wrote an XML file with GNU Emacs 21.2.2 whose content
includes Chinese characters encoded in UTF-8.
I wrote something like:
<?xml version="1.0" encoding="UTF-8"?>
<test>
<chinese>撒</chinese>
<chinese>鰓</chinese>
</test>
Then I typed "C-x RET f" (set-buffer-file-coding-system) and chose utf-8.
After that I typed "C-x C-s" to save the file.
I hope this is the right way in Emacs to store the content
as UTF-8 encoded text?
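Whether the saved file really is UTF-8 can be checked independently of Emacs by decoding its raw bytes strictly. The following is a minimal sketch, assuming Python 3 is available; the helper name "first_invalid_utf8" is hypothetical and not part of any tool mentioned here.

```python
from typing import Optional


def first_invalid_utf8(data: bytes) -> Optional[int]:
    """Return the offset of the first byte that breaks strict UTF-8
    decoding, or None if the whole input is valid UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte of the bad sequence.
        return exc.start
    return None


if __name__ == "__main__":
    # A correctly encoded element, and one containing the raw bytes
    # 0xC4 0xCE that the parser reported.
    good = "<chinese>撒</chinese>".encode("utf-8")
    bad = b"<chinese>\xc4\xce</chinese>"
    print(first_invalid_utf8(good))
    print(first_invalid_utf8(bad))
```

For a file on disk, the same helper could be applied to the result of open(path, "rb").read(), where path is the file that was saved from Emacs.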
Next I tried to parse the file with xmllint, a small
XML parser program which comes with libxml2.
The parser complains that the second Chinese line is not proper
UTF-8.
==>
uhu:4: error: Input is not proper UTF-8, indicate encoding !
<chinese>鰓</chinese>
^
uhu:4: error: Bytes: 0xC4 0xCE 0x3C 0x2F
<chinese>鰓</chinese>
It is interesting that the parser only grumbles about the second
Chinese line.
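The bytes in the error message can be examined by hand. In UTF-8, a byte in the range 0xC2-0xDF starts a two-byte sequence and must be followed by a continuation byte in the range 0x80-0xBF. The reported 0xC4 is such a lead byte, but 0xCE is itself another lead byte, and the 0x3C 0x2F that follow are simply the ASCII "</" of the closing tag. A rough sketch of that classification in Python (the function name "classify" is made up for illustration):

```python
def classify(byte: int) -> str:
    """Classify a single byte by the role it can play in UTF-8."""
    if byte <= 0x7F:
        return "ascii"
    if 0x80 <= byte <= 0xBF:
        return "continuation"
    if 0xC2 <= byte <= 0xDF:
        return "lead-2"   # starts a two-byte sequence
    if 0xE0 <= byte <= 0xEF:
        return "lead-3"   # starts a three-byte sequence
    if 0xF0 <= byte <= 0xF4:
        return "lead-4"   # starts a four-byte sequence
    return "invalid"      # 0xC0, 0xC1, 0xF5-0xFF never occur in UTF-8


if __name__ == "__main__":
    # Bytes quoted by xmllint for the second element.
    reported = bytes([0xC4, 0xCE, 0x3C, 0x2F])
    print([classify(b) for b in reported])
    # The first element's character, properly encoded, for comparison.
    print([classify(b) for b in "撒".encode("utf-8")])
```

A lead byte followed directly by another lead byte is never valid UTF-8, which suggests the second character was written to the file in some other encoding. This sketch does not identify which one.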
I'm anxious to see an explanation!