Re: those funny non-ASCII characters
From: rusi
Subject: Re: those funny non-ASCII characters
Date: Fri, 1 Jun 2012 20:17:35 -0700 (PDT)
User-agent: G2/1.0
On Jun 2, 2:06 am, Xah Lee <xah...@gmail.com> wrote:
> Xah wrote
>
> > > 〈Unicode BOM Byte Order Mark Hack〉
> > > http://xahlee.org/comp/unicode_BOM_byte_orde_mark.html
>
> > > http://www.unicode.org/faq/utf_bom.html#bom1
>
> On Jun 1, 9:26 am, rusi <rustompm...@gmail.com> wrote:
>
> > See http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf
> > (pg 36) "Use of a BOM is neither required nor recommended for UTF-8,
> > but may be encountered in contexts where UTF-8 data is converted from
> > other encoding forms..."
>
> > More specifically the non-recommendation of
> > bom: http://www.unicode.org/faq/utf_bom.html
> > "Note that some recipients of UTF-8 encoded data do not expect a BOM.
> > Where UTF-8 is used transparently in 8-bit environments, the use of a
> > BOM will interfere with any protocol or file format that expects
> > specific ASCII characters at the beginning, such as the use of "#!" of
> > at the beginning of Unix shell scripts. "
>
> didn't i mention these 2 points exactly in the link i gave??
Yeah, your own link says this (as you know, I often use and quote your
Unicode pages :-) ):
- In unix-like OSes, BOM for utf-8 conflicts with the Shebang (Unix)
hack.
- Many Window software add BOM to utf-8 files, e.g. Notepad.
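
To make the shebang conflict concrete, here is a minimal Python sketch. The script text and the helper name are made up for illustration; it only inspects bytes and runs nothing. It relies on Python's standard "utf-8-sig" codec, which prepends the UTF-8 BOM when encoding.

```python
import codecs

# Hypothetical two-line shell script, used only for illustration.
script = "#!/bin/sh\necho hello\n"

plain = script.encode("utf-8")         # UTF-8 without a BOM
with_bom = script.encode("utf-8-sig")  # UTF-8 with a leading BOM (EF BB BF)


def starts_with_shebang(data: bytes) -> bool:
    """Return True if the very first two bytes are '#!'."""
    return data[:2] == b"#!"


print(starts_with_shebang(plain))        # True
print(starts_with_shebang(with_bom))     # False
print(with_bom[:3] == codecs.BOM_UTF8)   # True
```

With the BOM in front, the file no longer begins with the two ASCII bytes "#!", which is the interference the FAQ quote above describes.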
But you also say
> If your lang spec says unicode, you have to support BOM mark
So I am not clear what your stand is...
Let me make my own position clear:
The de jure Unicode standard is set by the Unicode Consortium (or
whatever it's called).
The de facto standard is set by Microsoft and Java.
The two conflict.
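
One way people commonly live with that conflict is to be liberal in what they read and conservative in what they write: accept a BOM on input, never emit one on output. A rough Python sketch of that idea, assuming the standard "utf-8-sig" codec; the function names are mine, not from any standard or from either link:

```python
import codecs


def decode_utf8_tolerant(data: bytes) -> str:
    """Decode UTF-8, dropping one leading BOM if it is present."""
    return data.decode("utf-8-sig")


def encode_utf8_no_bom(text: str) -> bytes:
    """Encode as UTF-8 without a BOM, as the Unicode text quoted above suggests."""
    return text.encode("utf-8")


# Illustrative inputs: the same text with and without a leading BOM.
notepad_style = codecs.BOM_UTF8 + "hello".encode("utf-8")
unix_style = "hello".encode("utf-8")

# Both inputs decode to the same string.
print(decode_utf8_tolerant(notepad_style) == decode_utf8_tolerant(unix_style))

# A plain "utf-8" decode keeps the BOM as a U+FEFF character instead.
print(notepad_style.decode("utf-8") == "\ufeff" + "hello")

# Round trip: BOM in, no BOM out.
print(encode_utf8_no_bom(decode_utf8_tolerant(notepad_style)) == unix_style)
```

This does not settle which standard is "right"; it only shows that a reader can accept both forms without a writer having to produce the BOM.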