Re: utf-16le vs utf-16-le
From: David Kastrup
Subject: Re: utf-16le vs utf-16-le
Date: Mon, 14 Apr 2008 22:58:49 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.60 (gnu/linux)

Stefan Monnier <address@hidden> writes:
>>> > I don't know, in fact I think [having BOM-specific coding
>>> > systems is] a bad idea. That's what the part of my message that
>>> > you snipped was saying. But I'll have to defer to Handa-san on
>>> > that.
>>>
>>> I think it obvious: if a BOM gets detected on read, one wants
>>> to have it removed from the buffer and reinserted on saving the
>>> buffer.
>
>> I agree, as you state it, it's obvious. My question is "why does that
>> need to be part of the coding system?" At present the UTF-16 and
>> UTF-32 Unicode coding systems (in the abstract) have *twenty-seven*
>> variants each (BOM-required, BOM-prohibited, BOM-autodetected X be,
>> le, system-dependent X CR, LF, CRLF), and UTF-8 needs *nine*. This is
>> nuts, from a user-education standpoint.
>
> For what it's worth, I do think it would make sense to try and move
> the BOM-processing outside of the coding-system proper. For me a good
> test for coding-system-worthiness is "what if I use it for a process
> rather than a file". Based on this test, I'm not sure if BOMs really
> fit in (other than for auto-detection and automatically stripping
> them, maybe).
Hm? I don't see why starting communication with a BOM or not would
_not_ fit in.
>> What I proposed was a more generic concept where use of signatures
>> and the EOL convention would (at least to the user) appear as
>> buffer-local variables.
>
> Here, I disagree: EOL processing definitely needs to take place when
> talking to subprocesses, so EOL-handling doesn't belong in
> buffer-local vars but in the coding-system.
I don't quite see how that differs from BOM processing, even though the
BOM processing has to happen only once at the start.
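
A minimal Python sketch of this point, assuming a UTF-16 byte stream that
arrives from a subprocess in arbitrary chunks. The helper name
decode_chunks and the sample data are hypothetical; the calls
codecs.getincrementaldecoder and io.IncrementalNewlineDecoder are from the
Python standard library. The BOM is consumed once, by whichever decode
call sees the first two bytes, while EOL translation has to carry state
across every chunk boundary, yet both can sit in the same decoder
pipeline:

```python
import codecs
import io


def decode_chunks(chunks, encoding="utf-16"):
    """Decode byte chunks as they might arrive from a pipe (hypothetical helper)."""
    # The "utf-16" incremental decoder reads the BOM from the first two bytes,
    # picks the byte order from it, and does not pass it on as text.
    bom_aware = codecs.getincrementaldecoder(encoding)()
    # The newline decoder turns CRLF (and lone CR) into LF, holding back a
    # trailing CR until it knows whether an LF follows.
    decoder = io.IncrementalNewlineDecoder(bom_aware, translate=True)
    pieces = [decoder.decode(chunk) for chunk in chunks]
    pieces.append(decoder.decode(b"", final=True))  # flush any buffered state
    return "".join(pieces)


# Sample data (an assumption for illustration): BOM + CRLF text in UTF-16LE,
# cut into 3-byte chunks so that code units and CRLF pairs get split.
data = codecs.BOM_UTF16_LE + "one\r\ntwo\r\n".encode("utf-16-le")
chunks = [data[i:i + 3] for i in range(0, len(data), 3)]
text = decode_chunks(chunks)
```

In this sketch the caller never sees the BOM or the CR characters; the
only difference between the two is how long each stage has to keep
state.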
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum