Re: [Gnu-arch-users] Re: How does arch/tla handle encodings?
From: Jan Hudec
Subject: Re: [Gnu-arch-users] Re: How does arch/tla handle encodings?
Date: Sat, 28 Aug 2004 12:23:24 +0200
User-agent: Mutt/1.5.6+20040818i
On Sat, Aug 28, 2004 at 04:18:54 +0300, Marcus Sundman wrote:
> On Saturday 28 August 2004 03:35, Robin Green wrote:
> > On Sat, Aug 28, 2004 at 01:56:20AM +0300, Marcus Sundman wrote:
> > > However, for this problem to go away completely it needs
> > > to be fixed in _all_ systems, including arch. When a piece of text is
> > > sent around as bytes _no_ link in the chain may throw away the encoding
> > > metadata.
> >
> > If you want that property
>
> Umm.. what property? That text files remain text files instead of turning
> into raw byte blobs? Yeah, I really do want that property.
UTF-16 will not work with 99.9999% of standard tools. That's because
UTF-16 is not compatible with how the standard C library handles
strings: it is full of zero bytes, which C treats as the end of a string.
It's far easier to forget that UTF-16 was ever invented than to rewrite
all those tools.
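
To make the C string point concrete, here is a minimal sketch in Python.
It uses ctypes only to get a NUL-terminated view of a buffer, which is
what routines like strlen() or strcpy() would see. The helper name
as_c_string is made up for this illustration.

```python
import ctypes

text = "hello world"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")  # little-endian, no byte order mark


def as_c_string(data: bytes) -> bytes:
    """Return the bytes a C string routine would see: everything up to
    (not including) the first NUL byte. Hypothetical helper."""
    buf = ctypes.create_string_buffer(data)  # copies data, appends a NUL
    return buf.value                         # stops at the first NUL


print(len(utf8), len(as_c_string(utf8)))    # 11 11
print(len(utf16), len(as_c_string(utf16)))  # 22 1
```

The UTF-16 buffer is cut off after the first byte, because the second
byte of the code unit for 'h' is zero.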
UTF-8 works in 99.99999% of standard tools right out of the box. Yes,
that does include diff and patch.
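
A rough sketch of why line-oriented tools cope with UTF-8: the byte 0x0A
only ever means "newline" there, so splitting on it byte-wise is safe. In
UTF-16 the same byte can turn up inside an unrelated character. The
function byte_lines is a made-up stand-in for how a byte-oriented tool
finds line boundaries, not what diff literally does.

```python
def byte_lines(data: bytes) -> list:
    """Split on the newline byte only, with no knowledge of the encoding.
    Hypothetical stand-in for a byte-oriented line splitter."""
    return data.split(b"\n")


# U+010A (C with dot above) is 0xC4 0x8A in UTF-8,
# but 0x0A 0x01 in little-endian UTF-16.
text = "\u010a one\n\u010a two\n"

utf8_lines = byte_lines(text.encode("utf-8"))
utf16_lines = byte_lines(text.encode("utf-16-le"))

# Every UTF-8 piece is still valid text on its own.
print([line.decode("utf-8") for line in utf8_lines])

# The UTF-16 data is split in the middle of characters.
print(len(utf8_lines), len(utf16_lines))
```

With UTF-8 you get the two lines plus the empty piece after the final
newline. With UTF-16 you get more pieces than lines, and none of them
decode cleanly.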
Note: in both cases, compilers and interpreters of just about anything
count as part of the "tools".
> > isn't the most sensible solution to put the encoding metadata _inside_ the
> > file, like xml does?
>
> Purists generally hate this solution of xml. Theoretically speaking it's
> wrong because you would have to interpret at least the beginning of the
> file to get information on how to interpret the file, thus creating a
> circular dependency paradox. Practically speaking it's wrong since it
> severely limits what encodings can be used, since the file would have to
> contain a byte sequence equivalent to a string like '<?xml version="1.0"
> encoding="utf-8"?>' encoded in ANSI X3.4-1986.
Which is the *RIGHT* thing to do. You need to standardize the encoding at
least a bit, lest you create an utter mess.
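
For illustration, a simplified sketch of reading such an in-file
declaration. It assumes the declaration itself is in an ASCII-compatible
encoding, which is exactly the "standardize at least a bit" part. It is
not the full autodetection procedure from the XML specification (no byte
order marks, no UTF-16 or EBCDIC handling), and declared_encoding is a
name made up here.

```python
import re

# Matches an XML declaration at the very start of the data and captures
# the encoding name. Works on bytes, so nothing has to be decoded first.
DECL = re.compile(
    rb'^<\?xml[^>]*?encoding=["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, or the default
    when there is no declaration. Hypothetical helper."""
    match = DECL.match(data[:200])
    return match.group(1).decode("ascii") if match else default


doc = '<?xml version="1.0" encoding="iso-8859-1"?><p>caf\u00e9</p>'
raw = doc.encode("iso-8859-1")

enc = declared_encoding(raw)
print(enc)              # iso-8859-1
print(raw.decode(enc))  # the original document text
```

A file with no declaration falls back to the default, so the reader and
the writer still have to agree on that default.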
> That said, I'm personally not completely against this approach, but I
> haven't given it much thought. However, only few formats (anything besides
> sgml?) support this system. E.g., if you want a text file to contain only
> the string "hello world" then there is no way for you to use this approach.
And there is no other way that is transparent to transport.
> > Transcoding need not be a goal of a revision control system, since you
> > can just transcode files to and from the working directory with a
> > separate utility.
>
> I have never said that transcoding has to be done by a CMS/RCS. However, the
> system has to support this, at least by not throwing away the encoding
> info.
For all sane things, the encoding info shall be part of the data. And
thus not thrown away...
> After giving it a lot of thought (quite a while ago), I concluded that I
> would personally prefer a general filter plug-in system in the CMS/RCS.
> This way the logic can be standardized and centralized, moving the burden
> (and the responsibility) of setting up the filters from each developer to
> the project leader. This way you also won't have issues with different
> people using different platforms and/or clients. (Anyhow, this is only my
> personal opinion, and I wouldn't want to impose it on others.)
This is getting rather off-topic... It would be a nice idea, though it's
pretty tricky to get right.
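
Just to show the shape of the idea, a very rough sketch of per-project
filters keyed on file name patterns. Nothing like this exists in arch as
far as this thread goes; FILTERS, transcoding_filter and apply_filter
are all invented names, and the stored encoding being UTF-8 is an
assumption.

```python
import fnmatch
from typing import Callable, Dict, Tuple

# A filter is a pair of functions: (on check-in, on check-out).
Filter = Tuple[Callable[[bytes], bytes], Callable[[bytes], bytes]]


def transcoding_filter(working: str, stored: str = "utf-8") -> Filter:
    """Build a filter converting between the working-tree encoding and
    the encoding kept in the archive. Hypothetical."""
    def checkin(data: bytes) -> bytes:
        return data.decode(working).encode(stored)

    def checkout(data: bytes) -> bytes:
        # Raises UnicodeEncodeError if the text cannot be represented
        # in the working encoding; this is one of the tricky parts.
        return data.decode(stored).encode(working)

    return checkin, checkout


# Set up once for the whole project, not by each developer.
FILTERS: Dict[str, Filter] = {"*.txt": transcoding_filter("iso-8859-1")}


def apply_filter(path: str, data: bytes, direction: str) -> bytes:
    """Run the first matching filter; other files pass through as-is."""
    for pattern, (checkin, checkout) in FILTERS.items():
        if fnmatch.fnmatch(path, pattern):
            return checkin(data) if direction == "in" else checkout(data)
    return data


working_copy = "caf\u00e9\n".encode("iso-8859-1")
stored = apply_filter("notes.txt", working_copy, "in")
restored = apply_filter("notes.txt", stored, "out")
```

The filters must round-trip without loss, otherwise every checkout
produces spurious changes.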
-------------------------------------------------------------------------------
Jan 'Bulb' Hudec