bug-bash


From: Stephane Chazelas
Subject: Re: locale-dependent token separator handling doesn't work in multi-byte locales
Date: Wed, 8 Oct 2014 17:36:03 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

2014-10-08 09:17:18 -0600, Eric Blake:
[...]
> I would argue that locale-dependent parsing is probably a bug waiting to
> happen, and would be in favor of removing the feature and forcing the
> use of the C locale for the duration of parsing a script.  Yes, that
> means you can't write a variable name with non-ASCII characters, but as
> you've demonstrated, running such a script in a different locale than
> where it was written raises too many issues about what should happen.
[...]

Note that they're not the only problem. ksh arithmetic honours
the locale's decimal point (which, by the way, conflicts with
the "," operator when it is ","), and of course there's a
problem with character ranges and classes.
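
To illustrate the character range issue (a minimal sketch, not
something taken from this thread): a script can pin the locale to C
around a pattern match, so that [a-z] means the ASCII lowercase
letters regardless of the locale the script is run in. The
is_ascii_lower name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 is exactly one ASCII
# lowercase letter. The function body is a subshell, so the LC_ALL
# change does not leak into the caller's environment.
is_ascii_lower() (
    LC_ALL=C
    export LC_ALL
    case $1 in
        [a-z]) exit 0 ;;   # in the C locale this range is a to z only
        *)     exit 1 ;;
    esac
)
```

Something like "is_ascii_lower "$c" && echo lower" then gives the
same answer everywhere, at the cost of ignoring the user's locale,
which is the conflict in question.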

The problem is that the shell and the utilities are used both as
tools by the user and as building blocks in the language used to
write scripts, which means there's a conflict there.

[...]
> This may also be the sort of question worth asking the Austin Group
> about, to see if POSIX should be tightened on this front.
[...]

I agree, and better to ask before Chet starts to work on it. Chet
says that it's a POSIX requirement to have code parsed according
to the locale, and that's also my understanding after reading:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03

Given that yash is the only conformant shell in that regard
(though it has issues, IIRC), there's not much point in POSIX
requiring it.

-- 
Stephane



