bug-bash

Re: one last collating sequence data point


From: Chet Ramey
Subject: Re: one last collating sequence data point
Date: Sat, 27 Jan 2007 21:25:22 -0500
User-agent: Thunderbird 1.5.0.9 (Macintosh/20061207)

Bruce Korb wrote:

>> If LC_COLLATE is unset, LC_ALL and LANG both affect the collating order.
> 
> Neither of which were in the environment, but I didn't show that
> "conclusively".  "Trust me" (really).  In any event, why would it be
> that "bash" would use en_US and "ls" would use "C"?

Again, it has to do with what Posix calls the "native environment" (that
is, the default locale).

One of the first things bash does when it starts is to call
setlocale(LC_ALL, ""), which returns the name of the default locale.  That
locale is set in an implementation-defined fashion using environment
variables (or to an "implementation-defined default").  The first time an
instance of bash starts up, at login, for instance, the call to setlocale
selects the
native environment.  Bash doesn't export or set any of the locale variables
specially (though it will auto-export them if they're in the initial
environment), so if they are in the environment, either the user or the
"system" placed them there.
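
A minimal sketch of that startup step, for illustration only.  The
function name save_native_locale is hypothetical and is not taken from the
bash sources; the only library call used is the standard
setlocale(LC_ALL, "").  The returned name is copied because a later call
to setlocale may overwrite the string it points to.

```c
#include <locale.h>
#include <string.h>

/*
 * Hypothetical helper: select the native environment once and keep a
 * copy of its name.  Returns 0 on success, -1 if setlocale() fails or
 * the name does not fit in the caller's buffer.
 */
int save_native_locale(char *buf, size_t size)
{
    /* "" asks the implementation to pick the locale from the environment
       or from its own implementation-defined default. */
    const char *name = setlocale(LC_ALL, "");

    if (name == NULL || strlen(name) >= size)
        return -1;

    strcpy(buf, name);   /* safe: length was checked above */
    return 0;
}
```

What the saved name looks like depends entirely on the system and on the
environment the process was started with.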

When the user modifies one of the locale environment variables, bash has to
reset the appropriate locale setting.  It does this by reproducing the
search order specified by Posix: LC_ALL, LC_???, LANG, and the native
environment.  Bash chooses the locale returned from the first call to
setlocale() as the native environment.
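
The search order can be sketched as a small pure function.  This is an
illustration under stated assumptions, not bash's actual code: the name
resolve_locale and its parameters are hypothetical, and a variable that is
set to the empty string is treated here the same as an unset one.

```c
#include <stddef.h>

/* A variable counts only if it is set to a non-empty string. */
static int is_set(const char *value)
{
    return value != NULL && value[0] != '\0';
}

/*
 * Hypothetical helper: pick the locale name for one category.
 *   lc_all      - value of LC_ALL, or NULL if unset
 *   lc_category - value of the category variable (e.g. LC_COLLATE)
 *   lang        - value of LANG
 *   native      - name saved from the first setlocale(LC_ALL, "") call
 */
const char *resolve_locale(const char *lc_all, const char *lc_category,
                           const char *lang, const char *native)
{
    if (is_set(lc_all))
        return lc_all;
    if (is_set(lc_category))
        return lc_category;
    if (is_set(lang))
        return lang;
    return native;   /* nothing set: fall back to the native environment */
}
```

With none of the three variables set, the result is whatever name was
saved at startup, which need not be "C".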

I could (and may) change bash to use "" to select the native environment
in the call to setlocale when none of the appropriate variables are set,
but that has problems of its own. setlocale() queries environment
variables, which requires bash to either replace getenv() (which it
attempts to do) or make `environ' point to bash's idea of its export
environment.  In either case, if bash is able to do it, only exported
variables will be returned.

There's no guarantee that this will work any better, either.  There's no
requirement that setlocale() call getenv() -- to satisfy Posix requirements
while still allowing library functions to be overridden, some systems do
the following:

        getenv(name) calls __getenv(name) to do the real work
        setlocale() calls __getenv to query environment variables

Sometimes this can be overcome by bash pointing environ to its own
exported environment, but other systems make `environ' an alias to the
`real' environment (e.g., (*_NSGetEnviron())).

On these systems (e.g., Mac OS X), the call to setlocale(var, "") will
end up querying the program's initial environment, regardless of any
changes.  You might even get an initial value of, for instance, LC_COLLATE
after you've unset it and changed LC_ALL or LANG.
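
A toy model of that failure, with no real library internals involved.
Every name here (model_internal_getenv, model_shell_getenv,
model_setlocale_lookup, and the two tables) is hypothetical; the tables
stand in for the program's initial environment and for the shell's
current export environment after the user has unset LC_COLLATE and set
LC_ALL.

```c
#include <stddef.h>
#include <string.h>

struct env_entry { const char *name; const char *value; };

/* What the process was started with. */
static const struct env_entry initial_env[] = {
    { "LC_COLLATE", "en_US" },
    { NULL, NULL }
};

/* What the shell would export now: LC_COLLATE unset, LC_ALL set. */
static const struct env_entry shell_env[] = {
    { "LC_ALL", "C" },
    { NULL, NULL }
};

static const char *lookup(const struct env_entry *env, const char *name)
{
    for (; env->name != NULL; env++)
        if (strcmp(env->name, name) == 0)
            return env->value;
    return NULL;
}

/* Stand-in for the library's private lookup (the "__getenv" above). */
const char *model_internal_getenv(const char *name)
{
    return lookup(initial_env, name);
}

/* Stand-in for the shell's replacement getenv(). */
const char *model_shell_getenv(const char *name)
{
    return lookup(shell_env, name);
}

/* The model's setlocale() bypasses the replaceable getenv() entirely. */
const char *model_setlocale_lookup(const char *name)
{
    return model_internal_getenv(name);
}
```

In this model the locale code still sees LC_COLLATE=en_US and never sees
LC_ALL=C, although the shell's own view is the opposite.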

Using "" to select the native environment would probably work better on
Linux, but nothing is going to work everywhere.  Maybe the best thing to
do is to call setlocale(var, "") only on systems that allow bash to replace
getenv().  There's no perfect solution.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                       Live Strong.  No day but today.
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/



