help-bash

Re: any plans for command substitution that preserves trailing newlines?


From: Chet Ramey
Subject: Re: any plans for command substitution that preserves trailing newlines?
Date: Wed, 26 Jan 2022 19:04:57 -0500
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.4.1

On 1/26/22 5:38 PM, Christoph Anton Mitterer wrote:

For instance, let's say the shell starts with an inherited environment
variable LANG=fr_FR.UTF-8, and no LC_ variables. You can set the shell
variable LC_ALL=C and the shell's locale settings will reflect that (or
LC_CTYPE, or whatever, to set individual categories).

Not all shells will do that, but bash will.
But AFAIU, these shells would then violate POSIX in that aspect?!

No, there's no requirement. POSIX lists LC_ALL, LC_COLLATE, and LC_CTYPE
in the `environment variables' section of the `sh' description, saying they
affect the shell's behavior. That's the standard description for
environment variables that affect setlocale().

It's pretty much a mixed bag among shells that claim some level of POSIX
conformance. Bash, yash, ksh93, yes. mksh, no. The ash-based shells are
split: dash, netbsd sh, no; freebsd-sh, gwsh, yes.


So if you want to temporarily control the locale a command substitution, or
any program the shell runs, gets, you have to save, set, export, and then
optionally reset all the variables you care about. I'm not saying this
doesn't give you a lot of freedom, but you do have to think about how
environment variables and child processes affect it.

But even that's then basically not really guaranteed to work, is it?

I'm talking about bash.

Or is there a portable way to query the internal locale state of a shell?

Not from outside the shell, no.


Consider nothing is set (neither LANG, nor LC_*)... a shell could have
set 'C' ... or 'C.UTF-8' ... or whatever.

Without any environment variables, you'll usually get the "system default
locale", which is what you get at program startup when you call
setlocale(LC_ALL, "").


If I now back up the old LANG/LC_* variables... most may not even be set.
So there's no known value one can restore.

You have to enumerate all LC_* and LANG and use something like

old_LC_CTYPE=${LC_CTYPE-_unset_}

and test for that value later. If it's `_unset_', you unset it.
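
A minimal sketch of that save-and-restore pattern, written to illustrate the idea rather than as a definitive implementation. The function names and the list of variables are assumptions for illustration (the list is not exhaustive); only the `_unset_` sentinel comes from the message above. It relies on `${var-word}`, which substitutes `word` only when the variable is unset, so an empty value is still told apart from an unset one.

```shell
# Assumed list of locale variables to track; extend it as needed.
locale_vars="LANG LC_ALL LC_COLLATE LC_CTYPE LC_MESSAGES LC_MONETARY LC_NUMERIC LC_TIME"

# Hypothetical helper: copy each variable into old_<name>,
# storing the sentinel _unset_ when the variable is not set.
save_locale() {
    for v in $locale_vars; do
        eval "old_$v=\${$v-_unset_}"
    done
}

# Hypothetical helper: put every variable back, unsetting those
# that were recorded as _unset_.
restore_locale() {
    for v in $locale_vars; do
        eval "saved=\$old_$v"
        if [ "$saved" = _unset_ ]; then
            unset "$v"
        else
            eval "$v=\$saved"
        fi
    done
}
```

Calling `save_locale` before changing any locale variable and `restore_locale` afterwards brings the values back; it does not track the export state.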

Or even if I backup the value and set/unset state of every LC_* value
before I set LC_ALL.

Exactly. But if LC_ALL was in the environment when this shell instance
started, you'll be modifying the locale that child processes will see.
So now you have to remember the export state and restore that too.
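
One possible way to remember the export state as well, sketched for LC_ALL only. The helper names are hypothetical. The check asks a child `sh` whether it can see LC_ALL, on the assumption that a child process sees only exported variables; the restore step relies on `unset` dropping the export attribute along with the value.

```shell
# Hypothetical helper: record the value and export state of LC_ALL.
save_LC_ALL() {
    old_LC_ALL=${LC_ALL-_unset_}
    # A child process sees LC_ALL only if it is exported.
    if sh -c 'test "${LC_ALL+x}" = x'; then
        old_LC_ALL_exported=yes
    else
        old_LC_ALL_exported=no
    fi
}

# Hypothetical helper: restore what save_LC_ALL recorded.
restore_LC_ALL() {
    unset LC_ALL    # removes the value and the export attribute
    if [ "$old_LC_ALL" != _unset_ ]; then
        LC_ALL=$old_LC_ALL
        if [ "$old_LC_ALL_exported" = yes ]; then
            export LC_ALL
        fi
    fi
}
```

Between the two calls, `LC_ALL=C; export LC_ALL` changes the locale for the shell and for its children; after `restore_LC_ALL`, children see LC_ALL again only if they did before.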

Once I restore afterwards, there's no guarantee that if I e.g. unset
LC_TIME again, that this unsetting will cause the internal state of
some shell to go back to whatever it was before.

It will depend on LC_ALL and LANG, like always.

Or maybe a shell anyway just acts when one of the LC_* var's is set,
not when it's unset.

No, it has to act on assignment and when it's unset.


Even if I try to be smart, and:
1st set all LC_* to LANG (if that was set before)
2nd set all LC_* (except LC_ALL) to their old value (or unset them if
     they were unset)
3rd set LC_ALL to its old value (or unset it)

it could just be that that last unsetting of LC_ALL sets e.g.
everything back to C, depending on how a given shell behaves.

I'd say that shells that understand locale variables all do the same thing,
and shells that don't are too much trouble to try.



And the trick of:
local LC_ALL=C
in some function, shouldn't work either, cause it would also set all
locale categories shell-wide?

It will, but they'll be restored when the function returns. That's what I
meant by letting the shell do it for you.
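
As a rough illustration of letting the shell do the restore, here is a sketch of a function that scopes LC_ALL=C to its own body. The function name `byte_length` is hypothetical, and `local` is a bash feature (also found in several other shells) rather than part of POSIX.

```shell
# Hypothetical helper: print the length of its first argument in bytes.
byte_length() {
    # Takes effect shell-wide while the function runs;
    # the previous state comes back when the function returns.
    local LC_ALL=C
    printf '%s\n' "${#1}"
}
```

After `byte_length "$some_string"` returns, LC_ALL is back to whatever it was before the call, including being unset.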


So the best thing one can do is to hope that a given shell does it
right, simply by unsetting LC_ALL... as bash does?!

I don't see how that would work, and I'm not sure what you mean by "as bash
does." The only thing you really need to do is to set and reset LC_ALL
around the single assignment statement that removes the last byte from the
string. If you have a shell that understands locale variables, that will do
the right thing. If you don't, well, then that shell probably performs all
its word expansion operations on bytes anyway.
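
A minimal sketch of that approach, assuming the usual trick of appending a sentinel byte inside the command substitution so that there are no trailing newlines left to strip. The sentinel `x`, the sample `printf` command, and the variable names are illustrative choices, not taken from the thread.

```shell
# Capture output that ends in newlines; the trailing x protects them
# from being stripped by the command substitution.
out=$(printf 'line\n\n'; printf x)

# Set LC_ALL=C only around the assignment that removes the sentinel,
# so the removal operates on a single byte.
old_LC_ALL=${LC_ALL-_unset_}
LC_ALL=C
out=${out%x}
if [ "$old_LC_ALL" = _unset_ ]; then
    unset LC_ALL
else
    LC_ALL=$old_LC_ALL
fi
```

Note that this form discards the exit status of the captured command, and it does not restore the export state of LC_ALL.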

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/


