bug-bash

leaks fd for internal functions but not external command


From: Chet Ramey
Subject: leaks fd for internal functions but not external command
Date: Thu, 25 Jul 2019 07:52:32 -0400
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

On 7/24/19 9:07 AM, Sam Liddicott wrote:
> Thanks for that thoughtful response.
> 
> * I understand that the design decision is for variable file
> descriptors to stay open after per-command redirection

Yes.

> * I understand that implementation constraints make it impossible to do
> this uniformly (for external command redirection)

I'm not sure "implementation constraints" is the right characterization.
Performing redirections in child processes is how shells work.
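
A minimal sketch of the difference under discussion, assuming bash 4.1 or
later with default shell options. The variable names are illustrative, and
`env true' merely stands in for any external command:

```bash
#!/usr/bin/env bash
unset fd_builtin fd_external

# Builtin: the redirection is performed in the current shell, so the
# variable is assigned and the descriptor stays open afterwards.
true {fd_builtin}>/dev/null
echo "after builtin:  fd_builtin=${fd_builtin:-unset}"

# External command: the redirection is performed in the forked child,
# so the parent shell never sees the variable assignment.
env true {fd_external}>/dev/null
echo "after external: fd_external=${fd_external:-unset}"

# Close the descriptor that the builtin case left open.
exec {fd_builtin}>&-
```

The first echo should report a descriptor number (10 or above), the second
should report the variable as unset.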

> * I understand that it is difficult for the script author to detect which
> case his code will be

I don't think this is true. The documentation is pretty clear.

> The shell normally does a great job of hiding the difference between
> internal and external commands, so even though it's very well documented,
> most of the time the user doesn't need to be aware. This is great for the
> user, and according to the principle of least surprise.

This isn't quite the case. The shell isn't particularly obscure about
the fact that printf has to be a builtin for `printf -v' to work, for
instance.

> 
> The syntactic sugar of having bash select a free fd (which is necessary for
> good composability of operations in complex script pipelines) is a great
> benefit, especially when mixing with older pipelines having fixed numeric fds.

Older shells, you mean. Yes, that was the original motivation for it.
POSIX makes the limitation implementation-defined, and bash has never had
it, so it's reasonably easy to choose something that won't collide. But
it's nice not to have to.

> You say that there are technical reasons why the syntactic sugar of also
> keeping the fd open can't be implemented uniformly.

Technical reasons in the sense of the relationship between parent and child
processes, I suppose you mean.

> I wonder if this puts unnecessary cognitive burden on the user, leading to
> reluctance to get the benefits, or to the introduction of latent bugs.
> 
> There is a case I explain below which can lead to a leaked fd being held on
> to by subsequently invoked external processes. Of course it will
> technically be the user's fault but I'm looking at reducing the cognitive
> burdens that make such a fault ultimately inevitable.
> 
> The cognitive burdens of leaving the fd open are:
> 
> 1. It breaks the normal expectation that per-command redirects are limited
> to the scope of the command.

Yes, it does keep the file descriptor open beyond the command. That is what
makes the feature different.


> A naked exec already works to hold open a variable fd in a wider scope if
> that's what the scripter actually wants: exec {fd}>... ;

Sure, you can do that too.


> 2. As syntactic sugar it moves, not removes, the boiler-plate burden
> 
> The naked exec (see above) that the syntactic sugar saves in the case where
> the fd should remain open is offset by the naked exec now required in order
> to close the fd in the traditional case where the fd should not be left open
> beyond the scope of the command.

So you're saying you need to close it explicitly with `exec'. Yes, that's
the `cost' of using the feature.

> 
> 3. The unmeetable cognitive burden is that in order to safely manage the
> previous two items, the user needs to know if the command will be external
> or internal or a function.

"Unmeetable?" Come on.

> 
> This makes it hard for the user to depend on this feature, because it is not
> possible to be sure at script author time whether a command is external. It
> may have become a function (due to export -f, source, etc.), which affects
> the execution environment.

I agree that export -f can change the environment from what the script
writer expected. That one is on the user, not the script writer. But if
it's a concern, the script writer can always use `command'.
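
A small sketch of that approach, assuming an external `date' is on PATH.
The function below is a hypothetical stand-in for one that arrived through
`export -f' or a sourced file:

```bash
#!/usr/bin/env bash
# Hypothetical wrapper shadowing the external date command.
date() { echo "wrapped"; }

# type -t reports how the name would be resolved right now.
kind=$(type -t date)

# A plain invocation runs the function.
via_function=$(date)

# `command' skips function lookup, so the external date runs instead.
via_command=$(command date +%Y)

echo "kind=$kind via_function=$via_function via_command=$via_command"
```

This only demonstrates that `command' bypasses shell functions; it makes no
claim about where the redirections of such a command are performed.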

> 4. The inevitable propagation of leaked fd's
> 
> The knowing user can remember to always use an identity wrapper function to
> force external commands to be treated as internal functions in order to get
> uniform behaviour, and also explicitly close the fd afterwards.

Explicitly closing the fd would always work, in the case that the fd is
opened by the parent, but will not work in the case where redirections are
performed in the child process, since the parent will not set the correct
shell variable.


> But other users may not know to close the fd which was never apparent (due
> to invoking an external command) but which becomes an fd leak when they
> combine it with other bash features (function wrapping of external commands,
> or an export -f environment that does this unawares) and those leaked fds may
> then be inherited by other invoked external processes which may hold on to
> them for some time.

You're mixing `users' with script writers here, unless you're talking
about running constructs like this as one-offs in an interactive shell,
which doesn't happen very often, if at all.

> 
> This contrived example minimises the pipeline fd contortions in order to
> show that when what was an external command then becomes an internal
> command, it can as a consequence result in an fd leak to external processes
> (bash+lsof+grep here) which may be long lived.

I think there's a misunderstanding here, and it's my fault. The {var}
syntax is identical to any other file descriptor that's the target of a
redirection used with `exec', and the close-on-exec flag remains unset.
There's no `leaking' that is any different than any other redirection.
Sorry for the mistake.
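
As a sketch of what an unset close-on-exec flag means in practice: a
descriptor opened with {var} and `exec' is inherited by child processes
until it is closed. The names held_fd and tmpfile are illustrative, and the
child is started through $BASH only so that the multi-digit descriptor
number is handled the same way in parent and child:

```bash
#!/usr/bin/env bash
tmpfile=$(mktemp)

# Open a descriptor in the parent shell; bash picks the number.
exec {held_fd}>"$tmpfile"

# A separate bash process inherits the open descriptor and can write to it.
"$BASH" -c "echo 'written by child' >&$held_fd"

# Closing it in the parent stops later children from inheriting it.
exec {held_fd}>&-

child_output=$(cat "$tmpfile")
rm -f "$tmpfile"
echo "$child_output"
```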


> I recognise what you say about past design decisions, but for the future,
> as it is hard to safely get the benefit of leaving the handle open for
> variable per-command redirects, even for users who know about it, I wonder
> if the syntactic sugar might be redefined to reduce the cognitive burden
> and widen the benefit for the most valued variable fd's feature.

Yes, leaving the fd open was the design decision. The idea was to more
closely mimic open(), giving the script writer a handle he could use to
manage the file descriptor without using `exec'. Other than that, the
semantics of the resulting file descriptor are the same as if it had
been chosen by the user and used with `exec'.

> 
> If the variable fd syntactic sugar were re-designed so that variable
> handles were also limited to the scope of the command, the same as for
> external commands, the same as for numeric handles, then:

If I were redesigning it today, I might choose to do that. Back in 2009,
I made a different decision.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
