Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed ou

coreutils

From:	Carl Edquist
Subject:	Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Date:	Tue, 29 Nov 2022 15:48:08 -0600 (CST)

On Tue, 29 Nov 2022, Pádraig Brady wrote:

On 29/11/2022 17:32, Carl Edquist wrote:

 ...

 If this kind of detect-broken-output-pipe logic were added to filter
 utils generally, the above example (with 4 cats) would return to the
 shell immediately.


Right. Thanks for discussing the more general pattern.
I.e. that SIGPIPE doesn't cascade back up the pipeline,
only upon attempted write to the pipe.
So it's not really a tee issue, more of a general pattern.

So it wouldn't be wrong to add this to tee (by default), but I'm not sure how useful it is given this is a general issue for all filters.

That makes sense; though I suppose it would continue to become more useful for these types of cases as it gets added to more filters :)

Also I'm a bit wary of inducing SIGPIPE as traditionally it hasn't been handled well:


But wait now, are we talking about inducing SIGPIPE?

In the current patch for tee, I think the idea was just to remove an output from the list when it's detected to be a broken pipe, allowing tee to exit sooner if all of its outputs are detected to be broken.

Similarly for the general filter case (usually with only a single output), the idea would be to allow the program to exit right away if it's detected that the output is a broken pipe.

https://www.pixelbeat.org/programming/sigpipe_handling.html

Actually, to me, if anything this page seems to serve as motivation to add broken-pipe checking to coreutils filters in general.


The article you linked ("Don't fear SIGPIPE!") discusses three topics:

1. First it discusses default SIGPIPE behavior - programs that do not specifically handle SIGPIPE do the right thing by default (they get killed) when they continue writing to their output pipe after it breaks.

We wouldn't be changing this for any filters in coreutils. Writing to a broken pipe should still produce a SIGPIPE to kill the program. (Even with broken-pipe detection, this can still happen if input becomes ready, then the output pipe's read end is closed, then the write is attempted.)

tee is actually a special case (if -p is given), because then SIGPIPE does not kill the process, but the write will still fail with EPIPE and tee will remove that output.

We also wouldn't really be inducing anything new for programs earlier in the pipeline. If they don't handle SIGPIPE, they will just get killed with it more promptly - they will end up writing to a broken pipe one write(2) call sooner.

2. The "Incorrect handling of SIGPIPE" section discusses programs that attempt to handle SIGPIPE but do so poorly. This doesn't apply to us either. Filters that add broken-pipe detection do not need to add SIGPIPE handling. And programs that handle it poorly, earlier in the pipeline, will have their problems regardless. (Again, just one write(2) call sooner.)

3. Finally, the "Cases where SIGPIPE is not useful" section actually highlights why we *should* add this broken-pipe checking to filters in general.

The "Intermittent sources" subsection discusses exactly what we are talking about fixing:

For example 'cat | grep -m1 exit' will only exit, when you type a line after you type "exit".

If we added broken-pipe checking to cat, then this example would behave like the user would have wanted - typing "exit" would cause grep to exit, and cat would detect its output pipe breaking, and exit immediately.

The other example about 'tail' was fixed already, as this kind of checking was added to tail, as we've discussed. It's a good start! The more utils we add it to, the more will be able to benefit.

The "Multiple outputs" subsection is specific to tee, and if anything perhaps suggests that the '-p' option should be on by default. That is, it makes an argument for why it makes sense for tee to avoid letting a SIGPIPE kill it, but rather only to exit when all the input is consumed or all the outputs have been removed due to write errors.

The "Long lived services" subsection is a generalization of what was just said about tee - namely that it makes sense that some programs want to continue after a failed write attempt into a broken pipe, and such programs need to handle or ignore SIGPIPE. This is true for such programs already, and adding broken-pipe checking to a filter in the same pipeline doesn't change that at all. (Again, it will just cause them to get a SIGPIPE/EPIPE *promptly* - one write call sooner - when the final consumer of the output completes.)
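To make the "ignore SIGPIPE and continue after a failed write" case concrete, here is a minimal sketch in C. It is not code from tee; the names ignore_sigpipe and write_all_or_fail are hypothetical, and the sketch assumes the program has chosen to ignore SIGPIPE rather than install a handler. With the signal ignored, a write to a pipe with no reader fails with EPIPE, and a tee-like caller could then drop that output from its list:

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: ignore SIGPIPE so that writes to a broken pipe
   fail with EPIPE instead of killing the process.  */
void
ignore_sigpipe (void)
{
  signal (SIGPIPE, SIG_IGN);
}

/* Hypothetical helper (not from coreutils): write LEN bytes of BUF to FD.
   Returns true on success.  Returns false with errno set on failure;
   when SIGPIPE is ignored and the reader is gone, errno is EPIPE.  */
bool
write_all_or_fail (int fd, const char *buf, size_t len)
{
  assert (buf != NULL);
  while (len > 0)
    {
      ssize_t n = write (fd, buf, len);
      if (n < 0)
        {
          if (errno == EINTR)
            continue;           /* interrupted before writing; retry */
          return false;         /* e.g. EPIPE: the reader has gone away */
        }
      buf += n;
      len -= (size_t) n;
    }
  return true;
}
```

A caller with several outputs would call ignore_sigpipe() once at startup and stop using any output for which write_all_or_fail() returns false. Note that this only notices the broken pipe at write time, which is exactly the delay that broken-pipe checking is meant to remove.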

...

Or perhaps when you mention "inducing SIGPIPE", you are referring to how tail(1) does things currently (when it detects a broken output), by attempting raise(SIGPIPE) followed by exit(EXIT_FAILURE). It seems this is just an attempt to make it look to the waiting parent process that tail died trying to write to a broken pipe (somewhat of a white lie). Most likely it could just exit(EXIT_FAILURE) without confusing the caller. So if you'd like to avoid that, it's probably not actually necessary for tail (or other filters) to kill themselves with a SIGPIPE. But it's not harmful either. It just produces the same effect it would have on its next write attempt (either way it gets a SIGPIPE). It does *not* attempt to handle the SIGPIPE. (The linked article's "Incorrect handling of SIGPIPE" is about programs that attempt to handle this signal, but do so poorly.)
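For reference, the raise-then-exit idiom described above looks roughly like the following. This is a sketch of the idea rather than a copy of tail's source; the names exit_like_sigpipe and status_after_exit_like_sigpipe are hypothetical, and the second function exists only to show what a waiting parent observes:

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Terminate as if killed by a write to a broken pipe.  If SIGPIPE is
   ignored (or handled), raise() returns and we fall back to a plain
   failure exit.  */
void
exit_like_sigpipe (void)
{
  raise (SIGPIPE);
  exit (EXIT_FAILURE);
}

/* Hypothetical demo helper: fork a child that calls exit_like_sigpipe()
   and return the wait status seen by the parent, or -1 on error.  */
int
status_after_exit_like_sigpipe (bool ignore_sigpipe)
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      signal (SIGPIPE, ignore_sigpipe ? SIG_IGN : SIG_DFL);
      exit_like_sigpipe ();     /* never returns */
    }
  int status;
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  /* The child was either killed by the signal or exited normally.  */
  assert (WIFSIGNALED (status) || WIFEXITED (status));
  return status;
}
```

With the default disposition the parent sees WIFSIGNALED(status) with WTERMSIG(status) == SIGPIPE, the same thing it would see had the child died on its next write. With SIGPIPE ignored the parent sees a normal exit with EXIT_FAILURE.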

...

So, in summary, when it comes to the idea of adding broken-pipe checking to tee and filters in general, I'm seeing only wins -- and things that won't change, but no losses :)

Well... Except that it's work to add this to each filter. But, to lower the bar there a bit, they would not all have to be done at once.

I can imagine you might want to have some common code, or at least a common idiom, for doing this type of thing in each filter. I would point out a couple of things, though. tail(1) is somewhat of a special case because it does the 'tail -f' thing of continually checking multiple input files for changes. (It doesn't quit when it hits EOF on input.) tee(1) is also somewhat special because it has multiple outputs, and it has to check for each of them breaking. So the implementation for each of these will likely be different from the more general filter case, where the inputs do not continue past their respective EOFs, and where there is only a single output to check.

I had a peek at the tail(1) implementation (in check_output_alive()), and I see it does a poll just on stdout, with a 0 timeout. I imagine in the more general filter case, you might do poll on both input and output(s) together, with an unlimited timeout. The 0 timeout thing in 'tail -f' mainly makes sense because it continually loops through multiple inputs, checking for new data in files after EOF is hit. (That is, something unique to tail.)
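As a rough sketch of both variants (this is not the tail or tee code; the function name output_looks_broken and the choice of revents bits are my assumptions, and what poll reports for a broken pipe differs between platforms): passing in_fd = -1 and a 0 timeout gives the tail-style non-blocking check on the output alone, while passing a real input fd and a timeout of -1 blocks until either input is ready or the output breaks.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper: wait up to TIMEOUT_MS milliseconds (-1 = forever,
   0 = do not block) for IN_FD to become readable or for OUT_FD to report
   an error or hangup.  Pass IN_FD = -1 to check the output only; poll()
   ignores negative descriptors.  Returns true if the output looks
   broken, false otherwise (input ready, timeout, or poll failure).  */
bool
output_looks_broken (int in_fd, int out_fd, int timeout_ms)
{
  struct pollfd pfd[2];
  int n;

  assert (out_fd >= 0);
  pfd[0].fd = in_fd;
  pfd[0].events = POLLIN;
  pfd[0].revents = 0;
  pfd[1].fd = out_fd;
  pfd[1].events = 0;            /* POLLERR and POLLHUP are always reported */
  pfd[1].revents = 0;

  do
    n = poll (pfd, 2, timeout_ms);
  while (n < 0 && errno == EINTR);

  if (n <= 0)
    return false;               /* timeout or poll failure: nothing seen */

  /* Assumption: a pipe whose read end is closed shows POLLERR (as on
     Linux) or POLLHUP on its write end.  Other systems may differ.  */
  return (pfd[1].revents & (POLLERR | POLLHUP)) != 0;
}
```

A simple filter loop could call output_looks_broken(STDIN_FILENO, STDOUT_FILENO, -1) before each read and exit when it returns true. As noted earlier, the write itself can still hit a broken pipe if the reader goes away between the check and the write.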

... Sorry to see the poll thing is complicated by cross-platform behavior differences :(


...

My apologies for the long email...  Hopefully some food for thought! :)

Carl
