Re: Why does close_stdout close stdout and stderr?


From: Assaf Gordon
Date: Tue, 7 May 2019 03:44:55 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1

Hello all,

I'm joining this discussion a bit late, but I'd like to add another
point of view on why fclose is important and useful:

On 2019-04-29 1:45 p.m., Florian Weimer wrote:
> I get that error checking is important.  But why not just use ferror
> and fflush?  Closing the streams is excessive and tends to introduce
> use-after-free issues, as evidenced by the sanitizer workarounds.

There is at least one (sadly) common issue in my work: detecting
failed writes due to a full disk (or, less commonly, an exceeded user
quota).

It is common in bioinformatics to process large files ("big data" and
all) - a program reading a 400MB file and writing a 2GB file is common.
Larger outputs (e.g. 20GB) are also not rare.

Many of these programs write to stdout and expect the user to
redirect the output to a file or a pipe. In other cases, even when
the main output goes to a file, progress reports and error messages
still go to stdout/stderr.

For those who think disk space is a non-issue, consider the case
where the processing happens in "the cloud": there, disk space comes
at a premium cost, and it is common to spin up virtual machines with
smaller disks (where "smaller" can still mean a 300GB disk - which
can fill up quite fast).

---

Forcing close + error checking on stdout AND stderr is currently the
only (simple?) way to ensure the output was not lost due to disk-full
errors (for brevity, I ignore the low-level issues of write-back,
caching, journaling, etc.).
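
For reference, a minimal sketch of the usual way this is wired up
with gnulib's closeout module (the standard coreutils pattern;
initialization details omitted):

  #include <stdlib.h>     /* atexit */
  #include "closeout.h"   /* gnulib module declaring close_stdout */

  int
  main (int argc, char **argv)
  {
    /* Registered via atexit(), close_stdout flushes and closes
       stdout and stderr when the program exits.  A pending write
       error on stdout is reported and turns the exit status
       nonzero; for stderr, only the exit status changes.  */
    atexit (close_stdout);

    /* ... normal processing, writing to stdout ... */
    return EXIT_SUCCESS;
  }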

I'm attaching a sample test program to illustrate some points.
The program writes to stdout/stderr then optionally calls fclose/fflush/fsync.
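
Since attachments do not always survive archiving, here is a minimal
sketch of what such a test program could look like (a reconstruction
under my own assumptions - argument handling and exact messages may
differ from the real aa.c):

  /* aa.c (sketch) - test write-error detection on stdout/stderr.
     Usage: ./aa <stdout|stderr> <none|fclose|fflush|fsync>  */
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <errno.h>
  #include <unistd.h>     /* fsync */

  int
  main (int argc, char **argv)
  {
    if (argc != 3)
      {
        fprintf (stderr,
                 "usage: %s <stdout|stderr> <none|fclose|fflush|fsync>\n",
                 argv[0]);
        return EXIT_FAILURE;
      }

    FILE *fp = strcmp (argv[1], "stderr") == 0 ? stderr : stdout;
    const char *method = argv[2];
    int err = 0;

    fprintf (fp, "Hello\n");

    if (strcmp (method, "fclose") == 0)
      /* ferror() also catches a write that already failed on the
         unbuffered stderr stream.  */
      err = (ferror (fp) || fclose (fp) != 0);
    else if (strcmp (method, "fflush") == 0)
      err = (fflush (fp) != 0);
    else if (strcmp (method, "fsync") == 0)
      /* fsync() wants a file descriptor, so flush stdio first.  */
      err = (fflush (fp) != 0 || fsync (fileno (fp)) != 0);

    if (err && fp != stderr)
      /* When stderr itself is the failing stream, this report is
         lost as well - only the exit status survives (point 3).  */
      fprintf (stderr, "aa: %s failed: %s\n", method, strerror (errno));

    return err ? EXIT_FAILURE : EXIT_SUCCESS;
  }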

Note the following:

1.
Without fclose, "disk-full" errors are not detected, and information
is lost (either from stdout, e.g. the program's output, or from
stderr, e.g. its error messages, logs, or progress info). The stdio
buffers are still flushed when the program exits, but a write failure
at that point can no longer be reported or affect the exit status:

  $ ./aa stdout none > /dev/full && echo ok || echo error
  ok

2.
If we force fclose on stdout, errors are detected and reported:

  $ ./aa stdout fclose > /dev/full && echo ok || echo error
  aa: fclose failed: No space left on device
  error

3.
If we force fclose on stderr with disk-full, error messages are lost,
but at least the exit code will indicate an error:

  $ ./aa stderr fclose 2> /dev/full && echo ok || echo error
  error

4.
"fflush" instead of "fclose" seems to work OK, but I do not know
if there are other side effects:

  $ ./aa stdout fflush > /dev/full && echo ok || echo error
  aa: fflush failed: No space left on device
  error

5.
Calling "fsync" has the disadvantage that it always fails on special
files (e.g. pipes/sockets/char-devs).
This results in weird behavior, where "fsync" will report errors
when no I/O errors actually happened:

  $ ./aa stdout fsync && echo ok || echo error
  Hello
  aa: fsync failed: Invalid argument
  error

  $ ./aa stdout fsync | tee log
  Hello
  aa: fsync failed: Invalid argument

This perhaps could be avoided by checking fsync for errors and
falling back to fclose when the error is spurious - but that's more
code, not less...
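
For illustration, a rough sketch of that fallback (die() is a
hypothetical helper here, not gnulib API):

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <errno.h>
  #include <unistd.h>     /* fsync, STDOUT_FILENO */

  /* Hypothetical helper: print a diagnostic and exit with failure.  */
  static void
  die (const char *what)
  {
    fprintf (stderr, "%s: %s\n", what, strerror (errno));
    exit (EXIT_FAILURE);
  }

  /* Sketch: run fsync but treat EINVAL on non-syncable files
     (pipes, ttys, sockets) as benign, then still rely on
     fflush/fclose for the real write-error check.  */
  static void
  close_stdout_carefully (void)
  {
    if (fflush (stdout) != 0)
      die ("write error");
    if (fsync (STDOUT_FILENO) != 0 && errno != EINVAL)
      die ("fsync error");
    if (fclose (stdout) != 0)
      die ("write error");
  }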

---

For these reasons, I strongly encourage keeping close_stdout +
close_stderr.

regards,
 - assaf





Attachment: aa.c