bug-gnulib

Re: What to do about gnulib libio dependencies?


From: Bruno Haible
Subject: Re: What to do about gnulib libio dependencies?
Date: Wed, 22 Aug 2018 04:59:27 +0200
User-agent: KMail/5.1.3 (Linux/4.4.0-130-generic; KDE/5.18.0; x86_64; ; )

Zack Weinberg wrote:
> > I think it would clarify this discussion if you gave concrete examples
> > of existing programs that use these functions, and described what they
> > are doing with them that can't be accomplished using the standard
> > interfaces.

Of course, it makes sense to review
  - whether the functions are useful,
  - whether the API is adequate.

* freadptr and freadseek are performance boosters for applications that
  can benefit from dealing with an entire buffer at once, rather than
  reading and handling byte after byte. GNU m4 makes use of them and got
  a 17% speedup from that [1]. Other programs (from 'iconv' to JSON parsers)
  surely could make use of them too. So far, programs that want to handle
  entire buffers of input at once bypass stdio and operate on file
  descriptors. I feel this is unfortunate: you should be able to use
  stdio AND get decent performance when possible.

  Also, gnulib uses this facility in its 'getndelim2' function, which is
  a generalization of 'getdelim' from glibc. Glibc's 'getdelim' implementation
  uses this trick already. It's a pity if functions like 'getdelim', when
  defined in applications, cannot be as fast as the same functions
  in glibc.

* freadahead currently has two uses:
  - It's used by the implementation of freadseek.
  - It's used as an optimization that removes one system call just before
    the termination of most coreutils programs. (See [2] line 83.)

* fseterr is needed when an application wants to implement functions that,
  like fprintf, set the error indicator on a FILE stream in certain
  conditions (such as an invalid argument or out-of-memory).

  POSIX provides the 'ferror' and 'clearerr' functions; fseterr is in the
  same camp.

  There are ways to implement this function in a portable but very expensive
  way; see [3] lines 57..80. But no one wants such an expensive implementation.

* fbufmode

  A similar situation occurs with setvbuf: POSIX standardized the API to
  set the buffering mode and size. Glibc has a function __fbufsize to retrieve
  the buffer size, but no function to retrieve the buffering mode.

  Note: This function is not 100% portable: On native Windows, it is
  impossible to distinguish a stream in _IOFBF mode from a stream in _IOLBF
  mode.

  This function is not currently used by any application I know of. But it
  complements __fbufsize which is already in glibc.
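
To make the fseterr use case above concrete, here is a minimal sketch, not
gnulib's code. sketch_fseterr and put_repeated are hypothetical names made up
for this example. The stand-in assumes that, where <stdio.h> defines
_IO_ERR_SEEN (as recent glibc does), the error indicator is that bit in the
FILE's _flags field; elsewhere it falls back to the expensive
dup/close/dup2 sequence that [3] describes.

```c
#include <assert.h>
#include <stdio.h>
#ifndef _IO_ERR_SEEN
# include <errno.h>
# include <stdlib.h>
# include <unistd.h>
#endif

/* Hypothetical stand-in for fseterr: sets the error indicator of FP.  */
static void sketch_fseterr(FILE *fp)
{
#ifdef _IO_ERR_SEEN
    /* Assumption: glibc-style FILE, where the indicator is one flag bit.  */
    fp->_flags |= _IO_ERR_SEEN;
#else
    /* Portable but expensive: provoke a failing write, then restore the
       file descriptor.  Not thread-safe, and not repairable if dup2 fails.  */
    int saved_errno = errno;
    int fd, fd2;

    fflush(fp);
    fd = fileno(fp);
    fd2 = dup(fd);
    if (fd2 >= 0) {
        close(fd);
        fputc('\0', fp);
        fflush(fp);             /* fails, which sets the error indicator */
        if (dup2(fd2, fd) < 0)
            abort();
        close(fd2);
    }
    errno = saved_errno;
#endif
}

/* Hypothetical application function: writes S to FP COUNT times.
   Like fprintf, it reports an invalid argument by returning -1 and
   setting the error indicator of the stream.  */
int put_repeated(FILE *fp, const char *s, int count)
{
    assert(fp != NULL && s != NULL);
    if (count < 0) {
        sketch_fseterr(fp);
        return -1;
    }
    for (int i = 0; i < count; i++)
        if (fputs(s, fp) == EOF)
            return -1;
    return 0;
}
```

A caller can then treat put_repeated like any stdio output function: ignore
the individual return values and check ferror (fp) once at the end.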

Florian Weimer writes:
>  freadahead and freadptr are problematic for wide-oriented streams,
>  but we have ABI exposure for the read pointers already for the inline
>  copy of fputc_unlocked. The only caveat is that for fputc_unlocked,
>  we can provide compatibility by always having an empty read buffer
>  (at a cost to performance). With the other interfaces, this might
>  not be a possibility. 

In situations where you can't support freadptr and freadseek (e.g.
if the stream is unbuffered, or if you have chosen to store the bytes
in reverse order in memory, or XORed with some value, or whatever),
you can make freadptr return NULL. That's just an indicator to the
application that tells it "no optimization is possible - use a
classical fgetc loop".
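
A minimal sketch of that contract from the application's side (not gnulib's
or glibc's actual code): ask for the buffer, handle it in one go when a
pointer comes back, and fall back to fgetc when NULL comes back.
sketch_freadptr and sketch_skip are hypothetical stand-ins for freadptr and
freadseek, so that the example compiles without gnulib. On glibc they assume
the _IO_read_ptr/_IO_read_end fields of FILE are accessible; elsewhere the
stand-in returns NULL every time, which is exactly the "no optimization
possible" case. count_newlines is also made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for freadptr: returns a pointer to the bytes
   waiting in the input buffer and stores their count in *SIZEP, or
   returns NULL if no such buffer can be offered.  */
static const char *sketch_freadptr(FILE *fp, size_t *sizep)
{
#if defined __GLIBC__ && !defined __UCLIBC__
    size_t size;

    if (fp->_IO_write_ptr > fp->_IO_write_base)
        return NULL;            /* stream is currently being written to */
    size = (size_t) (fp->_IO_read_end - fp->_IO_read_ptr);
    if (size == 0)
        return NULL;            /* buffer is empty; fgetc will refill it */
    *sizep = size;
    return fp->_IO_read_ptr;
#else
    (void) fp;
    (void) sizep;
    return NULL;                /* always legal: caller uses fgetc */
#endif
}

/* Hypothetical stand-in for freadseek: consumes N bytes that were
   previously obtained through sketch_freadptr.  */
static void sketch_skip(FILE *fp, size_t n)
{
#if defined __GLIBC__ && !defined __UCLIBC__
    fp->_IO_read_ptr += n;
#else
    while (n-- > 0 && fgetc(fp) != EOF)
        ;
#endif
}

/* Counts the newlines in the remaining input of FP.  */
size_t count_newlines(FILE *fp)
{
    size_t count = 0;

    assert(fp != NULL);
    for (;;) {
        size_t size;
        const char *buf = sketch_freadptr(fp, &size);

        if (buf != NULL) {
            /* Fast path: scan the entire buffer at once.  */
            const char *p = buf;
            const char *end = buf + size;

            while ((p = memchr(p, '\n', (size_t) (end - p))) != NULL) {
                count++;
                p++;
            }
            sketch_skip(fp, size);
        } else {
            /* Classical path: one byte; this also refills the buffer.  */
            int c = fgetc(fp);

            if (c == EOF)
                break;
            if (c == '\n')
                count++;
        }
    }
    return count;
}
```

The result is the same on both paths; only the number of per-byte function
calls differs.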

Regarding freadahead, it does not constrain the implementation:
  "Returns the number of bytes waiting in the input buffer of STREAM.
   This includes both the bytes that have been read from the underlying
   input source and the bytes that have been pushed back through 'ungetc'."
The function just returns a number; there is no guarantee that the
bytes "waiting in the input buffer" are stored in a certain way or in
a certain place.
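
Since only the number matters, the closein.c use mentioned above can be
sketched as follows. This is a simplified illustration, not the code in [2]:
sketch_freadahead and sync_input_before_close are hypothetical names, the
glibc branch assumes the FILE read pointers are accessible and ignores bytes
pushed back through ungetc, and fseek is used here where a real
implementation would more likely use fseeko.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for freadahead: number of bytes waiting in the
   input buffer of FP.  Where it cannot tell, it returns 1, so that the
   caller conservatively does the extra work.  */
static size_t sketch_freadahead(FILE *fp)
{
#if defined __GLIBC__ && !defined __UCLIBC__
    if (fp->_IO_write_ptr > fp->_IO_write_base)
        return 0;
    return (size_t) (fp->_IO_read_end - fp->_IO_read_ptr);
#else
    (void) fp;
    return 1;
#endif
}

/* Before closing an input stream, hand unread bytes back to the file
   descriptor so that another process sharing it sees the right offset.
   Skips the seek entirely when the buffer is known to be empty.
   Returns 0 on success, -1 on failure.  */
int sync_input_before_close(FILE *fp)
{
    assert(fp != NULL);
    if (sketch_freadahead(fp) > 0) {
        /* Only flush if the stream is seekable; fflush is entitled to
           fail on non-seekable input streams.  */
        if (fseek(fp, 0, SEEK_CUR) == 0 && fflush(fp) != 0)
            return -1;
    }
    return 0;
}
```

When the buffer is empty, which is the common case at program exit, the
function returns without making any system call.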

Carlos O'Donell writes:
> If we implement these interfaces in glibc can we avoid this situation
> happening again in the future?

You can at least try to avoid such situations by providing an API that
is well thought-out. Each time some code in glibc makes use of glibc
internals, ask yourself whether application programs can use the same
facilities and, if not, what they lose without these facilities.

Bruno

[1] https://lists.gnu.org/archive/html/m4-patches/2009-02/msg00012.html
[2] https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/closein.c
[3] https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/fseterr.c
