bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: proposed new module breadlinkat


From: Bruno Haible
Subject: Re: proposed new module breadlinkat
Date: Tue, 29 Mar 2011 12:29:45 +0100
User-agent: KMail/1.9.9

Hi Paul,

> In seeking to have GNU Emacs use gnulib's areadlink module I
> ran into a problem.  Emacs has its own xmalloc and xfree
> functions that block interrupts, and it can't simply use
> areadlink since areadlink uses malloc without blocking
> interrupts.  One workaround would be to block interrupts for
> the entire duration of the areadlink, but that would be bad
> because the underlying readlink may take some time
> (especially under NFS).  Better would be a variant of
> areadlink that lets Emacs uses its own xmalloc, xfree, etc.
> ...
> +char *
> +breadlinkat (int fd, char const *filename,
> +             char *buffer, size_t buffer_size,
> +             void *(*pmalloc) (size_t),
> +             void *(*prealloc) (void *, size_t),
> +             void (*pfree) (void *),
> +             void (*palloc_die) (void),
> +             ssize_t (*preadlinkat) (int, char const *, char *, size_t))

That looks like a lot of complexity that is useful only to Emacs and
not to other packages.

Why is it only useful to Emacs? The explanation of "blocking interrupts"
in emacs/src/blockinput.h [1]

  "When Emacs is using signal-driven input, the processing of those
   input signals can get pretty hairy.  For example, when Emacs is
   running under X windows, handling an input signal can entail
   retrieving events from the X event queue, or making other X calls.

   If an input signal occurs while Emacs is in the midst of some
   non-reentrant code, and the signal processing invokes that same
   code, we lose.  For example, malloc and the Xlib functions aren't
   usually re-entrant, and both are used by the X input signal handler
   - if we try to process an input signal in the midst of executing
   any of these functions, we'll lose."

sounds like an ancient (signal based) approach to multiprocessing.
Nowadays thread primitives exist and work on all platforms. The last
platform which did not have working multithreading was Cygwin 1.5.x,
which is obsoleted by Cygwin 1.7 for more than one year. Even DJGPP
has threads, via GNU pth [2].

Why do I find that this is a lot of complexity? Because all that you
effectively need is a patch like this:


--- lib/areadlink.c.orig        Tue Mar 29 13:00:03 2011
+++ lib/areadlink.c     Tue Mar 29 12:59:48 2011
@@ -39,6 +39,11 @@
 #undef malloc
 #undef realloc
 
+/* Inside Emacs, use xmalloc, xrealloc, because the system functions don't
+   block input.  */
+#define malloc xmalloc
+#define realloc xrealloc
+
 /* The initial buffer size for the link value.  A power of 2
    detects arithmetic overflow earlier, but is not required.  */
 enum {


If you store this file in a file emacs/gnulib-overrides/lib/areadlink.c.diff
and pass the option --local-dir=gnulib-overrides in the variable
GNULIB_TOOL_FLAGS in Makefile.in, you're done.


If you still want to pursue the 'breadlink' approach, then here are three
further suggestions:

  - Choice of a good name: It's not clear what the prefix 'b' means.
    The 'a' in 'areadlink' means "allocating". As I understand it here,
    the purpose is a "readlink with customizable allocation". So
    'careadlink' would be a better name, no?

  - About the API: Aren't the 4 function pointers pmalloc, prealloc,
    pfree, palloc_die related? Other functions with customizable allocation
    will want to use the same quadruple. Therefore it would make sense
    to group the 4 function pointers in a single 'struct'. In C++ this
    concept is called "allocator" [3].

  - Update the module dependency list in lib/relocwrapper.c and the file list
    in modules/relocatable-prog-wrapper.


>  1.  areadlink.c does '#undef malloc', '#undef realloc' but
>      areadlinkat.c does not.  Is this intended?  (This issue
>      is independent of the above change.)

The second entry of "gitk lib/areadlink.c" shows the reason for these two
#undefs: Fix a link error in the relocwrapper module. I had verified that
areadlink.c passes only positive sizes to malloc and realloc. The same
verification also holds for areadlinkat.c. So, adding
  #undef malloc
  #undef realloc
in areadlinkat.c would be only a micro-optimization, not a fix of a link
error. Feel free to add it, though, if you find it appropriate.

>  2.  The areadlink-with-size module is obsolescent, now that
>      areadlink, areadlinkat, nd breadlinkat read the link
>      contents into stack space first (unless the link
>      contents' length is 1 KiB or more, which is so rare we
>      needn't worry about it for performance reasons).  I
>      originally did the readlink-with-size business back
>      when the code always did malloc first.  The stack-based
>      performance improvement obsoletes the -with-size
>      performance improvement, so I propose that we remove
>      the areadlink-with-size module and replace all calls
>      with calls to areadlink.
>  3.  Similarly for areadlinkat-with-size.

Sounds perfectly fine.

Bruno

[1] http://git.savannah.gnu.org/gitweb/?p=emacs.git;a=blob;f=src/blockinput.h
[2] http://www.mail-archive.com/pth-users%40gnu.org/msg00516.html
[3] http://www.cplusplus.com/reference/std/memory/allocator/
-- 
In memoriam Rachel Levy <http://en.wikipedia.org/wiki/Rachel_Levy>



reply via email to

[Prev in Thread] Current Thread [Next in Thread]