qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: libnbd thread-local errors and dlclose (was: Re: [ANNOUNCE] libblkio


From: Daniel P . Berrangé
Subject: Re: libnbd thread-local errors and dlclose (was: Re: [ANNOUNCE] libblkio v0.1.0 preview release)
Date: Thu, 29 Apr 2021 18:27:48 +0100
User-agent: Mutt/2.0.6 (2021-03-06)

On Thu, Apr 29, 2021 at 06:17:32PM +0100, Richard W.M. Jones wrote:
> On Thu, Apr 29, 2021 at 04:08:22PM +0100, Daniel P. Berrangé wrote:
> > On Thu, Apr 29, 2021 at 04:00:38PM +0100, Richard W.M. Jones wrote:
> > > On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> > > > On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > > > > libvirt originally, and now libnbd, keep a per-thread error message
> > > > > (stored in thread-local storage).  It's a lot nicer than having to
> > > > > pass &errmsg to every function.  You can just write:
> > > > > 
> > > > >  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
> > > > >    fprintf (stderr,
> > > > >             "failed to connect to remote server: %s (errno = %d)\n",
> > > > >             nbd_get_error (), nbd_get_errno ());
> > > > >    exit (EXIT_FAILURE);
> > > > >  }
> > > > > 
> > > > > (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> > > > > 
> > > > > It means you can extend the range of error information available in
> > > > > future.  Also you can return a 'const char *' and the application
> > > > > doesn't have to worry about lifetimes, at least in the common case.
> > > > 
> > > > Thanks for sharing the idea, I think it would work well for libblkio
> > > > too.
> > > > 
> > > > Do you ignore the dlclose(3) memory leak?
> > > 
> > > I believe this mechanism in libnbd ensures there is no leak in the
> > > ordinary shared library (not dlopen/dlclose) case:
> > > 
> > > https://gitlab.com/nbdkit/libnbd/-/blob/f9257a9fdc68706a4079deb4ced61e1d468f28d6/lib/errors.c#L35
> > > 
> > > However I'm not sure what happens in the dlopen case, so I'm going to
> > > test that out now ...
> > 
> > pthread_key destructors are a disaster waiting to happen with
> > dlclose, if the dlclose happens while non-main threads are
> > still running. When those threads exit, they'll run the
> > destructor which points to a function that is no longer
> > resident in memory.
> 
> While libnbd had a memory leak there, now fixed by:
> 
> https://gitlab.com/nbdkit/libnbd/-/commit/026d281c57dd95485cc9bf829918b5efd9e32ddb

So on dlclose(), the errors_free() method will be run by the linker.

The pthread_key_delete method doesn't run the cleanup callbacks, so
your fix here frees the thread local error string in the thread that
called dlclose explicitly.

If any other thread had an error string associated with the thread
local, that will still be leaked.  I don't see any attractive way
to avoid that leak.

IOW, I think the fix is incomplete, but its better than nothing,
and I don't have suggestion to improve it.

> I don't actually think we have the bug you're describing.  We have a
> destructor (errors_free()) which runs on dlclose, which deletes the
> thread-local storage key, so the destructor will not run after the
> library has been unloaded.
> 
> I added a test for this which works fine for me:
> 
> https://gitlab.com/nbdkit/libnbd/-/commit/831d142787aba4f5b638418e02cf7e0f2a051565

Yes, I think your using of ELF destructor function is enough in this
scenario.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




reply via email to

[Prev in Thread] Current Thread [Next in Thread]