Re: Using systemd-249's libnss_systemd.so.2 triggers a crash in bash-5.1

From: Dominique Martinet
Subject: Re: Using systemd-249's libnss_systemd.so.2 triggers a crash in bash-5.1's malloc.c
Date: Tue, 5 Oct 2021 10:44:00 +0900

Chet Ramey wrote on Mon, Oct 04, 2021 at 09:23:11PM -0400:
> >   - I could reproduce the same as Julien, with -DDISABLE_MALLOC_WRAPPERS
> > the crash still happens when bash is run directly but nothing complains
> > in valgrind.
> I assume you mean using systemd. Has anyone tried running a bash linked to
> the systemd library that provides the getpw functions, but not as a systemd
> unit? You could then run it in a debugger if it crashes, for instance.

I'm running busybox sh in a unit (which starts properly), then
interactively test things from there.

Running in gdb fails the same way as running normally, so I've also
been looking at that a bit, but nothing obvious popped up.
I'd like to trace back which allocation corresponds to the failing one,
and break from there next time.

If you have a nixos system, not necessarily on master, this should be
enough to reproduce:
$ git clone -b master git://github.com/NixOS/nixpkgs.git
$ cd nixpkgs
$ nix-shell -A systemd
^ this will fetch systemd 249.4
  $ echo $out
  $ exit
$ nix-shell -A bash
^ likewise with bash 5.1-p8, feel free to prepare it otherwise
  $ echo $out
  $ exit
$ su
# systemd-run --pipe -p DynamicUser=1 -p BindPaths=/etc -p BindPaths=/nix -p BindPaths=/run -p MountAPIVFS=1 -p RootDirectory=/tmp $(readlink -e $(which
[I have no name!@odin:/]$ /nix/store/qfb4j7w2fjjq953nd9xncz5mymj5n0kb-bash-5.1-p8/bin/bash --norc -c 'echo ~'
malloc: unknown:0: assertion botched
free: start and end chunk sizes differ

This would probably work with a non-nixos system, installing only the
nix command, but I didn't try.

> > This could mean that systemd is overflowing bash malloc safeguards as
> > you pointed out (I just don't understand why it wouldn't overflow with
> > internal malloc), but it could also mean that the memory has been
> > allocated somewhere else (e.g. libc's malloc) and freed by bash malloc.
> I have a tough time with that one. If the bash free/realloc get memory that
> the bash malloc hasn't allocated, you're going to fail several sanity tests
> before you get to the point of checking for overflow.

I was writing that with the assumption that bash would fail the check by
reading some uninitialized memory after the allocated buffer, but you're
right that other sanity checks would fail first, and valgrind would
have spotted that anyway. It definitely wasn't a good idea.
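
To make the "start and end chunk sizes differ" message concrete, here is a
minimal sketch of the general technique: record the requested size both
before and after the user buffer, and compare the two copies on free. This
is NOT bash's malloc.c; the names guard_head, guarded_alloc and guarded_free
are hypothetical, and the other sanity checks mentioned above are not
modelled at all.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is an illustration, not bash's code. */

/* Header stored immediately before the user buffer. */
struct guard_head {
    size_t size;                /* requested size, recorded at allocation */
};

/* Allocate n bytes, recording n both before and after the user buffer.
 * The returned pointer is only guaranteed to be aligned for size_t. */
void *guarded_alloc(size_t n)
{
    unsigned char *raw;
    struct guard_head head;

    if (n > SIZE_MAX - sizeof head - sizeof n)
        return NULL;            /* total size would overflow size_t */

    raw = malloc(sizeof head + n + sizeof n);
    if (raw == NULL)
        return NULL;

    head.size = n;
    memcpy(raw, &head, sizeof head);              /* start-of-chunk size */
    memcpy(raw + sizeof head + n, &n, sizeof n);  /* end-of-chunk size */
    return raw + sizeof head;
}

/* Release a chunk obtained from guarded_alloc().
 * Returns 0 on success, or -1 (without freeing) if the two recorded
 * sizes differ, i.e. something wrote past the end of the buffer. */
int guarded_free(void *p)
{
    unsigned char *raw;
    struct guard_head head;
    size_t tail;

    if (p == NULL)
        return 0;

    raw = (unsigned char *)p - sizeof head;
    memcpy(&head, raw, sizeof head);
    memcpy(&tail, raw + sizeof head + head.size, sizeof tail);

    if (head.size != tail)
        return -1;              /* start and end sizes differ */

    free(raw);
    return 0;
}
```

In this sketch, a caller that writes even one byte past the requested size
clobbers the trailing copy, so guarded_free() reports the mismatch instead
of releasing the chunk.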

> > nss systemd has started using reallocarray() since v247 and that is not
> > tracked by bash, I would think that's a good candidate?
> I can't see how? reallocarray() is not a memory allocation primitive. It's
> going to call malloc/realloc to do its work (it's essentially just a call
> to realloc(mem, nmemb * size)). Those will eventually call the bash
> malloc/realloc/free.

Hm, that won't necessarily work with LTO though.
If they call reallocarray, which is in libc, LTO means reallocarray can
call libc's realloc without going through symbol interposition.
(That's a discussion that came up on the fedora-devel mailing list when
talking about LTO: with it, LD_PRELOAD can no longer override calls that
are internal to a library.)
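
For reference, a rough sketch of what a reallocarray()-style wrapper does.
This is not glibc's source; sketch_reallocarray is a hypothetical name. The
point is the final call: it binds to whichever realloc the linker resolves
for this translation unit. In a normal dynamic build that can be an
interposed realloc such as bash's, whereas a call resolved inside the
library at link time would bypass the interposed one.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for reallocarray(); not glibc's implementation. */
void *sketch_reallocarray(void *ptr, size_t nmemb, size_t size)
{
    /* Refuse the request if nmemb * size would not fit in size_t. */
    if (size != 0 && nmemb > SIZE_MAX / size) {
        errno = ENOMEM;
        return NULL;            /* ptr is left untouched, as with realloc */
    }

    /* Everything else is delegated to realloc(). Which realloc that is
     * depends on how this call was resolved at link or load time. */
    return realloc(ptr, nmemb * size);
}
```

The overflow check is the only thing the wrapper adds on top of realloc(),
so the allocator that ends up owning the memory is decided entirely by how
that inner call is bound.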

I assume nixos does not compile glibc with LTO, so in practice you are
correct here - it should fall back to bash's realloc.
(that explains my question about earlier versions of nss systemd

