bug-libtool

Please help with OSR5 port?


From: Kean Johnston
Subject: Please help with OSR5 port?
Date: Sun, 4 Aug 2002 11:55:38 -0700
User-agent: Mutt/1.3.28i

Hello,

I am trying to get libtool to work properly (as defined by me :-)) on
SCO OpenServer. Even with the latest version out of CVS, there are
still some make check failures (mdemo-exec, mdemo-inst, tagdemo-exec etc).
On top of that, for 5.0.7, I have dramatically changed the link editor and
RTLD to be the same as on OpenUNIX 8. However, even that system's libtool
configuration is, I believe, a little wrong. So, what I will do here
is describe the characteristics of ld and the RTLD, the compiler options
used to create shared libraries, and how I'd like to see things end up.
Hopefully then, someone on this list with more experience of the twisty
maze of libtool variables can help me set things up correctly. Note that
although I believe the same strategy should apply to OpenUNIX 8, please
let's focus on OpenServer, as that's what I have an immediate need to get
working. I can fuss with OU8 later.

Please also note that things as described here refer to the as-yet-unreleased
SCO OpenServer 5.0.7. But all of these characteristics will be available on
previous releases, via a supplement. Having said that, none of this matters
for the scheme I propose, since it doesn't rely on the new RTLD or ld
characteristics. But they are defined here for your reference.

ELF Link Editor (ld) characteristics:
  ld -R path
     Inserts PATH as a DT_RUNPATH entry in either a shared library or an
     ELF executable. PATH can be a colon-separated list of directories.
     See RTLD characteristics below to see how DT_RUNPATH is used.

  ld -h path
     Inserts PATH as an SONAME entry in a shared library. PATH is a single
     *file name*, not a directory. It is used to specify the SONAME of the
     shared library. See RTLD below for how the SONAME is used.

  ld -YP,paths
     Sets the colon-separated list of PATHS to be the default library
     search path. This defaults to /usr/ccs/lib:/lib:/usr/lib. This is
     the last list of directories the link editor searches for shared
     or static libraries asked for with the -lLIB options. The environment
     variable LD_LIBRARY_PATH (see below) and various -L options control
     the directories that are searched before the list specified here.

  ld -YL,path1
  ld -YU,path2
     Replaces the first and second entries in the default library search
     directory list, respectively. Rarely used (documented here for the
     sake of completeness). For example, specifying -YL,/usr/foo would make
     the default list of directories searched /usr/foo:/lib:/usr/lib.
     Specifying -YU,/usr/bar would make the directory list searched
     /usr/ccs/lib:/usr/bar:/usr/lib. Specifying both would (predictably)
     set the list to /usr/foo:/usr/bar:/usr/lib. The -YL and -YU options
     are mutually exclusive with the -YP option.
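
     A minimal Python sketch of how the -YP, -YL and -YU options described
     above combine into the default search list. The function name and its
     keyword arguments are hypothetical; this only models the rules stated
     here and is not taken from ld itself.

```python
# Default library search list, as stated in the text above.
DEFAULT_DIRS = ["/usr/ccs/lib", "/lib", "/usr/lib"]


def default_search_list(yp=None, yl=None, yu=None):
    """Model the effect of -YP, -YL and -YU on the default search list.

    yp: value of -YP (colon-separated list), or None
    yl: value of -YL (replaces the first default entry), or None
    yu: value of -YU (replaces the second default entry), or None
    """
    if yp is not None and (yl is not None or yu is not None):
        raise ValueError("-YP is mutually exclusive with -YL and -YU")
    if yp is not None:
        return yp.split(":")
    dirs = list(DEFAULT_DIRS)
    if yl is not None:
        dirs[0] = yl
    if yu is not None:
        dirs[1] = yu
    return dirs


if __name__ == "__main__":
    print(":".join(default_search_list(yl="/usr/foo")))
    print(":".join(default_search_list(yu="/usr/bar")))
    print(":".join(default_search_list(yl="/usr/foo", yu="/usr/bar")))
```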

  env LD_RUN_PATH
     When creating an executable, sets the DT_RPATH entry for that executable.
     This can be a colon separated list of directory names. See the RTLD
     discussion on how DT_RPATH is used.

  env LD_LIBRARY_PATH
     This environment variable can contain either one colon separated list of
     paths, or two colon separated lists of paths, the two lists being separated
     by a semi-colon. The two lists are the pre-command and post-command lists,
     respectively. When searching for libX.{a,so}, the link editor will search
     in pre-command first, then any directories specified with -L flags
     (the so-called command-path), then the directories in the post-command
     list. If only one list is specified (i.e. there is no semi-colon in the
     variable), then only the pre-command list is set.
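
     The pre-command/post-command split can be sketched in Python as
     follows. The function name is hypothetical, and dropping empty path
     components is an assumption of this sketch, not documented ld
     behaviour.

```python
def split_ld_library_path(value):
    """Split an LD_LIBRARY_PATH value into (pre_command, post_command) lists.

    The two lists are separated by a semi-colon; each list is
    colon-separated. Without a semi-colon only the pre-command list is set.
    Empty components are dropped (an assumption of this sketch).
    """
    pre, sep, post = value.partition(";")
    pre_list = [d for d in pre.split(":") if d]
    post_list = [d for d in post.split(":") if d] if sep else []
    return pre_list, post_list


if __name__ == "__main__":
    print(split_ld_library_path("/opt/a:/opt/b;/opt/c"))
    print(split_ld_library_path("/opt/a:/opt/b"))
```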

  env LD_ROOT
     This is a colon separated list of alternate root directories to use
     when searching for shared library dependents. For example, suppose
     libfoo.so.1 has a NEEDED shared library called /usr/lib/libbar.so.2.
     By default, the link editor will only look for that file, exactly as
     specified (note that the path is absolute. Relative paths or NEEDED
     entries with no path and just a shared library name are treated
     differently). By setting LD_ROOT, you can give a list of alternate
     root directories to search. The file must still be at the path
     /usr/lib/libbar.so.2, relative to the root directory specified. This
     is used mainly during compilation of the OS, to point the search at
     the build tree's root install directory, so that the just-built
     libraries are used from this location rather than the running
     system's libraries. But it is useful for other purposes too.

     For example, suppose you set LD_ROOT=/u/tmp/tree1:/u/tmp/tree2. A shared
     library you were trying to link against has a NEEDED of /usr/lib/libbar.so.
     Because the dependent library has / characters in it, the link editor
     will only search the root directories specified. It will look for the
     files /u/tmp/tree1/usr/lib/libbar.so and /u/tmp/tree2/usr/lib/libbar.so.
     If it can't find either of those two files, it aborts with an error.
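
     As a rough Python illustration of the LD_ROOT example above (the
     function name is hypothetical, and only absolute NEEDED paths are
     modelled, since other names are treated differently):

```python
def ld_root_candidates(needed, ld_root):
    """Return the files the link editor would look for, in order.

    needed:  an absolute NEEDED path such as /usr/lib/libbar.so
    ld_root: the LD_ROOT value (colon-separated roots), or "" if unset
    """
    if not needed.startswith("/"):
        raise ValueError("this sketch only models absolute NEEDED paths")
    roots = [r for r in ld_root.split(":") if r]
    if not roots:
        # Without LD_ROOT the file is looked for exactly as specified.
        return [needed]
    return [root.rstrip("/") + needed for root in roots]


if __name__ == "__main__":
    for path in ld_root_candidates("/usr/lib/libbar.so",
                                   "/u/tmp/tree1:/u/tmp/tree2"):
        print(path)
```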

  How libraries are resolved
     To illustrate how ld resolves libraries, consider the following ld
     command line (it is intentionally abbreviated: there would normally
     be references to other objects, such as the C RunTime (crt) startup
     files and the like):

        ld -dy -o a.out file1.o -llib1 -L. -llib2 -L../foo -llib3 -lc

     For the purposes of this discussion, let's say that lib2 is really
     liblib2.so, a shared library, and it depends on libbar.so.1. lib3
     is also a shared library, with no dependents, and an SONAME of
     /usr/foo/liblib3.so.2.

     For dependent names without a / character in them, the link editor looks
     first in the path specified by LD_RUN_PATH. Next, it searches the
     pre-command list, then the list of directories specified with -L, then
     the post-command list, then in any directory contained in a DT_RUNPATH
     entry of the *CURRENT INPUT FILE*.

     So, with this understanding, here is how ld would try and find everything.
     First, it would look for liblib1.a and liblib1.so in the normal places,
     in this case /usr/ccs/lib, /lib and /usr/lib. If it didn't find it, it
     would abort right now. Next, it tries to find liblib2.a or liblib2.so.
     It will look first in the current directory, because of the -L. If it
     was not found here, it would search the standard list of paths. Let's
     assume it found it in the current directory. In processing liblib2.so,
     it sees that it has a NEEDED value of libbar.so.1. So it needs to make
     sure it can find that file. It will look in the directory specified
     by LD_RUN_PATH, if set, first. Next, it will look in the current
     directory, and if it wasn't found there, it would look in the standard
     locations. Suppose, however, that liblib2.so had been compiled with
     the flags -R /u/tmp/myshlib:/usr/local/lib. This would mean that
     liblib2.so would have a DT_RUNPATH entry set in it, with those two
     paths. In this case, the link editor would look first in /u/tmp/myshlib,
     then /usr/local/lib, then the current directory, then the standard
     locations. If the file could not be found, the link editor would abort.

     Next, ld would look for liblib3.a or liblib3.so. For this library, it
     would search first in the current directory (the first -L.), then in
     the directory ../foo, because of the second -L option, and then in
     the standard list of locations. If it was not found here, the editor
     would abort with an error.

     Last, it looks for a library called libc.a or libc.so, in the same
     set of locations as liblib3.so above. Most likely, it will find
     the one in /usr/lib.

     Assuming all of the libraries were found, the executable is created,
     and it has a list of NEEDED entries placed in it. In this case, the
     NEEDED list comes from the SONAME entries in all of the shared
     libraries we linked against. If an explicit SONAME was not set with
     -h when the shared library was created, then the name of the file
     is inserted into the executable as its DT_NEEDED entry.
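
     The walk-through above implies that a -L option only affects the -l
     options that follow it on the command line. A minimal Python sketch of
     that ordering, assuming the pre-command, -L, post-command, default
     order described earlier (the function name is hypothetical, and the
     search for dependents of shared libraries is not modelled):

```python
def search_dirs_per_library(args, ld_library_path="",
                            default_dirs=("/usr/ccs/lib", "/lib", "/usr/lib")):
    """Map each -lNAME argument to the ordered directories searched for it.

    Only -L options seen *before* a given -l option apply to it.
    """
    pre, _, post = ld_library_path.partition(";")
    pre_dirs = [d for d in pre.split(":") if d]
    post_dirs = [d for d in post.split(":") if d]

    seen_l_dirs = []   # -L directories encountered so far
    result = {}
    for arg in args:
        if arg.startswith("-L") and len(arg) > 2:
            seen_l_dirs.append(arg[2:])
        elif arg.startswith("-l") and len(arg) > 2:
            result["lib" + arg[2:]] = (
                pre_dirs + list(seen_l_dirs) + post_dirs + list(default_dirs)
            )
    return result


if __name__ == "__main__":
    cmd = "ld -dy -o a.out file1.o -llib1 -L. -llib2 -L../foo -llib3 -lc"
    for lib, dirs in search_dirs_per_library(cmd.split()[1:]).items():
        print(lib, "->", ":".join(dirs))
```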

Run-time link-editor (RTLD) Characteristics:
  The new RTLD is one of the bigger changes in release 5.0.7. This is a
  completely new RTLD, and is based on the current OpenUNIX 8 RTLD. This
  section describes the interaction between link editor options and the
  RTLD. To understand how this all works, it helps to know the ELF entries
  in an executable and shared library that the RTLD looks at. Here is a
  description of each of those entries. After these descriptions, I have
  listed a few examples to show the semantics in operation.

  DT_RUNPATH - set the library search path. For executables, this sets the
     list of paths to search for DT_NEEDED libraries that do not have absolute
     paths. For a shared library, sets the list of paths to search for its
     DT_NEEDED dependent libraries. This is set by the ld option -R. It can
     be a colon separated list of paths. It is important to note that
     DT_RUNPATH entries are used to resolve the current object's dependencies
     only. Thus, if the RTLD was loading libfoo.so.1, then any DT_RUNPATH
     entries in that library would only be used to find any DT_NEEDED entries
     listed for that library.

  DT_NEEDED - specifies the name of a dependent shared library. This same
     entry is used in the same way in both executables and shared libraries. It
     is retrieved from the DT_SONAME entry for a shared library at link
     edit time. If a shared library does not have an explicit DT_SONAME entry,
     one is created by the link editor, based on the input file name.

  DT_SONAME - the name of a shared library. This is not necessarily the same
     name as you would link against. Frequently, this name has a shared library
     version, for example, libfoo.so.1. But the library you link against is
     simply libfoo.so. The latter is usually a symbolic link to the former. If
     this name is an absolute path, then the shared library is always searched
     for at the specified location. If it is just a name, with no path, it is
     searched for in several locations, as described below.

  DT_RPATH - sets the library search path. Similar in intent to DT_RUNPATH
     but set by the value of LD_RUN_PATH at link edit time. Only one of either
     DT_RPATH or DT_RUNPATH is set by the link editor. It applies only to
     executables (shared libraries only use DT_RUNPATH), and if set, will
     render any -R options useless. There is almost no need for this option to
     ever be used. Use -R instead. However, if you managed to use some other
     link editor (good luck) to create the binary, and it has both DT_RUNPATH
     and DT_RPATH set, the RTLD will use the DT_RUNPATH entry and ignore the
     DT_RPATH one.

  There are also a few environment variables that affect the way in which
  the RTLD operates. Briefly, they are:

  LD_DEBUG - a bit pattern of things to debug. LD_DEBUG=16 gives you
     useful information about paths being used to resolve shared libraries.
     Setting it to 47 gets you more information than you can shake a stick at :)

  LD_INSERT - a list of library names to load before certain specified
     libraries. This is a semi-colon separated list of insert:before pairs.
     The RTLD will insert the library INSERT before it loads the library
     BEFORE. Separate multiple insert:before pairs with a semi-colon.

  LD_LIBRARY_PATH - sets the library search path. This can be used at run
     time to help resolve the location of shared libraries that don't have
     absolute paths. This list is searched before any DT_RPATH or DT_RUNPATH
     locations.

  LD_BIND_NOW - if set to 1, forces the RTLD to resolve all references now
     at load time, rather than doing lazy binding of symbols as and when
     they are needed. This is a great way of making sure that all of the
     shared libraries your binary needs actually load correctly, and that
     you do not have any unresolved dependencies.

  How the RTLD finds shared libraries at run-time
     When you run an executable, after doing some internal setup for getting
     all of the symbols the RTLD itself needs, one of the first things the
     RTLD does is check the environment. All of the environment variables
     above are checked and stored, as well as a few others that are beyond
     the scope of this discussion. Next it examines the executable file
     itself, and gathers some key pieces of information from its dynamic
     section. The first is DT_RUNPATH. If it finds such an entry in the
     dynamic section, it sets a flag indicating it was found and stores
     its value. If no DT_RUNPATH entry was found, and a DT_RPATH entry was,
     then the value of DT_RPATH is stored.

     As the RTLD nears the end of its setup phase, we have three pieces of
     information of note: LD_LIBRARY_PATH, DT_RUNPATH and DT_RPATH. For right
     now, DT_RUNPATH is not used. The RTLD sets up three path lists now,
     called $ld_libpath (based on the LD_LIBRARY_PATH variable), $dt_rpath,
     based on the DT_RPATH entry (if any) in the executable, and $def_path,
     which is the system default location, /usr/lib. It then calls the guts
     of the RTLD (the actual function _rtld, which loads the executable and
     also deals with dlopen calls at runtime). This function's first job is
     to resolve any DT_NEEDED entries and load them. After each DT_NEEDED
     entry is loaded, the RTLD scans that entry for any of its own DT_NEEDED
     entries, and loads them, until such time as all dependent libraries have
     been loaded. This results in a breadth-first loading of dependencies.
     But it is how DT_NEEDED is processed that we are interested in for the
     purposes of libtool.
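
     The breadth-first loading order can be illustrated with a small Python
     sketch. The dependency table and library names below are made up for
     illustration, and the function is a model of the behaviour described
     above, not the RTLD's actual code.

```python
from collections import deque


def load_order(root, needed):
    """Return the order in which dependencies of `root` are loaded.

    needed: dict mapping an object name to its list of DT_NEEDED names.
    Each object's DT_NEEDED entries are loaded before any of their own
    dependencies are examined, and an object is loaded only once.
    """
    order = []
    seen = {root}
    queue = deque([root])
    while queue:
        obj = queue.popleft()
        for dep in needed.get(obj, []):
            if dep not in seen:
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order


if __name__ == "__main__":
    # Hypothetical dependency table.
    deps = {
        "a.out": ["liblib2.so", "liblib3.so.2", "libc.so.1"],
        "liblib2.so": ["libbar.so.1", "libc.so.1"],
    }
    print(load_order("a.out", deps))
```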

     There is some magic that is performed for each DT_NEEDED entry, if
     LD_INSERT is set. This will instruct the RTLD to load in a replacement
     library before loading in the actual desired DT_NEEDED library. Note
     that currently, however, this functionality is disabled for security
     purposes, but it can be enabled.

     For each DT_NEEDED entry, the RTLD first does what is known as ORIGIN
     checking. ORIGIN is rather cool, but I won't go into it right here. If
     you care, read at the very end of this mail for how $ORIGIN is used.
     For the most part, $ORIGIN is not used, so this is what the RTLD does.
     If we have already loaded this object, we simply return from the RTLD.
     If the DT_NEEDED contains any / characters, then the RTLD tries to
     load the specified entry exactly as shown. This even applies to relative
     path names. Thus, if DT_NEEDED was mydir/libmylib.so.1, then that is
     exactly the file name that the RTLD will search for. No paths are applied.
     As you can see, relative names with paths are not very useful. But
     absolute paths are extremely useful. If DT_NEEDED was /usr/lib/libbar.so,
     then that is always the only name the RTLD will search for.

     However, if DT_NEEDED is just a shared library name, then we have to
     search for the dependent. In this case, the following algorithm is
     used. If DT_RUNPATH was not specified, and $dt_rpath is not null (i.e.
     LD_RUN_PATH was specified for the executable), then search in $dt_rpath.
     If it was not found there, and $ld_libpath is set (LD_LIBRARY_PATH), then
     search in that list of directories. If there is a DT_RUNPATH entry set
     in the current object's *PARENT*, search that list of directories. If
     still not found, search the default location. If after all this it still
     can't be found, abort with an error.
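
     A minimal Python sketch of the lookup order in the two paragraphs
     above. The function and parameter names are hypothetical; it only
     lists the candidate file names in the order they would be tried and
     does not touch the file system.

```python
def rtld_candidates(needed, exe_has_runpath, dt_rpath, ld_libpath,
                    parent_runpath, def_path=("/usr/lib",)):
    """List the paths tried for one DT_NEEDED entry, in order.

    needed:          the DT_NEEDED string
    exe_has_runpath: True if the executable has a DT_RUNPATH entry
    dt_rpath:        directories from the executable's DT_RPATH ($dt_rpath)
    ld_libpath:      directories from LD_LIBRARY_PATH ($ld_libpath)
    parent_runpath:  directories from the DT_RUNPATH of the parent object
    def_path:        the system default location ($def_path)
    """
    if "/" in needed:
        # Names containing a slash are used exactly as given.
        return [needed]

    dirs = []
    if not exe_has_runpath and dt_rpath:
        dirs.extend(dt_rpath)
    if ld_libpath:
        dirs.extend(ld_libpath)
    if parent_runpath:
        dirs.extend(parent_runpath)
    dirs.extend(def_path)
    return [d.rstrip("/") + "/" + needed for d in dirs]


if __name__ == "__main__":
    for path in rtld_candidates("libbar.so.1",
                                exe_has_runpath=False,
                                dt_rpath=["/opt/rpath"],
                                ld_libpath=["/opt/env"],
                                parent_runpath=["/u/tmp/myshlib",
                                                "/usr/local/lib"]):
        print(path)
```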

     This shows an interesting thing. The DT_RUNPATH (ld -R) is set to control
     where any dependent libraries are found. It is not used by the link editor
     itself to help locate libraries.

So, there you have it, that is how our RTLD and Link Editor work. Now, given
all of that, here is how I would like to see things work. I believe that
simplest is best, and I bias the simplicity towards the user, not the
developer. With that in mind, I submit that we completely ignore things
like LD_LIBRARY_PATH, LD_RUN_PATH and the ld -R flag. They are all red
herrings that make you think you have things under control. You don't.
I believe the best way of dealing with shared libraries is to use fully
qualified DT_SONAME entries. That way, there is never any ambiguity. It
also has considerable security implications.

By not using fully qualified directory entries, on some OSes (and older
versions of OpenServer were thus afflicted), it is easy to use the
LD_LIBRARY_PATH variable to subvert a shared library in a setuid executable.
All it takes is ONE such executable on the system, and the user can get
elevated privileges.
All a hacker needs to do is create their own libBAD.so file, have one of
the functions they know are called in it copy /bin/ksh and make it setuid,
then set LD_LIBRARY_PATH to make sure their library gets loaded first,
and voila, instant root. If other OSes don't prevent LD_LIBRARY_PATH from
being used by setuid binaries, they too are vulnerable to this kind of
thing. By using absolute DT_SONAME's, you avoid this altogether. I also
suspect, although I can't swear to it, that it may eliminate the need
for running ldconfig on Linux and other systems. It has been years since
I looked at the Linux RTLD and ldconfig, so I don't know if it means "this
is a directory that is ALLOWED to contain shared libraries" or if it means
"search this directory if I am not given an absolute path". If the latter,
then specifying absolute paths eliminates the need for this step.

So, the easiest way to do this is to have the hard-coded file name be
inserted at link time. But we need the library to work in the development
tree as well, so we will need to relink when we install. Again, I suspect
that other OSes have RTLD's and link editors that behave very similarly,
if not identically, so they would need to be changed as well. Something to
consider. This then eliminates the need for touching the executable in
any way, all we ever need to do is put the full path name in the shared
libraries.

If we use this scheme, there is also no need for wrapper scripts around
executables created by the build, as there is nothing to set in the
environment. The RTLD will always be able to find all shared libraries.
It will also be able to find all dependencies (for our RTLD at any rate).
Again, I suspect that other systems would benefit from this. When we do
a libtool --install, we relink the shared libraries with the destination
hard paths, and then relink the executable. I would suggest using different
names (I think this is done already), perhaps prefixing the run-in-source-tree
libraries with lt-.

So, given that's what I want, what is the right combination of Swahili voodoo
incantations and belly rubbing I need to do to get it to behave this way
on OSR5? I played around with it some, but couldn't come up with the right
set of variables to do what I want. I am also pressed for time, so wading
through all that shell code was just too time consuming. I really hope
that someone can help me, and fairly quickly. If I have left out any
pertinent information, please let me know. Oh, just one more thing, this
scheme would also work on all previous releases of OpenServer, thus eliminating
any dependency on the new RTLD characteristics, such as DT_RUNPATH.

I thank you all for your time and effort. Please send any replies to
address@hidden (hopefully I got mutt to set reply-to correctly). I am
sending this from my home account which I read less frequently.

\Kean/

PS. I said above I would discuss $ORIGIN. Here is how it works. If the
DT_SONAME entry contains the string $ORIGIN or ${ORIGIN}, what follows
is assumed to be a relative path to a shared library name. This is
relative to the location of the parent object (be it executable or
shared library, it doesn't matter). For example, if an executable
is installed in /usr/bin, you could have a $ORIGIN../lib to make the
RTLD look in /usr/bin/../lib for the shared library. If it is installed
in say, /usr/libexec/myprog, and the shared library is installed in
/usr/libexec/myprog/modules, you could use ${ORIGIN}modules. Neat eh?
This is very rarely used in practice, I suspect because few people know
about it.
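
The two examples above can be reproduced with a small Python sketch. It
assumes, based only on those examples, that $ORIGIN expands to the parent
object's directory followed by a slash; the function name and the parent
file names are hypothetical.

```python
import posixpath


def expand_origin(entry, parent_path):
    """Expand $ORIGIN or ${ORIGIN} relative to the parent object's location.

    entry:       the string containing $ORIGIN or ${ORIGIN}
    parent_path: absolute path of the executable or shared library that
                 carries the entry
    """
    # Assumption: the expansion is the parent's directory plus a slash,
    # which is what the examples in the text imply.
    origin = posixpath.dirname(parent_path) + "/"
    return entry.replace("${ORIGIN}", origin).replace("$ORIGIN", origin)


if __name__ == "__main__":
    print(expand_origin("$ORIGIN../lib", "/usr/bin/prog"))
    print(expand_origin("${ORIGIN}modules", "/usr/libexec/myprog/prog"))
```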


