Please help with OSR5 port?
From: Kean Johnston
Subject: Please help with OSR5 port?
Date: Sun, 4 Aug 2002 11:55:38 -0700
User-agent: Mutt/1.3.28i
Hello,
I am trying to get libtool to work properly (as defined by me :-)) on
SCO OpenServer. Even with the latest version out of CVS, there are
still some make check failures (mdemo-exec, mdemo-inst, tagdemo-exec etc).
On top of that, for 5.0.7, I have dramatically changed the link editor and
RTLD to be the same as on OpenUNIX 8. However, even that system's libtool
configuration is, I believe, a little wrong. So, what I will do here
is describe the characteristics of ld and the RTLD, the compiler options
used to create shared libraries, and how I'd like to see things end up.
Hopefully then, someone on this list with more experience of the twisty
maze of libtool variables can help me set things up correctly. Note that
although I believe the same strategy should apply to OpenUNIX 8, please
let's focus on OpenServer, as that's what I have an immediate need to get
working. I can fuss with OU8 later.
Please also note that things as described here refer to the as-yet-unreleased
SCO OpenServer 5.0.7. But all of these characteristics will be available on
previous releases, via a supplement. Having said that, none of this
matters a jot to the scheme I propose, which doesn't rely on the new RTLD
or ld characteristics. They are defined here for your reference.
ELF Link Editor (ld) characteristics:
ld -R path
Inserts PATH as a DT_RUNPATH entry in either a shared library or an
ELF executable. PATH can be a colon-separated list of directories.
See RTLD characteristics below to see how DT_RUNPATH is used.
ld -h path
Inserts PATH as an SONAME entry in a shared library. PATH is a single
*file name*, not a directory. It is used to specify the SONAME of the
shared library. See RTLD below for how the SONAME is used.
ld -YP,paths
Sets the colon-separated list of PATHS to be the default library
search path. This defaults to /usr/ccs/lib:/lib:/usr/lib. This is
the last list of directories the link editor searches for shared
or static libraries requested with the -lLIB options. The environment
variable LD_LIBRARY_PATH (see below) and various -L options control
the directories that are searched before the list specified here.
ld -YL,path1
ld -YU,path2
Replaces the first and second entries in the default library search
directory list, respectively. Rarely used (documented here for the
sake of completeness). For example, specifying -YL,/usr/foo would make
the default list of directories searched /usr/foo:/lib:/usr/lib.
Specifying -YU,/usr/bar would make the directory list searched
/usr/ccs/lib:/usr/bar:/usr/lib. Specifying both would (predictably)
set the list to /usr/foo:/usr/bar:/usr/lib. The -YL and -YU options
are mutually exclusive with the -YP option.
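The -YL and -YU substitution rules can be sketched in a few lines of
Python. This is only a model of the behaviour described above, not the
link editor's own code; the function name default_search_path and its
yl/yu arguments are invented for illustration.

```python
# Model of how -YL and -YU rewrite the default library search list.
# Illustrative only: these names are hypothetical, not part of ld.
DEFAULT_DIRS = ["/usr/ccs/lib", "/lib", "/usr/lib"]


def default_search_path(yl=None, yu=None):
    """Return the default search list after applying -YL and/or -YU."""
    dirs = list(DEFAULT_DIRS)
    if yl is not None:
        dirs[0] = yl  # -YL replaces the first entry
    if yu is not None:
        dirs[1] = yu  # -YU replaces the second entry
    return ":".join(dirs)


print(default_search_path(yl="/usr/foo"))
print(default_search_path(yu="/usr/bar"))
print(default_search_path(yl="/usr/foo", yu="/usr/bar"))
```

The three calls correspond to the three cases spelled out in the text.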
env LD_RUN_PATH
When creating an executable, sets the DT_RPATH entry for that executable.
This can be a colon separated list of directory names. See the RTLD
discussion on how DT_RPATH is used.
env LD_LIBRARY_PATH
This environment variable can contain either one colon separated list of
paths, or two colon separated lists of paths, the two lists being separated
by a semi-colon. The two lists are the pre-command and post-command lists,
respectively. When searching for libX.{a,so}, the link editor will search
in pre-command first, then any directories specified with -L flags
(the so-called command-path), then the directories in the post-command
list. If only one list is specified (i.e. there is no semi-colon in the
variable), then only the pre-command list is set.
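As a rough illustration of the pre-command / command-path / post-command
ordering, here is a small Python model. It assumes nothing beyond the
description above; the function name link_search_dirs and the example
directories /opt/pre and /opt/post are made up for the sketch, and the
default -YP list (searched last) is left out.

```python
# Model of the link-time search order implied by LD_LIBRARY_PATH and -L.
# Hypothetical helper for illustration; it is not part of ld.
def link_search_dirs(ld_library_path, dash_l_dirs):
    """Return directories in the order: pre-command, -L dirs, post-command."""
    pre, sep, post = ld_library_path.partition(";")
    pre_dirs = [d for d in pre.split(":") if d]
    # Without a semi-colon only the pre-command list is set.
    post_dirs = [d for d in post.split(":") if d] if sep else []
    return pre_dirs + list(dash_l_dirs) + post_dirs


# Example with invented directories:
print(link_search_dirs("/opt/pre;/opt/post", [".", "../foo"]))
```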
env LD_ROOT
This is a colon separated list of alternate root directories to use
when searching for shared library dependents. For example, suppose
libfoo.so.1 has a NEEDED shared library called /usr/lib/libbar.so.2.
By default, the link editor will only look for that file, exactly as
specified (note that the path is absolute. Relative paths or NEEDED
entries with no path and just a shared library name are treated
differently). By setting LD_ROOT, you can give a list of alternate
root directories to search. The file must still be at the path
/usr/lib/libbar.so.2, relative to the root directory specified. This
is used mainly during compilation of the OS, to point the search at
the build tree's root install directory, so that the just-built
libraries are used from this location rather than the running
system's libraries. But it is useful for other purposes too.
For example, suppose you set LD_ROOT=/u/tmp/tree1:/u/tmp/tree2. A shared
library you were trying to link against has a NEEDED of /usr/lib/libbar.so.
Because the dependent library has / characters in it, the link editor
will only search the root directories specified. It will look for the
files /u/tmp/tree1/usr/lib/libbar.so and /u/tmp/tree2/usr/lib/libbar.so.
If it can't find either of those two files, it aborts with an error.
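The LD_ROOT lookup for an absolute NEEDED path can be modelled as
follows. This is a sketch under the assumptions stated in the text
(absolute paths only; relative or bare names are handled differently and
are not modelled); the function name ld_root_candidates is hypothetical.

```python
# Model of which files the link editor would try for an absolute NEEDED
# entry, with and without LD_ROOT. Hypothetical helper, not part of ld.
def ld_root_candidates(needed, ld_root=""):
    """Return the candidate file names, in search order."""
    if not needed.startswith("/"):
        raise ValueError("only absolute NEEDED paths are modelled here")
    roots = [r for r in ld_root.split(":") if r]
    if not roots:
        return [needed]  # no LD_ROOT: look for the file exactly as named
    # With LD_ROOT set, only the alternate roots are searched.
    return [r.rstrip("/") + needed for r in roots]


print(ld_root_candidates("/usr/lib/libbar.so", "/u/tmp/tree1:/u/tmp/tree2"))
```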
How libraries are resolved
To illustrate how ld resolves libraries, consider the following ld
command line (it is intentionally abbreviated: there would normally
be references to other objects, such as the C RunTime (crt) startup
files and the like):
ld -dy -o a.out file1.o -llib1 -L. -llib2 -L../foo -llib3 -lc
For the purposes of this discussion, let's say that lib2 is really
liblib2.so, a shared library, and it depends on libbar.so.1. lib3
is also a shared library, with no dependents, and an SONAME of
/usr/foo/liblib3.so.2.
For dependent names without a / character in them, the link editor looks
first in the path specified by LD_RUN_PATH. Next, it searches the
pre-command list, then the list of directories specified with -L, then
the post-command list, then in any directory contained in a DT_RUNPATH
entry of the *CURRENT INPUT FILE*.
So, with this understanding, here is how ld would try and find everything.
First, it would look for liblib1.a and liblib1.so in the normal places,
in this case /usr/ccs/lib, /lib and /usr/lib. If it didn't find it, it
would abort right now. Next, it tries to find liblib2.a or liblib2.so.
It will look first in the current directory, because of the -L. If it
was not found here, it would search the standard list of paths. Let's
assume it found it in the current directory. In processing liblib2.so,
it sees that it has a NEEDED value of libbar.so.1. So it needs to make
sure it can find that file. It will look in the directory specified
by LD_RUN_PATH, if set, first. Next, it will look in the current
directory, and if it wasn't found there, it would look in the standard
locations. Suppose, however, that liblib2.so had been compiled with
the flags -R /u/tmp/myshlib:/usr/local/lib. This would mean that
liblib2.so would have a DT_RUNPATH entry set in it, with those two
paths. In this case, the link editor would look first in /u/tmp/myshlib,
then /usr/local/lib, then the current directory, then the standard
locations. If the file could not be found, the link editor would abort.
Next, ld would look for liblib3.a or liblib3.so. For this library, it
would search first in the current directory (the first -L.), then in
the directory ../foo, because of the second -L option, and then in
the standard list of locations. If it was not found here, the editor
would abort with an error.
Last, it looks for a library called libc.a or libc.so, in the same
set of locations as liblib3.so above. Most likely, it will find
the one in /usr/lib.
Assuming all of the libraries were found, the executable is created,
and it has a list of NEEDED entries placed in it. In this case, the
NEEDED list comes from the SONAME entries in all of the shared
libraries we linked against. If an explicit SONAME was not set with
-h when the shared library was created, then the name of the file
is inserted into the executable as its DT_NEEDED entry.
Run-time link-editor (RTLD) Characteristics:
The new RTLD is one of the bigger changes in release 5.0.7. This is a
completely new RTLD, and is based on the current OpenUNIX 8 RTLD. This
section describes the interaction between link editor options and the
RTLD. To understand how this all works, it helps to know the ELF entries
in an executable and shared library that the RTLD looks at. Here is a
description of each of those entries. After these descriptions, I have
listed a few examples to show the semantics in operation.
DT_RUNPATH - set the library search path. For executables, this sets the
list of paths to search for DT_NEEDED libraries that do not have absolute
paths. For a shared library, sets the list of paths to search for its
DT_NEEDED dependent libraries. This is set by the ld option -R. It can
be a colon separated list of paths. It is important to note that
DT_RUNPATH entries are used to resolve the current object's dependencies
only. Thus, if the RTLD was loading libfoo.so.1, then any DT_RUNPATH
entries in that library would only be used to find any DT_NEEDED entries
listed for that library.
DT_NEEDED - specifies the name of a dependent shared library. This same
entry is used the same way in both executables and shared libraries. It
is retrieved from the DT_SONAME entry for a shared library at link
edit time. If a shared library does not have an explicit DT_SONAME entry,
one is created by the link editor, based on the input file name.
DT_SONAME - the name of a shared library. This is not necessarily the same
name as you would link against. Frequently, this name has a shared library
version, for example, libfoo.so.1. But the library you link against is
simply libfoo.so. The latter is usually a symbolic link to the former. If
this name is an absolute path, then the shared library is always searched
for at the specified location. If it is just a name, with no path, it is
searched for in several locations, as described below.
DT_RPATH - sets the library search path. Similar in intent to DT_RUNPATH
but set by the value of LD_RUN_PATH at link edit time. Only one of either
DT_RPATH or DT_RUNPATH is set by the link editor. It applies only to
executables (shared libraries only use DT_RUNPATH), and if set, will
render any -R options useless. There is almost no need for this option to
ever be used. Use -R instead. However, if you managed to use some other
link editor (good luck) to create the binary, and it has both DT_RUNPATH
and DT_RPATH set, the RTLD will use the DT_RUNPATH entry and ignore the
DT_RPATH one.
There are also a few environment variables that affect the way in which
the RTLD operates. Briefly, they are:
LD_DEBUG - a bit pattern of things to debug. LD_DEBUG=16 gives you
useful information about paths being used to resolve shared libraries.
Setting it to 47 gets you more information than you can shake a stick at :)
LD_INSERT - a list of library names to load before certain specified
libraries. This is a semi-colon separated list of insert:before pairs.
The RTLD will insert the library INSERT before it loads the library
BEFORE. Separate multiple insert:before pairs with a semi-colon.
LD_LIBRARY_PATH - sets the library search path. This can be used at run
time to help resolve the location of shared libraries that don't have
absolute paths. This list is searched before any DT_RPATH or DT_RUNPATH
locations.
LD_BIND_NOW - if set to 1, forces the RTLD to resolve all references now
at load time, rather than doing lazy binding of symbols as and when
they are needed. This is a great way of making sure that all of the
shared libraries your binary needs actually load correctly, and that
you do not have any unresolved dependencies.
How the RTLD finds shared libraries at run-time
When you run an executable, after doing some internal setup for getting
all of the symbols the RTLD itself needs, one of the first things the
RTLD does is check the environment. All of the environment variables
above are checked and stored, as well as a few others that are beyond
the scope of this discussion. Next it examines the executable file
itself, and gathers some key pieces of information from its dynamic
section. The first is DT_RUNPATH. If it finds such an entry in the
dynamic section, it sets a flag indicating it was found and stores
its value. If no DT_RUNPATH entry was found, and a DT_RPATH entry was,
then the value of DT_RPATH is stored.
As the RTLD nears the end of its setup phase, we have three pieces of
information of note: LD_LIBRARY_PATH, DT_RUNPATH and DT_RPATH. For right
now, DT_RUNPATH is not used. The RTLD sets up three path lists now,
called $ld_libpath (based on the LD_LIBRARY_PATH variable), $dt_rpath,
based on the DT_RPATH entry (if any) in the executable, and $def_path,
which is the system default location, /usr/lib. It then calls the guts
of the RTLD (the actual function _rtld, which loads the executable and
also deals with dlopen calls at runtime). This function's first job is
to resolve any DT_NEEDED entries and load them. After each DT_NEEDED
entry is loaded, the RTLD scans that entry for any of its own DT_NEEDED
entries, and loads them, until such time as all dependent libraries have
been loaded. This results in a breadth-first loading of dependencies.
But it is how DT_NEEDED is processed that we are interested in for the
purposes of libtool.
There is some magic that is performed for each DT_NEEDED entry, if
LD_INSERT is set. This will instruct the RTLD to load in a replacement
library before loading in the actual desired DT_NEEDED library. Note
that currently, however, this functionality is disabled for security
purposes, but it can be enabled.
For each DT_NEEDED entry, the RTLD first does what is known as ORIGIN
checking. ORIGIN is rather cool, but I won't go into it right here. If
you care, read at the very end of this mail for how $ORIGIN is used.
For the most part, $ORIGIN is not used, so this is what the RTLD does.
If we have already loaded this object, we simply return from the RTLD.
If the DT_NEEDED contains any / characters, then the RTLD tries to
load the specified entry exactly as shown. This even applies to relative
path names. Thus, if DT_NEEDED was mydir/libmylib.so.1, then that is
exactly the file name that the RTLD will search for. No paths are applied.
As you can see, relative names with paths are not very useful. But
absolute paths are extremely useful. If DT_NEEDED was /usr/lib/libbar.so,
then that is always the only name the RTLD will search for.
However, if DT_NEEDED is just a shared library name, then we have to
search for the dependent. In this case, the following algorithm is
used. If DT_RUNPATH was not specified, and $dt_rpath is not null (i.e.,
LD_RUN_PATH was specified for the executable), then search in $dt_rpath.
If it was not found there, and $ld_libpath is set (LD_LIBRARY_PATH), then
search in that list of directories. If there is a DT_RUNPATH entry set
in the current object's *PARENT*, search that list of directories. If
still not found, search the default location. If after all this it still
can't be found, abort with an error.
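The run-time search just described can be written down as a small Python
model. It follows the paragraphs above literally and is a sketch, not the
RTLD's implementation: the function name rtld_candidates, its parameters,
and the example directories /opt/rp and /opt/env are all invented here,
and $ORIGIN, LD_INSERT and the already-loaded check are left out.

```python
# Model of the order in which the RTLD tries file names for one DT_NEEDED
# entry, following the prose description. Hypothetical helper only.
def rtld_candidates(needed, dt_rpath=None, ld_libpath=None,
                    parent_runpath=None, exe_has_runpath=False,
                    def_path="/usr/lib"):
    """Return candidate file names in the order they would be tried."""
    if "/" in needed:
        # Any slash (absolute or relative): use the name exactly as given.
        return [needed]
    dirs = []
    if not exe_has_runpath and dt_rpath:
        dirs += dt_rpath.split(":")        # $dt_rpath (from LD_RUN_PATH)
    if ld_libpath:
        dirs += ld_libpath.split(":")      # $ld_libpath (LD_LIBRARY_PATH)
    if parent_runpath:
        dirs += parent_runpath.split(":")  # DT_RUNPATH of the parent object
    dirs += def_path.split(":")            # $def_path, the system default
    return [d.rstrip("/") + "/" + needed for d in dirs if d]


for name in rtld_candidates("libbar.so.1",
                            dt_rpath="/opt/rp",
                            ld_libpath="/opt/env",
                            parent_runpath="/u/tmp/myshlib:/usr/local/lib"):
    print(name)
```

Passing exe_has_runpath=True drops the $dt_rpath step, matching the rule
that DT_RPATH is ignored when the executable carries a DT_RUNPATH.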
This shows an interesting thing. The DT_RUNPATH (ld -R) is set to control
where any dependent libraries are found. It is not used by the link editor
itself to help locate libraries.
So, there you have it, that is how our RTLD and Link Editor work. Now, given
all of that, here is how I would like to see things work. I believe that
simplest is best, and I bias the simplicity towards the user, not the
developer. With that in mind, I submit that we completely ignore things
like LD_LIBRARY_PATH, LD_RUN_PATH and the ld -R flag. They are all red
herrings that make you think you have things under control. You don't.
I believe the best way of dealing with shared libraries is to use fully
qualified DT_SONAME entries. That way, there is never any ambiguity. It
also has considerable security implications.
By not using fully qualified directory entries, on some OSes (and older
versions of OpenServer were thus afflicted), it is easy to use the
LD_LIBRARY_PATH variable to subvert a shared library in a setuid executable.
All it takes is ONE such executable on the system, and the user can get
elevated privileges.
All a hacker needs to do is create their own libBAD.so file, have one of
the functions they know are called in it copy /bin/ksh and make it setuid,
then set LD_LIBRARY_PATH to make sure their library gets loaded first,
and voila, instant root. If other OSes don't prevent LD_LIBRARY_PATH from
being used by setuid binaries, they too are vulnerable to this kind of
thing. By using absolute DT_SONAME's, you avoid this altogether. I also
suspect, although I can't swear to it, that it may eliminate the need
for running ldconfig on Linux and other systems. It has been years since
I looked at the Linux RTLD and ldconfig, so I don't know if it means "this
is a directory that is ALLOWED to contain shared libraries" or if it means
"search this directory if I am not given an absolute path". If the latter,
then specifying absolute paths eliminates the need for this step.
So, the easiest way to do this is to have the hard-coded file name be
inserted at link time. But we need the library to work in the development
tree as well, so we will need to relink when we install. Again, I suspect
that other OSes have RTLD's and link editors that behave very similarly,
if not identically, so they would need to be changed as well. Something to
consider. This then eliminates the need for touching the executable in
any way, all we ever need to do is put the full path name in the shared
libraries.
If we use this scheme, there is also no need for wrapper scripts around
executables created by the build, as there is nothing to set in the
environment. The RTLD will always be able to find all shared libraries.
It will also be able to find all dependencies (for our RTLD at any rate).
Again, I suspect that other systems would benefit from this. When we do
a libtool --mode=install, we relink the shared libraries with the destination
hard paths, and then relink the executable. I would suggest using different
names (I think this is done already), perhaps prefixing the run-in-source-tree
libraries with lt-.
So, given that's what I want, what is the right combination of Swahili voodoo
incantations and belly rubbing I need to do to get it to behave this way
on OSR5? I played around with it some, but couldn't come up with the right
set of variables to do what I want. I am also pressed for time, so wading
through all that shell code was just too time consuming. I really hope
that someone can help me, and fairly quickly. If I have left out any
pertinent information, please let me know. Oh, just one more thing, this
scheme would also work on all previous releases of OpenServer, thus eliminating
any dependency on the new RTLD characteristics, such as DT_RUNPATH.
I thank you all for your time and effort. Please send any replies to
address@hidden (hopefully I got mutt to set reply-to correctly). I am
sending this from my home account which I read less frequently.
\Kean/
PS. I said above I would discuss $ORIGIN. Here is how it works. If the
DT_SONAME entry contains the string $ORIGIN or ${ORIGIN}, what follows
is assumed to be a relative path to a shared library name. This is
relative to the location of the parent object (be it executable or
shared library, it doesn't matter). For example, if an executable
is installed in /usr/bin, you could have a $ORIGIN../lib to make the
RTLD look in /usr/bin/../lib for the shared library. If it is installed
in say, /usr/libexec/myprog, and the shared library is installed in
/usr/libexec/myprog/modules, you could use ${ORIGIN}modules. Neat eh?
This is very rarely used in practice, I suspect because few people know
about it.
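The two $ORIGIN examples can be reproduced with a short Python sketch.
It assumes, as those examples imply, that $ORIGIN expands to the parent
object's directory with a trailing slash; the function name expand_origin
and the file names prog, myprog, libfoo.so.1 and libmod.so are invented
for illustration and are not taken from the RTLD.

```python
# Model of $ORIGIN / ${ORIGIN} expansion relative to the parent object.
# Assumption: the expansion ends in "/", as the examples in the text imply.
import posixpath


def expand_origin(entry, parent_path):
    """Replace $ORIGIN or ${ORIGIN} with the parent's directory plus '/'."""
    origin = posixpath.dirname(parent_path)
    if not origin.endswith("/"):
        origin += "/"
    return entry.replace("${ORIGIN}", origin).replace("$ORIGIN", origin)


print(expand_origin("$ORIGIN../lib/libfoo.so.1", "/usr/bin/prog"))
print(expand_origin("${ORIGIN}modules/libmod.so",
                    "/usr/libexec/myprog/myprog"))
```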