automake

Re: Partial linking with _RELOCATABLES - Proposed enhancement


From: Marc Alff
Subject: Re: Partial linking with _RELOCATABLES - Proposed enhancement
Date: Thu, 06 Apr 2006 08:07:53 -0700
User-agent: Mozilla Thunderbird 1.0.7 (X11/20060327)


Hi Ralf, Alexandre, and all,


Ralf Wildenhues wrote:

- C++ code may need to use "ld -Ur" in some contexts (as the last
 partial link);  I haven't grasped all the details about when this is
 necessary.  Search for -Ur in `info ld Options' (several instances).


So it seems that for C++, at least, the user may need to specify when that last
partial link is?  What I don't understand is: why would I have to partially
link a C++ program?  Why can't I just do a few partial links, and then a plain old regular link for the program output?

(I haven't checked whether compilers/linkers other than GCC/GNU binutils
have similar requirements, yet.)

Yes, I am aware of that option too, and still cannot figure out the **real** technical facts behind it.

For what it's worth, I am using C++ (with real classes, inheritance and all, not just extern "C"), compiling, linking several stages with ld -r, and doing a final link without trouble,
and without using "-Ur" at any stage.

Reading the man page, it seems that one can't use this option twice, but nothing says one has to use it at all :)  I can only speculate that the issue has to do with code implicitly generated by the compiler (default constructors / destructors, virtual tables, ...), which by definition the compiler cannot "place" in one object rather than another, so it ends up duplicated in different object files (but then again, a regular build should have the same issue).


MA> As a result, a build chain can be longer :

MA> *.c, *.cpp --> *.o --> * glued.o --> * super-glued.o --> lib*.a, lib*.so
MA> or binaries

ADL> Will your proposal allow the creation of *.a, *.so, and
ADL> binaries out of relocatable objects?  I'm wondering, because the
ADL> *.o should be compiled differently if they are meant to be part
ADL> of a shared library, and the _RELOCATABLES syntax doesn't
ADL> indicate where the object will be used.

If you want to get libtool-for-shared-libraries into play, then the
sensible thing IMHO would be to have libtool objects named .lo which
themselves are created by two partial links: the non-PIC *.o go into
output.o, the PIC ones [._]libs/*.o go into [._]libs/output.o.
Agreed, see:

http://lists.gnu.org/archive/html/automake/2006-03/msg00086.html

Yes, Libtool does not do this ATM.  This is a bug.  I don't mind
breaking compatibility here, as that hasn't worked well for a long time.

Marc, to make this very clear to you: At the moment, this:

 $LIBTOOL --mode=link -o glued.lo input.lo

translates to
 ld -r -o glued.lo .libs/input.o

but that is not consistent, and comes from a very old libtool age when
the PIC objects were named object.lo and not .libs/object.o!  And it
will cause subsequent links using glued.lo to fail.  What should happen
instead is this:
 ld -r -o .libs/glued.o .libs/input.o
 ld -r -o glued.o input.o
 creating libtool objects glued.lo

and then "glued.lo" will be a short script just as input.lo is
(with the obvious skipping of steps when either PIC or non-PIC
objects are disabled).


I just ran into that in my testing.
Thanks for spelling that out so clearly, BTW; it helped me a lot to figure out what was going on.

Maybe Marc would also like to work around nonlinear scaling wrt. the
number of objects in linker (please correct me here if I'm wrong).

The first time I saw "ld -r" used, performance (and the linker crawling to a dead stop on a regular link) was the main concern, so that is how I learned about "ld -r" in the first place.

For my problem today, I am less concerned with link time than with controlling the content, i.e. linking with "ld -r" myself the objects **I** know the application will need,
even if **the linker** has no way to find out they are needed.

To give you a bit of context, this is what's going on (simplified):

A CORBA IDL compiler takes "foo.idl" and generates:
- foo.h, foo.cpp, which describe a C++ interface to the CORBA object "foo",
- foo_stub.h, foo_stub.cpp (proxies, on the client),
- foo_poa.h, foo_poa.cpp (skeletons, on the server).

An application will code against foo.h (the interface),
but will not (and should not) make any references to foo_stub.cpp,
which may -- or may not -- be there.

In CORBA, you never call a constructor or a static create method:
new instances are only returned by calling methods on other (factory / finder) objects,
and it goes all the way up to the root of them all: the Naming Service.

As a result, **nothing**, not even a new() or a call to some static create() method, references foo_stub.cpp.  The code in the stubs happens to be the implementation of the pure virtual methods of foo.h.  When the reply to a call made on a socket indicates that an object of class "foo" needs to be created,
the ORB implementation needs to locate the stub code for "foo".
Since the stub code is generated and the ORB implementation is independent of the user code, that "call" is not done the classic way: instead, the ORB keeps a registry of all the stubs that **somehow** got registered ...

This is where "ld -r" comes in really handy:

All the "foo_stub.o", "bar_stub.o", and so on are the parts, "glued" for convenience according to the module they came from:
CosNaming_stubs.o, AcmeBank_stubs.o, etc.

By linking the main application with "CosNaming_stubs.o", all the stubs present there register themselves,
and the ORB is happy.
The application developer is never exposed to the complexity of what / how many / which objects compose the relocatable "CosNaming_stubs.o", which is a good thing since it is ORB-implementation dependent.

The only thing an application developer has to know is whether the app uses a module or not (they know that),
and if so, add the stubs for that module during the link.  That's it.
The code comes on a per-module (in the CORBA sense) basis, not per class or per method, and this is by design.

I have the same thing with foo_poa.o on the server.

(*) OK, here's a question that may make these relocatables completely
redundant: Libtool already offers you (with convenience archives) the
possibility to merge a collection of objects completely into a shared
library.  If I understand correctly, then all that is missing would be a
way to merge a convenience archive completely into a program as well.
Right?

If that is the case, then we could simply add a link flag to libtool
that would just do that.  And I think it would be more efficient than
the whole relocating idea as well.  So, in terms of code, this test
case code should succeed:

: ${LIBTOOL=libtool}
: ${OBJEXT=o}
: ${NM=nm}
: ${CC=gcc}
echo 'int needed() { return 0; }' > a.c
echo 'int unneeded() { return 0; }' > b.c
$LIBTOOL --mode=compile $CC -c a.c
$LIBTOOL --mode=compile $CC -c b.c
$LIBTOOL --mode=link $CC -o libconv.la a.lo b.lo
cat >main.c <<\EOF
extern int needed();
int main()
{
 return needed();
}
EOF
$CC -c main.c
$LIBTOOL --mode=link $CC -o main main.$OBJEXT \
        -use-as-whole-archive libconv.la
$LIBTOOL --mode=execute $NM main | grep unneeded

I am not sure I understand what you mean precisely by the "unneeded()" function, and whether it's part of the final binary or not (is the grep supposed to find it?),
but this brings up an interesting idea:

Basically, one possibility is to force a call to a "hook" function in each object,
so that the linker will finally see the light, err, the dependencies.

In this case,
the hook functions are not "needed" or "unneeded" but rather "need_a()", "need_b()", "need_c()", etc.,
with one hook per object.
The main will then need to call **each** of need_a(), need_b(), and so on, to force everything needed to load.

However, I would rather have the main code call :
hook_glued()
rather than call the hook function for each part,
and the way to do that would be ...
... ld -r -o glued.o hook_glued.o part1.o part2.o ... partN.o ?
Wait, we are back to what started this thread!

An interesting point, still, is that the idea of hook functions can be leveraged in the dynamic case (code that ends up in glued.la), as a compile-time alternative (call hook() from main) to the runtime one (call dlopen from main).  With pure static code (glued.o given as-is to the linker), hook functions are not needed with relocatables (which is my primary use case).

Hook functions aside, and looking at the "-use-as-whole-archive" idea,
the part I don't understand about all this is:

- Libtool already seems capable of linking with "ld -r", which is a good thing (well, there is that bug about ld -r -o .libs/glued.o .libs/input.o, but that's a different topic);

- with my proposal, the changes would be in automake only, providing a syntax to use libtool "as-is";

- with your proposal -- if I understand correctly -- the changes would be in libtool only (a new -use-as-whole-archive flag) to avoid using relocatables, so that it works with the current automake by overloading existing syntax.

What did I miss ?

Since there is a change involved in either case,
the best would probably be to put the change where it makes the most sense.

I can be convinced in other ways (beer helps too),
but my take so far is that it's about exposing in automake things that libtool can do already.

Cheers,
Marc Alff.





