Re: [cygwin|mingw] dlpreopen and disable-static
From: Charles Wilson
Subject: Re: [cygwin|mingw] dlpreopen and disable-static
Date: Mon, 26 May 2008 11:49:57 -0400
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.14) Gecko/20080421 Thunderbird/2.0.0.14 Mnenhy/0.7.5.666
Ralf Wildenhues wrote:
Hello Charles,
a couple of random thoughts:
* Charles Wilson wrote on Mon, May 26, 2008 at 12:53:28AM CEST:
What is interesting is that the current transformation steps are
incorrect if used with EITHER the import lib or the DLL; the file
generated using the static library is appropriate for both static and
dynamic linking, assuming --enable-auto-import (and possibly
--enable-runtime-pseudo-reloc).
In this paragraph, does the part after ";" contradict the one before?
If not, why not?
Three different libraries, two different linking cases (I'm ignoring
dlopen, here -- because the overall subcategory is dlpreopen, which
implies an actual build-time link to the target library).
                 library used to generate symbol table
                 static      import      dll
linking exe
statically       OK [1]      X [2]       X [3]
linking exe
dynamically      OK [4]      X [5]       X [6]
[1] we expect this to work, and it does
[2,3] we do not expect this to work; I didn't test
[4] this actually works in practice, and IS in fact the way we operate
if the static library and the dynamic library are both present -- kinda
by accident. We deliberately choose the static library, and create a
symbol table from it. But then we appear to link against the dynamic
library, because it shows up in the linked exe's dependency chain. I
*think* this is a side effect of the linker's preference for .dll.a
given -l<name>, but it *could* be that libtool is explicitly forcing
this. I don't know. It is odd that this works, because while we link to
the dll, the symbol table has an entry for "libfoo.a" NOT
"cygfoo-N.dll". Maybe we never test dlpreopen using
'lt_dlopen()+lt_dlsym()'? helldl sure doesn't test that; it scans the
lt__PROGRAM__LTX_preloaded_symbols table manually.
[5] If only the dynamic lib is present, then this is what we should use
to generate the symbol list (e.g. the first entry in .la's
library_names). However, the existing cygwin|mingw $pipes do not do this
properly, even when I run them manually on the .dll.a -- DATA items are
mishandled, and the output contains a few symbols that ought to be omitted.
[6] This is what we actually use when only the dynamic lib is present,
to generate the symbol list. This is totally broken, as the generated
symbol list is completely bolluxed, and I don't see a good way to fix it
(short of switching to case [5] and fixing /that/).
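The concern in [4] can be illustrated with a rough model. This is only a sketch under stated assumptions: the table layout (a flat list of name/address pairs, where a missing address marks a module-name entry) is a simplification chosen for illustration, and the names PRELOADED, find_module, find_symbol and scan_all, the addresses, and the file name cygfoo-0.dll are all hypothetical. None of this is libtool's actual API.

```python
# Hypothetical model of a preloaded-symbols table: (name, address) pairs.
# An address of None marks a module-name entry; the entries that follow it
# are treated as that module's symbols.
PRELOADED = [
    ("@PROGRAM@", None),
    ("libfoo.a", None),      # table was generated from the static library
    ("hello", 0x401000),
    ("foo", 0x401020),
]


def find_module(table, filename):
    """Return the index of the module entry named `filename`, or None."""
    for index, (name, address) in enumerate(table):
        if address is None and name == filename:
            return index
    return None


def find_symbol(table, module_index, symbol):
    """Look up `symbol` among the entries that follow a module entry."""
    for name, address in table[module_index + 1:]:
        if address is None:      # reached the next module entry
            break
        if name == symbol:
            return address
    return None


def scan_all(table, symbol):
    """Ignore module names entirely, as a manual scan of the table would."""
    for name, address in table:
        if address is not None and name == symbol:
            return address
    return None
```

In this model a lookup keyed on the DLL name finds no module entry, while a lookup keyed on "libfoo.a", or a scan that ignores module names, still finds the symbols. That would be consistent with a test that scans the table by hand passing even though an open-by-name followed by a symbol lookup might not.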
We could exclude all "^_imp__*" symbols -- or all _imp__ symbols with
matching _nm_s, with the understanding that a simple $global_symbol_pipe
can't do that -- and transform "^_nm__*" symbols into s/^_nm__//.
Yes.
But that's only if we're using the link library, and not the DLL itself,
to generate the symbol list.
Why would we need to carry on symbols coming from the DLL and matching
^_imp__ ?
If --disable-auto-import, I think.
Do we allow --disable-auto-import for win32+dlpreopen?
Would dlpreopen work with msvc (which has no auto-import support)?
Also, if you *know* you are (dlpreopen)-linking to the dynamic library,
you could get a minor speedup by using the _imp__ symbols directly, I
think. But that would be a big change, from:
extern char _imp__hello;
extern int hello();
{"_imp__hello", (void *) &_imp__hello},
{"hello", (void *) &hello},
to:
extern int _imp__hello();
{"hello", (void *) &_imp__hello},
but I'm not even sure that would work, because _imp__hello is not the
function address (e.g. a pointer to the function), but a pointer TO the
pointer to the function. I think.
The import list currently generated from the DLL is a real mess. Just
running 'nm' on cyghello-0.dll gives a few interesting results:
6c641020 T _foo
6c6413c0 T _free
_free is from cygwin1.dll, not cyghello-0.dll. But _foo is from
cyghello-0.dll.
That's definitely a reason not to use the DLL, but always the import
lib.
How do you distinguish between them? (In this case,
_free has a matching _imp__free, but _foo does not. Is that always the
case? And besides, a simple pipe can't do that sort of elimination:
"don't include entries for X if _imp__X exists elsewhere in the input")
I don't see why you get to the conclusion that a simple pipe can't do
that.
Order. Suppose the nm output looked like
...
6c641020 T _foo
...
6c641028 I _imp__foo
...
As I process this input, I first see _foo. Do I emit an entry in the
symbol table or not? I haven't seen _imp__foo yet, so YES, I emit an entry.
Then, later, I see _imp__foo. Uh-oh: I need to go back and remove the
_foo entry.
"go back" == not something a simple pipe can do (you can't rewind pipe
input, or output). So, you need temporaries, and multiple pipes. By
definition, that's no longer a "simple pipe".
Or, consider the reverse case:
...
6c641020 I _imp__foo
...
6c641028 T _foo
...
As I process this, I first see _imp__foo. Ah-ha! I need to make a note
that IF I ever see _foo, I should NOT emit an entry in the symbol table.
This is possible using a higher-level interpreter as the "simple pipe"
interpreter -- e.g. perl, or possibly gawk. But certainly not sed, and
even if perl|gawk, the "pipe" contents will be a fairly complex,
multiline program, including internal storage of a hashtable for "seen
_imp__*" data. Again, not a "simple" pipe.
Thus, most platforms just need $global_symbol_pipe, but we'd need a few
different pipes, temporary files, and additional code
($global_symbol_pipe_cmds?) -- or a "pipe" implemented by a multiline
program in a higher-level interpreter like perl.
Not simple, and very Ick.
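For comparison, here is a minimal sketch of the "remember every _imp__ name seen" approach in a higher-level language. It assumes three-column nm output (address, type, symbol) like the samples above, and that _imp__X pairs with _X as in those samples. The function name filter_nm is hypothetical.

```python
def filter_nm(lines):
    """Drop symbol X when a matching _imp__ entry appears anywhere in the input.

    The _imp__ entries themselves are dropped as well. The whole input is
    buffered, so the relative order of X and its _imp__ entry does not matter.
    """
    kept = []          # (type, symbol) candidates, in input order
    imported = set()   # symbols for which an _imp__ entry was seen
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue   # skip "..." and anything else that is not a symbol line
        _address, kind, symbol = fields
        if symbol.startswith("_imp__"):
            # "_imp__free" pairs with "_free": strip the "_imp_" prefix only.
            imported.add(symbol[len("_imp_"):])
        else:
            kept.append((kind, symbol))
    return [(kind, symbol) for kind, symbol in kept if symbol not in imported]


sample = [
    "6c641020 T _foo",
    "6c6413c0 T _free",
    "6c6420a0 I _imp__free",   # made-up address, for illustration
]
print(filter_nm(sample))
print(filter_nm(list(reversed(sample))))
```

Both calls keep only _foo, whichever order the lines arrive in. The price is exactly the one described above: the input has to be held in memory (or in a temporary file) before anything can be emitted, which a single sed invocation in a pipe cannot do comfortably.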
But anyway you are mixing up the way things are intended to work
with the apparent limitations you are seeing right now. Don't do that.
Let's first work out how things ought to work, then see how we can
adjust the code to actually do it. For the latter, there may be more
possibilities than are apparent.
OK. But with this caveat: I'm hesitant to make very large changes in the
way symbol processing works, in a 2.2.x release, to fix an apparently
long-standing bug nobody has noticed until now. If it is simple to fix,
no problem. But major re-architecting? No way: the
don't-use-shell-wrapper-at-all-on-win32 represented more than enough
major change in a micro release for me, thanks.
I'd rather leave it broken and document it: "on win32, dlpreopen works
only if --enable-static. If also --enable-shared, then the build-time
link will be against the shared library, but the static library must
also exist at build-time even if deleted at run-time."
Also, if applicable: "on win32, dlpreopen only works with the gnu
compiler, with --enable-auto-import support which is the default. Don't
expect it to work if --disable-auto-import, or if using one of the MSVC
wrappers"
I believe this stuff used to work, even when --disable-static. At one
point, I seem to remember that some $filter included some
transformations with _nm_ and _imp_, but I can no longer locate that in
the archives or online.
Hmm.
Found it:
http://lists.gnu.org/archive/html/libtool-patches/2003-11/msg00013.html
However, "back then" we used the import lib to extract symbols from, if
the static lib wasn't found. But "now" we use the dll -- and we've
already agreed that the import lib is the right one to use in this case,
at least on mingw|cygwin.
pre-2003 export_symbols_cmds (note m4-style escaping):
$NM $libobjs $convenience | $global_symbol_pipe |
  $SED -e '\''/^[[BCDGS]] /s/.* \([[^ ]]*\)/\1 DATA/'\'' |
  $SED -e '\''/^[[AITW]] /s/.* //'\'' | sort | uniq
fixed 2003 export_symbols_cmds (note m4-style escaping):
$NM $libobjs $convenience | $global_symbol_pipe |
  $SED -e '\''/^[[BCDGS]] /s/.* \([[^ ]]*\)/\1 DATA/'\'' |
  $SED -e '\''/^.* __nm__/s/^.* __nm__\([[^ ]]*\) [[^ ]]*/\1 DATA/'\'' |
  $SED -e '\''/^I /d'\'' |
  $SED -e '\''/^[[AITW]] /s/.* //'\'' | sort | uniq
current export_symbols_cmds (note shell-style escaping):
\$NM \$libobjs \$convenience | \$global_symbol_pipe |
  \$SED -e '/^[BCDGRS][ ]/s/.*[ ]\\\\([^ ]*\\\\)/\\\\1 DATA/' |
  \$SED -e '/^[AITW][ ]/s/.*[ ]//' | sort | uniq > \$export_symbols
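To make the effect of the current pipeline easier to see, the sketch below re-implements its two sed steps plus "sort | uniq" in Python. It assumes that $global_symbol_pipe emits space-separated "type symbol c-name" lines and that the bracketed whitespace class is a single space; the sample input is invented, and whether "I" lines actually reach this stage on a given platform is also an assumption. The function name export_symbols is hypothetical.

```python
import re


def export_symbols(pipe_output):
    """Rough Python equivalent of the current export_symbols_cmds seds."""
    result = set()
    for line in pipe_output:
        # sed -e '/^[BCDGRS][ ]/s/.*[ ]\([^ ]*\)/\1 DATA/'
        # Data-like types: keep the last field and tag it as DATA.
        if re.match(r"[BCDGRS] ", line):
            line = re.sub(r".* ([^ ]*)", r"\1 DATA", line, count=1)
        # sed -e '/^[AITW][ ]/s/.*[ ]//'
        # Other listed types: keep only the last field.
        if re.match(r"[AITW] ", line):
            line = re.sub(r".* ", "", line, count=1)
        result.add(line)          # the set plays the role of "uniq"
    return sorted(result)         # "sort" (plain code-point order here)


sample = [
    "T _foo foo",
    "D _var var",
    "T _foo foo",                 # duplicate, removed by the set
    "I __imp__foo _imp__foo",     # assumed form of an import-pointer line
]
print(export_symbols(sample))
```

Under these assumptions the "I" line survives as a bare _imp__foo entry, because the current commands have no counterpart to the '/^I /d' deletion or the __nm__ rewrite present in the fixed 2003 version.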
I need to run git-blame on that section of code and trace its evolution,
I think.
--
Chuck