Re: GNU make 4.2.90 release candidate available


From: Dennis Clarke
Subject: Re: GNU make 4.2.90 release candidate available
Date: Tue, 27 Aug 2019 12:20:15 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:69.0) Gecko/20100101 Thunderbird/69.0

On 8/27/19 8:33 AM, Paul Smith wrote:
> On Tue, 2019-08-27 at 01:21 -0400, Dennis Clarke wrote:
>> On 8/26/19 10:59 PM, Paul Smith wrote:
>>> On Mon, 2019-08-26 at 19:33 -0400, Dennis Clarke wrote:
>>>> I'll dig into this but on RHEL 7.4 x86_64 we see :
>>>>
>>>> src/job.c: In function 'reap_children':
>>>> src/job.c:754:17: error: incompatible type for argument 1 of 'wait'
>>>>                     EINTRLOOP (pid, wait (&status));
>>>
>>> That is REALLY disturbing, because it means that configure couldn't
>>> detect either waitpid() or wait3(), both of which have been available
>>> since approximately forever.  Something went badly wrong with the
>>> configure run on this system.
>>
>> I know and it was a show stopper for me.  I just had to stop and stare
>> at my coffee cup after trying this across a bunch of machines.
>
> If I'm reading the downloads you provided correctly it seems there are
> a bunch of extra flags being added to the compile and link lines:
>
>> CFLAGS='-std=iso9899:1999 -m64 -g -march=opteron
>> -Wl,-rpath=/opt/bw/lib,--enable-new-dtags -fno-builtin -O0
>> -malign-double -mpc80'
>> export CFLAGS
Well I have been building software for a while now and none of those
flags are a problem. However let's take a look :

(1)  -std=iso9899:1999    This is the same as c99 and it merely means that
GCC *should* make every reasonable attempt to comply with the C99 code
specification.  The GCC manual actually says "GCC has substantially
complete support for this standard version" but that does not mean it
is complete.  GNU make may be in the C89 world or maybe C99.  I do not
know but I can make it compile clean on a very very strict standards
compliant UNIX server with no problem.  So I think C99 is perfectly
safe here. I know from extensive experience and a bit of pain that the
OpenSSL project is firmly entrenched into C89 simply because of the old
massive list of systems that OpenSSL needs to run on.

    https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/Standards.html#C-Language
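A quick way to confirm which standard the compiler thinks it is targeting is to look at the __STDC_VERSION__ macro. This is only a minimal sketch; the function name c_standard_name is made up for illustration and is not part of GNU make.

```c
#include <stdio.h>

/* Hypothetical helper: report the C standard the compiler claims to
 * implement, based on the __STDC_VERSION__ macro. */
const char *c_standard_name(void)
{
#if !defined(__STDC_VERSION__)
    return "C89/C90";                 /* macro not defined before C94 */
#elif __STDC_VERSION__ >= 201112L
    return "C11 or later";
#elif __STDC_VERSION__ >= 199901L
    return "C99";                     /* expected with -std=iso9899:1999 */
#else
    return "C94 (C90 with Amendment 1)";
#endif
}

/* Print the result so it can be checked from a small test program. */
void print_c_standard(void)
{
    printf("compiler targets: %s\n", c_standard_name());
}
```

Built with -std=iso9899:1999 the macro is 199901L, so the helper should report C99.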

(2) This was most likely done on a 64 bit machine and thus -m64 makes
perfect sense here. On the 32 bit i686 and armv7 machines I would need
to specify -m32 or perhaps far more specific options to target a very
specific ARM/Rockchip type SoC like -march=armv7-a and then even a
-mtune=cortex-a17 with -mfpu=vfpv4-d16 which assures me that the correct
assembly output is being created.

(3) In general I like to be able to single step and debug my way
through lines or whatever so I think a -g is safe everywhere.

(4) On the 64-bit x86_64 type machines where I have both Intel and the
AMD variants on hand it seems like a good idea to go with a baseline
feature set that works cross platform everywhere.  It is possible to go
with a trivial -march=x86-64 however I seem to recall that the first
release of viable 64-bit processors was done by AMD and they used the
product name "Opteron" at the time.  This was a massively successful
processor design which Sun Microsystems jumped on and they started to
push out servers by the truck load. Very quickly we saw the open source
world flood in and before you knew it everything was "AMD64" here there
and everywhere. Here we are about eighteen years later and the original
"SledgeHammer" name has been forgotten but every Debian kernel released
for everything 64-bit always says amd64 in the revision tag such as the
current stable 4.19.0-5-amd64.  Regardless of the "amd64" tag most of
the GNU host and target triplets will utter :

    checking build system type... x86_64-pc-linux-gnu
    checking host system type... x86_64-pc-linux-gnu

So that is fine.  The k8 or opteron tag merely indicates that there are
a very few 64-bit extensions to the opcode instruction set on the chip
and https://en.wikipedia.org/wiki/Opteron#Two_key_capabilities sort of
shows the history of what happened back then.  Everyone jumped on the
k8 and Intel still had their Itanium which was not selling well. Most
folks don't know that Sun Microsystems had a beautiful ready to release
port of their Solaris UNIX operating system for the Itanium but it was
wrapped in plastic and shoved into a shelf somewhere in the same large
warehouse seen in Raiders of the Lost Ark. Along with the port done for
the IBM Power based mainframes but that saw the light of day at the
press release and COMDEX and then quickly hidden. However I digress.

(5) -Wl,-rpath=/opt/bw/lib has nothing to say to the compiler but it is
good instructions for the linker stage wherein we want the output ELF
dynamic section to include RPATH data. The horror show usage of things
like LD_LIBRARY_PATH should be avoided at all costs as that leads to
slow insanity. A binary executable should have some hint for the run
time linker where to look for whatever libs it needs.  Also given that
this is a test release I wanted full isolation and that means a clean
slate with *nothing* in the build path or even the existence of any lib
to interfere with testing.  Thus I point to an empty directory.

(6) However we should talk about the --enable-new-dtags a bit.  I can
not recall when Ulrich Drepper extended the ELF file format to include
new data tags.  It was at least a decade ago however the history on
this is unclear to me at this moment over coffee.  However we do know
that most modern linkers in the GNU ELF world support these new dtag
fields, and with this option the -rpath setting above is recorded as
DT_RUNPATH instead of the older DT_RPATH in the dynamic section.  This
is harmless and does not affect the compile stage.

(7) Okay now we hit one of my favorite little hidden GCC features. The
fact is that the GCC C compiler will perform changes and slight little
optimizations to your code even if you don't want that. Please take a
little look at :

https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/x86-Built-in-Functions.html#x86-Built-in-Functions

When trying to teach students about the C language and the various
stages of a compile I have often run head long into these secret little
builtin changes that GCC will do. Go ahead and translate "hello world"
from C into the local machine's assembly language and then take a look
at what you get.  There will not be a call to "printf" and you will most
likely see a call to "puts" instead.  This usually baffles students as
they wrote one simple thing and then the result no longer matches the
input C source code. What happened?  Well
GCC happened with the internal built in little changes and slight code
optimizations that it does.  Silent changes. Unless you say do not do
that please.  To get some bit of code off the ground it is harmless to
say to GCC please don't change my source code.  Thus we specify the
-fno-builtin flag and that solves that.  You get what you asked for.
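To see the effect, a tiny test case is enough. This is a sketch under a few assumptions: the file name hello.c and the function name greet are made up, and the exact assembly depends on the GCC version and target. Compile it once with "gcc -S hello.c" and once with "gcc -S -fno-builtin hello.c", then compare the two .s files.

```c
#include <stdio.h>

/* The return value of printf is deliberately ignored and the string is
 * a constant ending in a newline.  With built-ins enabled GCC commonly
 * emits a call to puts here; with -fno-builtin the call to printf is
 * kept as written. */
void greet(void)
{
    printf("hello world\n");
}
```

The program output is the same either way. Only the symbol called in the generated assembly differs.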

(8) This is obvious. Here we have an optimization switch that says zero
optimization. I should not need to say this but one thing I have learned
about the GCC compiler is that it is best to be specific.  The default
is -O0 which is documented as :

    -O0   Reduce compilation time and make debugging produce the
          expected results. This is the default.

https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/Optimize-Options.html#Optimize-Options

(9) The memory alignment on x86 for both the 32-bit machines and the 64
bit boxes may be specified as two-word boundary aligned with the neat
little flag -malign-double.  However this is the default anyways on all
64 bit x86 architecture machines.  Does not hurt to be specific.

https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/x86-Options.html#x86-Options
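The alignment in question can be observed with offsetof. This is a minimal sketch; the struct and function names are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* A char followed by a double: the padding the compiler inserts before
 * "value" reveals the alignment it uses for double inside structs. */
struct sample {
    char   tag;
    double value;
};

/* Returns the byte offset of the double member.  Typically 8 when
 * doubles sit on a two-word boundary (x86_64, or 32-bit x86 with
 * -malign-double) and 4 under the traditional 32-bit x86 ABI. */
size_t double_member_offset(void)
{
    return offsetof(struct sample, value);
}
```

Because the flag changes struct layout on 32-bit x86, objects built with and without it should not be mixed when they share such structures.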

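(10) This is a floating point option.  The -mpc80 flag sets the x87
floating point precision control to extended precision, which rounds
significands to 64 bits and is the default anyways.  It is sort of a
verbose way of saying that you know you are on an x86 box and it will
never ever be able to do anything more than 10-bytes of data inside
extended double precision variables.  The x86 architecture is crippled by design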
in this way and much has been written on the topic. Thankfully the IEEE
754 spec was updated and ratified back in 2008 to include support for
the 128-bit new datatype.  Sadly the hardware does not exist unless you
have IBM Power9 or the new spiffy RISC-V processors with the 128-bit
options. In fact the RISC-V specification includes 256-bit opcodes. So
what do we do?  Well we have the libquadmath library of course! A bit
of a software fix to get around this sad design flaw in the x86 world.

You can read all about it https://gcc.gnu.org/onlinedocs/libquadmath/
however it is not really cross platform ready and does not work at all
in the 64-bit RISC machine space for some platforms.
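What the local compiler actually provides for long double can be checked with float.h. This is only a sketch and the function names are hypothetical. Note that -mpc80 affects the x87 precision control at run time and does not change the format of the type itself.

```c
#include <float.h>
#include <stdio.h>

/* Number of significand bits in long double.  Typically 64 for the
 * 80-bit x87 extended format and 113 on targets where long double is
 * IEEE 754 binary128. */
int long_double_mantissa_bits(void)
{
    return LDBL_MANT_DIG;
}

/* Print storage size and precision.  The storage size includes any
 * padding, so the 10-byte x87 format usually occupies 12 or 16 bytes. */
void describe_long_double(void)
{
    printf("sizeof(long double) = %zu bytes\n", sizeof(long double));
    printf("LDBL_MANT_DIG       = %d bits\n", long_double_mantissa_bits());
}
```

On a typical x86_64 Linux box with GCC this should show 16 bytes of storage and a 64-bit significand, but other platforms will differ.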

So that covers off the CFLAGS which are harmless.

>>
>> CPPFLAGS='-I/opt/bw/include -D_POSIX_PTHREAD_SEMANTICS -D_TS_ERRNO'
>> export CPPFLAGS
>>
Strangely enough I would expect to see -D_LARGEFILE64_SOURCE there
also.  The _TS_ERRNO will allow handling of a thread safe errno macro
expansion. The POSIX bits are pretty darn safe. I think we have been in
the large file support space for twenty years now so I don't see a
problem there.  As for the include path where nothing exists I don't see
a possible issue either.  Recall that I am testing something new here
and so I want a clear slate with no possible interactions with anything.

>> LDFLAGS='-Wl,-rpath=/opt/bw/lib,--enable-new-dtags'
>> export LDFLAGS
Same as above. Harmless stuff for the linker stage.

>> LD_RUN_PATH=/opt/bw/lib
>> export LD_RUN_PATH
That is an empty directory anyways but it does provide a suggestion to
the linker that yes we really want the DT_RUNPATH to point to a new
place.  This is because I was going to follow up with libiconv and then
GNU gettext there and re-build GNU Make on a second pass. We never did
get there did we?


> What are the results if you simply invoke ./configure without any
> settings for any of these build variables?
Confusion ?  I think the default is that you set " -g -O2 " and nothing
else.  Not sure why you want make optimized with -O2 when testing.
I sure don't.  There is only one machine where things build and test
clean.  With no CFLAGS and no CPPFLAGS and no LDFLAGS ( is that even
a thing on GNU ELF ld? ) and no LD_OPTIONS and no LD_RUN_PATH and no
PKG_CONFIG_PATH and no RUNPATH we get nothing new.

No surprise.

On the 64-bit FreeBSD ppc64 machine we get the same failures.

However the 32-bit Linux boxen pass all tests.

This baffles me.

>>> Note that this is not a new error:
>>> it's always been like this.  What's new  is (a) the test which shows
>>> the problem, and (b) that it _doesn't_ fail on some sufficiently new
>>> systems.
>>
>> Right !!   I did this and had others watching and we were all wondering
>> why Debian stable was happy and Debian sid had a fit. Of course there
>> was also the x86_64 versus i686 architecture there too.
>
> Well, the situation is a bit complicated.  First, it depends on whether
> GNU make configure chooses the glob/fnmatch that comes with GNU make or
> the system version.
>
> If it chooses the one that comes with GNU make, you will see the error
> (!!) because our glob/fnmatch is too old.
>

How do we force the usage of the system version ?


> If it chooses the system version (configure detects that you have GNU
> libc) then it depends on whether the system version has this bug fixed
> in it or not.
>
> For Debian, it doesn't look to me like Debian sid had this error
> though; I didn't see any failure of the wildcard function in the logs
> for that.  There were other problems, seeming to do with order of
> output etc.
>

Thus far I can not get a pass anywhere except on Debian.  These CFLAGS
have me bothered.  I have been doing production code releases for well
over twenty years and have not run into a situation where really old
conservative compiler flags show this sort of effect.

>>> I thought about trying to handle this but (1) it's a bit annoying and
>>> (2) I didn't know if there were any systems that used this latter
>>> behavior.  Looks like there might be.
>>
>> Sounds like whiskey in my coffee cup tomorrow to look at this all again.
>
> I have a patch that may work for this posix_spawn() issue.  The only
> really tricky part is getting the error message right.
>
> If you wanted to just give it a try on one of the systems you could add
> the --disable-posix-spawn option to configure and see what happens.
>

Ah ha.  That sounds sweet.  I will give that a whirl.


--
Dennis Clarke
RISC-V/SPARC/PPC/ARM/CISC
UNIX and Linux spoken
GreyBeard and suspenders optional


