Re: Fortran libraries on the Blue Gene with mpi
From: Christian Rössel
Subject: Re: Fortran libraries on the Blue Gene with mpi
Date: Wed, 29 Apr 2009 14:40:24 +0200
User-agent: Thunderbird 1.5.0.14 (X11/20060911)
Ralf Wildenhues wrote:
> * Christian Rössel wrote on Mon, Apr 27, 2009 at 05:33:33PM CEST:
>> Ralf Wildenhues wrote:
>>>>> # XL, BG
>>>>> cd build-bgxl
>>>>> ../configure CC=bgxlc CXX=bgxlC F77=bgfort FC=bgxlf95 GCJ=no \
>>>>> LDFLAGS=-qnostaticlink
>>>>> make
>>>>> make -k check VERBOSE=yes 2>&1 | tee checklog-bgxl-1
>>>>> cd ..
>>> This is where things start to get interesting.
>> With the bg* compilers we build programs that are supposed to be run on
>> the compute nodes. They may also run on the login-nodes, but you can't
>> take that for granted (AFAIR the error "Illegal instruction" appears if
>> you try to run a compute node program on a login node).
>
> Ah! I completely misunderstood that. That means that all those builds
> should run with cross-compiling enabled. Cross compilation mode is
> enabled when --host is passed (and differs from either the passed
> --build flag, or whatever configure computes as the build name);
> you can also force cross compilation mode through the hack of passing
> cross_compiling=yes
>
> to configure. The --host argument will also cause configure to look for
> all tool chain programs with a $host- prefix, in this case, with
> --host=powerpc-bgp-linux that would be powerpc-bgp-linux-gcc etc.
>
>> As all tests run
>> on the login-nodes, we should expect failures. Also, a test that
>> succeeds on the login node may not succeed on the compute node. IMHO all
>> test programs built with bg* and mpi* compilers should be run on the
>> compute nodes, not on the login nodes.
>
> Well, in this case they should not be run at all, at least not those
> that are run as part of the configure script.
>
>> To run a program on the compute nodes you write a batch script and
>> submit it to a queue. This process unfortunately differs from machine to
>> machine. It is also not sensible to submit many small jobs to the queue
>> as one job allocates at least 128 nodes.
>
> :-)
>
>> Maybe there is a way of calling
>> all tests from a single batch script so that one has to submit only one job.
>
> Not really. For those that failed on the login nodes, you can try to
> submit one or two to the queue; if they then pass, I'd be pretty
> confident that the others will work, too.
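
The single-batch-script idea from the quoted exchange above could look roughly like the following. This is only a sketch: `run_all` and `LAUNCHER` are hypothetical names, and the command that actually starts a binary on the compute nodes differs from machine to machine, so `LAUNCHER` is left empty by default.

```shell
#!/bin/sh
# Sketch only: run several test programs inside ONE batch job.
# LAUNCHER is a hypothetical, site-specific command prefix (whatever the
# batch system uses to start a binary on the compute nodes). It defaults
# to empty so that the sketch can be tried on any machine.
LAUNCHER=${LAUNCHER-}

# run_all PROGRAM...: run each program, print one PASS/FAIL line per
# program, and return the number of failed programs as the exit status.
run_all() {
  failed=0
  for prog in "$@"; do
    if $LAUNCHER "$prog" >/dev/null 2>&1; then
      echo "PASS: $prog"
    else
      echo "FAIL: $prog"
      failed=$((failed + 1))
    fi
  done
  return "$failed"
}
```

Inside the job script one would then call, for instance, `run_all tests/f77demo/fprogram tests/fcdemo/fprogram`, so that only one job has to be submitted for both programs.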
Hi Ralf, hi John,
I configured build-bgxl for cross compiling, reran the tests and found
two "Illegal instruction" errors in the checklog:
f77demo-exec.test: === Running f77demo-exec.test
f77demo-exec.test: === Executing uninstalled programs in build-bgxl3
tests/defs: line 1132: 10739 Illegal instruction tests/f77demo/fprogram
f77demo-exec.test: ../tests/f77demo-exec.test: cannot execute tests/f77demo/fprogram
f77demo-exec.test: === This may be ok since you seem to be cross-compiling.
fcdemo-exec.test: === Running fcdemo-exec.test
fcdemo-exec.test: === Executing uninstalled programs in build-bgxl3
tests/defs: line 1132: 20901 Illegal instruction tests/fcdemo/fprogram
fcdemo-exec.test: ../tests/fcdemo-exec.test: cannot execute tests/fcdemo/fprogram
fcdemo-exec.test: === This may be ok since you seem to be cross-compiling.
I wanted to run these two on the compute nodes, but after finishing the
tests, the programs were gone.
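
One way to keep the programs around might be to copy the demo directory aside before a later test cleans it up. The following is a minimal sketch under that assumption; `stash_demo` and `STASH_DIR` are hypothetical names and not part of the libtool test suite.

```shell
#!/bin/sh
# Sketch only: keep a copy of a demo build directory before a later test
# removes it. STASH_DIR is a hypothetical location chosen for illustration.
STASH_DIR=${STASH_DIR:-./stash}

# stash_demo DIR: copy DIR (including its .libs subdirectory) into
# $STASH_DIR, preserving file modes so the programs stay executable.
stash_demo() {
  mkdir -p "$STASH_DIR" || return 1
  cp -R -p "$1" "$STASH_DIR/" || return 1
  echo "stashed $1 in $STASH_DIR"
}
```

For example, after running only tests/f77demo-static.test and tests/f77demo-make.test via TESTS=..., one could call `stash_demo tests/f77demo`, assuming that stopping before the exec test leaves fprogram in place. Programs linked against the uninstalled shared libraries may still refer to paths in the build tree, so a copy is not guaranteed to run from its new location.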
>>> Test failures:
>>>
>>> - f77demo-* in the old testsuite
>>> This is because the bgfort command does not exist.
>>> It was a typo, should have been F77=bgfort77 or F77=bgf77 or F77=bgxlf
>>> I guess. If you have energy left, here's how you can rerun those
>>> tests:
>>>
>>> cd build-bgxl
>>> ../configure CC=bgxlc CXX=bgxlC F77=bgfort77 FC=bgxlf95 GCJ=no \
>>> LDFLAGS=-qnostaticlink
>>> gmake
>>> gmake -k check VERBOSE=yes TESTSUITEFLAGS='-k F77' TESTS="\
>>> tests/f77demo-static.test \
>>> tests/f77demo-make.test \
>>> tests/f77demo-exec.test \
>>> tests/f77demo-conf.test \
>>> tests/f77demo-make.test \
>>> tests/f77demo-exec.test \
>>> tests/f77demo-shared.test \
>>> tests/f77demo-make.test \
>>> tests/f77demo-exec.test"
>> Please find the results attached (checklog-bgxl-2).
>
> Thanks. f77demo-exec.test fails after f77demo-static.test, and
> f77demo-make.test fails after f77demo-{conf,shared}.test. The first
> failure is an "Illegal instruction" again, for which we have an
> explanation now; the other two are again:
>
> /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/ld: attempted
> static link of dynamic object `./.libs/libfoo.so'
>
> I still don't know the cause for this; but at least the F77 cases look
> just like the FC cases.
>
> Can you post the output of the following?
>
> cd build-bgxl/tests/fcdemo
> /bin/sh ./libtool --mode=link bgxlf95 -Wc,-v -g -qnostaticlink -o \
>   fprogram fprogram.o libfoo.la libfoo3.la -ldl
libtool: link: LD_RUN_PATH="/u/fzj301zm/BlueGene/fortran_libraries_on_the_blue_gene_with_mpi/libtool/build-bgxl3/_inst/lib:" bgxlf95 -Wl,-v -g -qnostaticlink -o .libs/fprogram fprogram.o ./.libs/libfoo.so /u/fzj301zm/BlueGene/fortran_libraries_on_the_blue_gene_with_mpi/libtool/build-bgxl3/tests/fcdemo/.libs/libfoo2.so ./.libs/libfoo3.so -ldl
/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/ld: attempted static link of dynamic object `./.libs/libfoo.so'
GNU ld version 2.17
Cheers,
Christian
PS: I'll have no testing resources available until May 5th ;-)