automake

Re: cuda compilation


From: Tomas Oberhuber
Subject: Re: cuda compilation
Date: Fri, 8 Jan 2010 16:33:02 +0100
User-agent: KMail/1.12.2 (Linux/2.6.31-14-generic; KDE/4.3.2; x86_64; ; )

Hi Ralf,

On Wednesday 06 January 2010 08:44:57 Ralf Wildenhues wrote:
> Hello Tomas,
> 
> * Tomas Oberhuber wrote on Sat, Jan 02, 2010 at 11:33:46AM CET:
> > Now I try to compile whole project with nvcc. It seems to work but I get
> > this
> >
> > libtool: link:
> > nvcc -shared -nostdlib   
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructu
> >re.o .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugGroup.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugParser.
> >o .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebug.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugScanner
> >.o 
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlParameterConta
> >iner.o .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlString.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerCPU.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerRT.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescript
> >ion.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescript
> >ionScanner.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mpi-supp.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTester.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescript
> >ionParser.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlObject.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-compress-file.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mfilename.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlLogger.o  
> > .libs/libtnl-0.1.lax/libtnlmatrix-0.1.a/libtnlmatrix_0_1_la-tnlBaseMatrix
> >.o   -L/usr/local/cuda/lib64 -lcppunit -lcudart     -Wl,-soname
> > -Wl,libtnl-0.1.so.0 -o .libs/libtnl-0.1.so.0.0.0 nvcc fatal   : Unknown
> > option 'nostdlib'
> >
> > which means that nvcc is also used as linker. Even if I remove -nostdlib,
> > nvcc complains about other parameters. So I think it would be better to
> > link with g++. Can I change linker somehow? And in that case if I do it
> > by hand (copy the command on the command line and replace nvcc by g++) I
> > get this
> >
> > g++ -shared -nostdlib   
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructu
> >re.o .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugGroup.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugParser.
> >o .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebug.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugScanner
> >.o 
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlParameterConta
> >iner.o .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlString.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerCPU.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerRT.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescript
> >ion.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescript
> >ionScanner.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mpi-supp.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTester.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescript
> >ionParser.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlObject.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-compress-file.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mfilename.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlLogger.o  
> > .libs/libtnl-0.1.lax/libtnlmatrix-0.1.a/libtnlmatrix_0_1_la-tnlBaseMatrix
> >.o   -L/usr/local/cuda/lib64 -lcppunit -lcudart     -Wl,-soname
> > -Wl,libtnl-0.1.so.0 -o .libs/libtnl-0.1.so.0.0.0 /usr/bin/ld:
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructu
> >re.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when
> > making a shared object; recompile with -fPIC
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructu
> >re.o: could not read symbols: Bad value
> > collect2: ld returned 1 exit status
> >
> > Or maybe we can solve it using -Xcompiler and -Xlinker. May I ask what
> > does libtool do now in case we use nvcc to compile or link?
> 
> You're right.  Libtool doesn't support CXX=nvcc yet, and we also forgot
> some bits of CC=nvcc support.  This still needs to be done in Libtool.
> 
> Thanks,
> Ralf
> 

After my last patch to 'fix' compilation with CUDA, I worked on the problem 
with dependencies. It seems to be completely solved, although the solution is 
not as elegant as I would like. 
First, I added this to depcomp in automake:
diff -r automake/lib/depcomp /home/oberhuber/workspace/automake-1.11.1/lib/depcomp
124a125,147
> nvcc)
> ## nVidia CUDA 2.3 compiler combined with gcc3
> ## here we just add -Xcompiler parameter to pass
> ## gcc3 parameters to gcc3
>   for arg
>   do
>     case $arg in
>     -c) set fnord "$@" -Xcompiler -MT -Xcompiler "$object" -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler "$tmpdepfile" "$arg" ;;
>     *)  set fnord "$@" "$arg" ;;
>     esac
>     shift # fnord
>     shift # $arg
>   done
>   "$@"
>   stat=$?
>   if test $stat -eq 0; then :
>   else
>     rm -f "$tmpdepfile"
>     exit $stat
>   fi
>   mv "$tmpdepfile" "$depfile"
>   ;;
> 
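
To make the effect of the loop above easier to see, here is a minimal dry-run sketch in plain sh. The function name nvcc_dep_command and the sample file names are hypothetical and not part of depcomp; the function only prints the rewritten command line instead of running nvcc.

```shell
#!/bin/sh
# Hypothetical helper (not part of depcomp): print, one argument per line,
# the command line that the nvcc case above would execute.
# $1 = object file, $2 = temporary dependency file, rest = original command.
nvcc_dep_command() {
  object=$1
  tmpdepfile=$2
  shift 2
  for arg
  do
    case $arg in
    # Insert the gcc dependency options in front of -c, each one
    # prefixed with -Xcompiler so that nvcc forwards it to gcc.
    -c) set fnord "$@" -Xcompiler -MT -Xcompiler "$object" \
            -Xcompiler -MD -Xcompiler -MP \
            -Xcompiler -MF -Xcompiler "$tmpdepfile" "$arg" ;;
    *)  set fnord "$@" "$arg" ;;
    esac
    shift # fnord
    shift # $arg
  done
  printf '%s\n' "$@"
}

# Sample invocation with made-up file names.
nvcc_dep_command foo.o .deps/foo.TPo nvcc -O2 -c foo.cu -o foo.o
```

Every argument except -c is passed through unchanged, so nvcc still sees its own options exactly as written in the Makefile.
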
This is enough for ./configure to detect that 
"checking dependency style of nvcc... nvcc"
As I then learned, gcc3 does not go through depcomp; instead it uses fast 
dependency tracking (fastdep). I therefore introduced fastdepnvcc as follows:

diff -r automake/m4/depend.m4 /home/oberhuber/workspace/automake-1.11.1/m4/depend.m4
155a156,158
> AM_CONDITIONAL([am__fastdepnvcc$1], [
>   test "x$enable_dependency_tracking" != xno \
>   && test "$am_cv_$1_dependencies_compiler_type" = nvcc])

The idea now was to generate the same piece of code in the makefiles as for gcc, 
but with -Xcompiler inserted, like this:

diff -r automake/lib/am/depend2.am /home/oberhuber/workspace/automake-1.11.1/lib/am/depend2.am
73a74,84
> if %FASTDEPNVCC%
> ## Fast-dep mode for nvcc is similar to gcc
> ## We just add -Xcompiler flag.
> ?!GENERIC?    %VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJ% %SOURCEFLAG%`test -f '%SOURCE%' || echo '$(srcdir)/'`%SOURCE%
> ?!GENERIC?    %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??!SUBDIROBJ? %VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJ% %SOURCEFLAG%%SOURCE%
> ?GENERIC??!SUBDIROBJ? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??SUBDIROBJ?  %VERBOSE%depbase=`echo %OBJ% | sed 's|[^/]*$$|$(DEPDIR)/&|;s|\.o$$||'`;\
> ?GENERIC??SUBDIROBJ?  %COMPILE% -Xcompiler -MT -Xcompiler %OBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJ% %SOURCEFLAG%%SOURCE% &&\
> ?GENERIC??SUBDIROBJ?  $(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> else !%FASTDEPNVCC%
86a98
> endif !%FASTDEPNVCC%
88a101
> 
101a115,125
> if %FASTDEPNVCC%
> ## In fast-dep mode, we can always use -o.
> ## For non-suffix rules, we must emulate a VPATH search on %SOURCE%.
> ?!GENERIC?    %VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJOBJ% %SOURCEFLAG%`if test -f '%SOURCE%'; then $(CYGPATH_W) '%SOURCE%'; else $(CYGPATH_W) '$(srcdir)/%SOURCE%'; fi`
> ?!GENERIC?    %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??!SUBDIROBJ? %VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJOBJ% %SOURCEFLAG%`$(CYGPATH_W) '%SOURCE%'`
> ?GENERIC??!SUBDIROBJ? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??SUBDIROBJ?  %VERBOSE%depbase=`echo %OBJ% | sed 's|[^/]*$$|$(DEPDIR)/&|;s|\.obj$$||'`;\
> ?GENERIC??SUBDIROBJ?  %COMPILE% -Xcompiler -MT -Xcompiler %OBJOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJOBJ% %SOURCEFLAG%`$(CYGPATH_W) '%SOURCE%'` &&\
> ?GENERIC??SUBDIROBJ?  $(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> else !%FASTDEPNVCC%
114a139
> endif !%FASTDEPNVCC%
131a157,166
> if %FASTDEPNVCC%
> ## fast-dep mode for nvcc only add -Xcompiler
> ?!GENERIC?    %VERBOSE%%LTCOMPILE% -Xcompiler -MT -Xcompiler %LTOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %LTOBJ% %SOURCEFLAG%`test -f '%SOURCE%' || echo '$(srcdir)/'`%SOURCE%
> ?!GENERIC?    %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Plo
> ?GENERIC??!SUBDIROBJ? %VERBOSE%%LTCOMPILE% -Xcompiler -MT -Xcompiler %LTOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %LTOBJ% %SOURCEFLAG%%SOURCE%
> ?GENERIC??!SUBDIROBJ? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Plo
> ?GENERIC??SUBDIROBJ?  %VERBOSE%depbase=`echo %OBJ% | sed 's|[^/]*$$|$(DEPDIR)/&|;s|\.lo$$||'`;\
> ?GENERIC??SUBDIROBJ?  %LTCOMPILE% -Xcompiler -MT -Xcompiler %LTOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %LTOBJ% %SOURCEFLAG%%SOURCE% &&\
> ?GENERIC??SUBDIROBJ?  $(am__mv) %DEPBASE%.Tpo %DEPBASE%.Plo
> else !%FASTDEPNVCC%
140a176
> endif !%FASTDEPNVCC%

It was also necessary to introduce am__fastdepnvcc to automake:

diff -r automake/automake.in /home/oberhuber/workspace/automake-1.11.1/automake.in
1392,1395c1378,1387
<       my ($AMDEP, $FASTDEP) =
<         (option 'no-dependencies' || $lang->autodep eq 'no')
<         ? ('FALSE', 'FALSE') : ('AMDEP', "am__fastdep$fpfx");
< 
---
>         # my ($AMDEP, $FASTDEP, $FASTDEPNVCC) =
>       #   (option 'no-dependencies' || $lang->autodep eq 'no')
>       #   ? ('FALSE', 'FALSE', 'FALSE' ) : ('AMDEP', "am__fastdep$fpfx", "am__fastdepnvcc$fpfx");
>         # 
>         # print $FASTDEPNVCC
>         
>         my ($AMDEP, $FASTDEP, $FASTDEPNVCC) =
>          (option 'no-dependencies' || $lang->autodep eq 'no')
>          ? ('FALSE', 'FALSE', 'FALSE' ) : ('AMDEP', "am__fastdep$fpfx", "am__fastdepnvcc$fpfx");
>          
1403a1396
>                          'FASTDEPNVCC' => $FASTDEPNVCC,
6369a6340
>   am__fastdepnvccCC => 'AC_PROG_CC',
6371a6343
>   am__fastdepnvccCXX => 'AC_PROG_CXX',

At this point I had a correct Makefile, but the -Xcompiler argument was filtered 
out by libtool. I fixed it like this:

diff -r libtool/libltdl/config/ltmain.m4sh /home/oberhuber/workspace/libtool-2.2.7a/libltdl/config/ltmain.m4sh
724,727c724,729
<       -Xcompiler)
<         arg_mode=arg  #  the next one goes into the "base_compile" arg list
<         continue      #  The current "srcfile" will either be retained or
<         ;;            #  replaced later.  I would guess that would be a bug.
---
> #     -Xcompiler)
> #       arg_mode=arg  #  the next one goes into the "base_compile" arg list
> #       continue      #  The current "srcfile" will either be retained or
> #       ;;            #  replaced later.  I would guess that would be a bug.
> #        I think that this is a bug. Usually we want to pass this to nvcc, which
> #        then passes the next arg to gcc.

Now I was able to compile my project completely, and I hope that the 
dependencies are correct. However, I found out that my .cu sources are still 
ignored by automake. I did not understand your suggestion (or rather, how to 
implement it):

>Alternatively, you could write a .cu.lo rule that looks like the
>automake-generated .c.lo rule, has --tag=CC but uses $(NVCC); you'd then
>still need a nvcc-wrapper that translates '-fPIC' to '-Xcompiler -fPIC'
>for nvcc.  Ugly, yes, but I'm not sure how to do this any nicer at the
>moment.

So I just told automake that .cu files can be accepted by the CXX compiler.

diff -r automake/automake.in /home/oberhuber/workspace/automake-1.11.1/automake.in
766c766
<                  'extensions' => ['.c++', '.cc', '.cpp', '.cxx', '.C']);
---
>                  'extensions' => ['.c++', '.cc', '.cpp', '.cxx', '.C', '.cu']);
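
For comparison, the nvcc wrapper mentioned in the quoted suggestion above could be sketched roughly as follows. This is only an illustration under assumptions: the function name translate_for_nvcc is hypothetical, only -fPIC is translated (as in the quoted text), and the sketch prints the rewritten arguments rather than calling nvcc.

```shell
#!/bin/sh
# Hypothetical wrapper sketch: prefix -fPIC with -Xcompiler so that nvcc
# forwards it to the host compiler; all other arguments pass through.
translate_for_nvcc() {
  for arg
  do
    case $arg in
    -fPIC) set fnord "$@" -Xcompiler "$arg" ;;
    *)     set fnord "$@" "$arg" ;;
    esac
    shift # fnord
    shift # $arg
  done
  # A real wrapper would finish with: exec "${NVCC:-nvcc}" "$@"
  # Here the arguments are only printed, one per line.
  printf '%s\n' "$@"
}

# Sample invocation with made-up file names.
translate_for_nvcc -O2 -fPIC -c kernel.cu -o kernel.o
```

Other flags that nvcc rejects would need their own case branches; which flags those are depends on the nvcc version in use.
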

It seems to work now, but I see that it is not a very clean solution. I would 
prefer to introduce new languages, CUDA C and CUDA C++, for example like this:

register_language ('name' => 'nvc',
                   'Name' => 'CUDA C',
                   'config_vars' => ['NVCC'],
                   'ansi' => 1,
                   'autodep' => '',
                   'flags' => ['NVCFLAGS', 'NVCPPFLAGS'],
                   'ccer' => 'NVCC',
                   'compiler' => 'COMPILE',
                   'compile' => '$(NVCC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(NVCPPFLAGS) $(AM_CFLAGS) $(NVCFLAGS)',
                   'lder' => 'CCLD',
                   'ld' => '$(CC)',
                   'linker' => 'LINK',
                   'link' => '$(CCLD) $(AM_CFLAGS) $(CFLAGS) $(AM_LDFLAGS) $(LDFLAGS) -o $@',
                   'compile_flag' => '-c',
                   'libtool_tag' => 'NVCC',
                   'extensions' => ['.cu'],
                   '_finish' => \&lang_c_finish);

and then compile only .cu files with nvcc. I have tried to do so, but it would 
require much more work. I am willing to do it if someone would guide me. I 
think that some autoconf tests like AC_PROG_NVCC and AC_PROG_NVCXX might be 
useful. These tests could define the NVCC and NVCXX variables. We should 
probably start there. I would be glad if you could incorporate my patches, even 
though I know they are not very nice :).  If you have any other suggestions, I 
would be glad to read them.
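
As a rough idea of what the core of such a test might do, here is a minimal shell sketch. AC_PROG_NVCC does not exist yet, so everything here is an assumption: find_nvcc is a hypothetical helper that searches a colon-separated directory list for an executable named nvcc, and the NVCC variable is only set when the user has not already provided one.

```shell
#!/bin/sh
# find_nvcc: hypothetical stand-in for the proposed AC_PROG_NVCC check.
# Searches the colon-separated list in $1 (default: $PATH) and prints the
# first executable file named nvcc. Returns 1 if none is found.
find_nvcc() {
  search_path=${1-$PATH}
  save_IFS=$IFS
  IFS=:
  for dir in $search_path
  do
    IFS=$save_IFS
    # An empty PATH entry means the current directory.
    test -z "$dir" && dir=.
    if test -f "$dir/nvcc" && test -x "$dir/nvcc"; then
      printf '%s\n' "$dir/nvcc"
      return 0
    fi
  done
  IFS=$save_IFS
  return 1
}

# Respect a user-supplied NVCC, as configure does for CC and CXX.
if test -z "${NVCC-}"; then
  NVCC=`find_nvcc` || NVCC=
fi
echo "checking for nvcc... ${NVCC:-no}"
```

A real macro would presumably also need to call AC_SUBST on the result and probe which flags nvcc accepts; that part is left out here.
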

Cheers, Tomas.





