Re: [pooma-dev] KCC versus icc
From: Richard Guenther
Subject: Re: [pooma-dev] KCC versus icc
Date: Thu, 13 Mar 2003 11:15:12 +0100 (CET)

On Thu, 27 Feb 2003, Paul A. Renard wrote:
> Richard (and Jeffrey);
>
> I've tried various combinations of -O2, -ip, and -ipo. Both -ip options make
> the loop represented by the data-parallel expression run slower. I also tried
> profile-directed optimization, and that did indeed make a very-slight
> improvement (on the order of a percent), but icc is still producing code for
> that particular loop that is 3.5X-4X slower. The compiler seems to be inlining
> quite a bit. In fact, all the constructors and destructors look like they are
> inlined. KCC doesn't seem to inline the constructors/destructors for this
> case. Both cases eventually call the evaluator, and at that point I get lost
> in the machine code.

I finally got around to looking up which options I used to get at least
some performance out of icc. The key was to raise the instruction-count
limit for always-inlined small functions to an artificially high value,
i.e. I used something along the lines of

icpc -O2 -unroll -xM -tpp6 -ip -Qoption,c,-ip_ninl_min_stats=1000

This way I get the same performance as gcc 3.3 when using --param
max-inline-slope=1000000 (icpc is slightly faster when the data fits in
cache, but who has data that fits into cache...)
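
To illustrate why the inlining limits matter so much for this kind of code,
here is a minimal expression-template sketch. It is not POOMA's actual
code: Vec, Sum, add and assign are hypothetical names made up for the
example. Every element access in the evaluator loop goes through nested
operator[] calls, so unless the compiler inlines all of them the loop pays
function-call overhead per element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an array class; not a POOMA type.
struct Vec {
    std::vector<double> data;

    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}

    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }

    // The "evaluator": one loop that pulls each element out of the
    // expression object. e[i] expands into the nested operator[] calls.
    template <class Expr>
    Vec& assign(const Expr& e) {
        assert(e.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data[i] = e[i];
        return *this;
    }
};

// Expression node for an element-wise sum; holds references only, so it
// must not outlive the full expression in which it is built.
template <class L, class R>
struct Sum {
    const L& lhs;
    const R& rhs;

    Sum(const L& l, const R& r) : lhs(l), rhs(r) {}

    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Builds the expression node without computing anything yet.
template <class L, class R>
Sum<L, R> add(const L& l, const R& r) {
    return Sum<L, R>(l, r);
}
```

A call such as out.assign(add(add(a, b), c)) builds a
Sum<Sum<Vec, Vec>, Vec> and evaluates it in a single loop. Compiling a
loop like this with and without the raised inline limits is one way to
check whether the compiler flattens the operator[] chain.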
Hope this helps.
Richard.
--
Richard Guenther <address@hidden>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/