// Then the following is called:
void compute(){
  u *= cx(I)*cy(J); // runs 4X slower with icc than KCC
}
When I time this routine, I find that it runs about 4X
slower when compiled with Intel's icc (version 7, -O3 -DNOPAssert
-DNOCTASSERT) than with KCC (version 4.0f, +K3 -DNOPAssert
-DNOCTAssert). As expected, the KCC version runs as fast as
hand-written loops.
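
For comparison, a hand-written baseline for this kind of update might look roughly like the sketch below. It is not the code from the original test. It assumes u is an nx-by-ny array stored row-major in a flat std::vector<double>, that cx and cy are 1-D coefficient arrays, and that a C++11 compiler is available for std::chrono. The names compute_by_hand and time_compute are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical hand-written equivalent of u *= cx(I)*cy(J).
// Assumed layout: u holds nx*ny doubles, element (i,j) at u[j*nx + i].
void compute_by_hand(std::vector<double>& u,
                     const std::vector<double>& cx,
                     const std::vector<double>& cy)
{
    const std::size_t nx = cx.size();
    const std::size_t ny = cy.size();
    assert(u.size() == nx * ny);  // sanity check on the assumed layout

    for (std::size_t j = 0; j < ny; ++j) {
        const double cyj = cy[j];          // hoist the j-dependent factor
        double* row = u.data() + j * nx;   // start of row j
        for (std::size_t i = 0; i < nx; ++i)
            row[i] *= cx[i] * cyj;
    }
}

// Run the loop `reps` times and return the average seconds per call.
double time_compute(std::vector<double>& u,
                    const std::vector<double>& cx,
                    const std::vector<double>& cy,
                    int reps)
{
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r)
        compute_by_hand(u, cx, cy);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count() / reps;
}
```

Timing the expression-template version and this loop with the same array sizes and repetition count, under each compiler, should show whether the slowdown comes from the abstraction layer (for example, calls that are not inlined) or from the loop body itself.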
Do others observe this same sluggish behavior with icc? Am I
missing some obvious compile flag?