bug-gnulib

RE: GCC optimizes integer overflow: bug or feature?


From: Dave Korn
Subject: RE: GCC optimizes integer overflow: bug or feature?
Date: Wed, 20 Dec 2006 20:01:50 -0000

On 20 December 2006 16:25, Matthew Woehlke wrote:

> Dave Korn wrote:
>> On 20 December 2006 02:28, Andrew Pinski wrote:
>>>> Paul Brook wrote:
>>>>> Pretty much all optimization will change the behavior of your program.
>>>> 
>>>> Now that's a bit TOO strong a statement, critical optimizations like
>>>> register allocation and instruction scheduling will generally not change
>>>> the behavior of the program (though the basic decision to put something
>>>> in a register will, and *surely* no one suggests avoiding this critical
>>>> optimization).
>>> 
>>> Actually they will with multi-threaded programs, since you can have a case
>>> where it works and now it is broken because one thread has sped up so
>>> much it writes to a variable which had a copy on another thread's stack.
>>> So the argument about it being too strong is wrong because timing matters
>>> nowadays.  Instruction scheduling can cause the same issue, as it forces
>>> a write too early for another thread to act on.
>> 
>> Why isn't that just a buggy program with wilful disregard for the use of
>> correct synchronisation techniques?
> 
> Excuse me while I beat you on the head with a number of real-life
> examples of *lock-free* algorithms. :-) 

  Thanks, I've heard of them; in fact I've implemented many, debugged many,
and currently work with several every single day of the week.

> Particularly lock-free queues
> (google should help you find one that applies here), whose correct
> operation is critically dependent on the order in which the loads and
> stores are performed. 

  No, absolutely not.  Lock-free queues work by (for example) having a single
producer and a single consumer, storing the queue in a circular buffer, and
assigning ownership of the queue's head pointer to the producer and the
(chasing) tail pointer to the consumer.  The whole point about lock-free
algorithms is that they are guaranteed to work *regardless* of the order of
operations between the two asynchronous processes that interact with the
queue, given only the ability to perform an atomic read or write.  The
ordering is critical within a single thread of execution; e.g. you must fill
in all the details of the new entry you are adding to the queue /before/ you
increment the head pointer, whereas in a locking algorithm it wouldn't matter
if you incremented the head pointer first, then filled in the blank entry you
had just moved it past, because you'd do both within the lock.
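The producer/consumer discipline described above can be sketched in C. This is a hypothetical minimal single-producer/single-consumer ring buffer; it uses C11 atomics (which postdate this thread) purely to make the "fill in the entry /before/ advancing the head pointer" ordering explicit:

```c
#include <stdatomic.h>
#include <stddef.h>

#define QSIZE 8  /* capacity; a power of two keeps the index math cheap */

/* Single producer owns 'head'; the single consumer owns the
   chasing 'tail', exactly as described above. */
typedef struct {
    int buf[QSIZE];
    atomic_size_t head;   /* written only by the producer */
    atomic_size_t tail;   /* written only by the consumer */
} spsc_queue;

/* Producer side: write the payload first, THEN publish it by
   advancing the head pointer -- the ordering that matters is
   within this single thread of execution. */
int spsc_push(spsc_queue *q, int v)
{
    size_t h = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t t = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (h - t == QSIZE)
        return 0;                           /* queue full */
    q->buf[h % QSIZE] = v;                  /* fill in the entry first */
    atomic_store_explicit(&q->head, h + 1, memory_order_release);
    return 1;
}

/* Consumer side: an acquire load of 'head' guarantees that the
   payload written before the matching release store is visible. */
int spsc_pop(spsc_queue *q, int *out)
{
    size_t t = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t h = atomic_load_explicit(&q->head, memory_order_acquire);
    if (t == h)
        return 0;                           /* queue empty */
    *out = q->buf[t % QSIZE];
    atomic_store_explicit(&q->tail, t + 1, memory_order_release);
    return 1;
}
```

Note that neither side ever takes a lock or waits for the other; correctness depends only on atomic reads/writes of the two indices, regardless of how the two threads interleave.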

  If you design a lock-free algorithm that relies on two threads being
precisely synchronised, it's not really lock-free, it just has 'implicit'
locking; and if your precise synchronisation depends on the precise
cycle-timing of low-level instruction sequences, you have made a very very
fragile design that works, if it does, by good fortune.

> This is a very real, entirely legitimate example
> where the compiler thinking it knows how to do my job better than I do
> is wrong.

  Nope, this is a very real example of a bug in your algorithmic design, or of
you misleading the compiler, or relying on undefined or implementation-defined
behaviour.
 
> We (in a major, commercial application) ran into exactly this issue.
> 'asm volatile("lock orl $0,(%%esp)"::)' is your friend when this happens
> (it is a barrier across which neither the compiler nor CPU will reorder
> things). Failing that, no-op cross-library calls (that can't be inlined)
> seem to do the trick.

  This simply means you have failed to correctly declare a variable volatile
that in fact /is/ likely to be spontaneously changed by a separate thread of
execution.

  And relying on a library call to act as a memory barrier is risky.  The only
reason it works is that the call takes long enough for the CPU's load-store
unit to complete any posted writes; but if (for a contrived example) you're
using a cut-down embedded library and the particular routine that you call
happens to be a stub that just returns immediately, you've got a
two-instruction call-return sequence that most certainly does not guarantee
externally-visible memory access ordering.  Now, the compiler *does* know not
to optimise moves across library calls, but you're investing too much trust in
them if you think the CPU's internal units know or care whether you're in a
libcall or your mainline code.
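A hypothetical sketch of the alternative: declare the flag volatile and use an explicit full barrier (here GCC's __sync_synchronize() builtin, which emits a real hardware fence), rather than hoping a library call takes long enough for posted writes to drain:

```c
/* The flag is polled by another thread, so it is declared volatile;
   the compiler must not cache it in a register. */
volatile int flag = 0;
int shared_data = 0;

void publish(int v)
{
    shared_data = v;        /* fill in the data first... */
    __sync_synchronize();   /* full compiler + CPU memory barrier */
    flag = 1;               /* ...then signal that it is ready */
}
```

Unlike a no-op cross-library call, the barrier's guarantee holds whatever the surrounding code happens to be, because it constrains the CPU's internal units as well as the compiler.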


    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....




