discuss-gnustep

Re: ProjectCenter Editor parenthesis highlighting segfault


From: Ivan Vučica
Subject: Re: ProjectCenter Editor parenthesis highlighting segfault
Date: Wed, 27 Aug 2014 18:36:50 +0100

On Wed, Aug 27, 2014 at 6:11 PM, Wolfgang Lux <wolfgang.lux@gmail.com> wrote:

>> I expected that i could never get greater than 1 in the body of the loop as the boolean expression
>> must return false when i reaches 2. I ran this several times and i always reaches 3 before the segfault
>> occurs.
>
> That looks like a serious bug in your compiler then. Your reasoning is right, the body should not be entered with any value of i greater than 1, and if it is, the compiler is generating incorrect code.

Not quite a bug in the compiler; rather, buggy code. This is the expected behavior of a modern compiler. How come?

tl;dr: Violations of the C standard, such as a very obvious out-of-bounds access, permit the compiler to make nasty, counterintuitive assumptions.

I recently read [1], which is very interesting, as are the various links in that article. If you're writing C code (and most people on this list are), you'd do yourself a service to read [1]. While [1] is the one directly relevant here, [2] and [3] are also worth reading.

[1] makes me want to avoid false brevity in code wherever it is inappropriate.

Quoting Raymond Chen and adding bold and underline:

Consider the following function:
int table[4];
bool exists_in_table(int v)
{
    for (int i = 0; i <= 4; i++) {
        if (table[i] == v) return true;
    }
    return false;
}

 
What does this have to do with time travel, you ask? Hang on, impatient one.
 
First of all, you might notice the off-by-one error in the loop control. The result is that the function reads one past the end of the table array before giving up. A classical compiler wouldn't particularly care. It would just generate the code to read the out-of-bounds array element (despite the fact that doing so is a violation of the language rules), and it would return true if the memory one past the end of the array happened to match. 
 
A post-classical compiler, on the other hand, might perform the following analysis:
 
- The first four times through the loop, the function might return true.
- When i is 4, the code performs undefined behavior. Since undefined behavior lets me do anything I want, I can totally ignore that case and proceed on the assumption that i is never 4. (If the assumption is violated, then something unpredictable happens, but that's okay, because undefined behavior grants me permission to be unpredictable.) 
- The case where i is 5 never occurs, because in order to get there, I first have to get through the case where i is 4, which I have already assumed cannot happen.
- Therefore, all legal code paths return true.
 
As a result, a post-classical compiler can optimize the function to
bool exists_in_table(int v)
{
    return true;
}
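
For comparison, here is a minimal sketch of the same function with the off-by-one removed. It is not from the quoted article; TABLE_LEN is a name I made up for illustration, and using size_t for the index is just one reasonable choice. With the index kept inside the array, there is no undefined behavior for the compiler to reason from, so it has no license to fold the function into "return true".

```c
#include <stdbool.h>
#include <stddef.h>

#define TABLE_LEN 4   /* hypothetical name; the quoted code hard-codes 4 */

int table[TABLE_LEN];

/* Same search as the quoted function, but the index stays in
   0..TABLE_LEN-1, so no iteration reads past the end of the array. */
bool exists_in_table(int v)
{
    for (size_t i = 0; i < TABLE_LEN; i++) {
        if (table[i] == v) return true;
    }
    return false;
}
```

Deriving the loop bound from the same constant that sizes the array keeps the two from drifting apart when someone later changes the array size.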

As a side remark: the code currently looks generally unappealing, and so does the code in -highlightCharacterAt:. If someone feels like updating it, why not refactor this misuse of for-loops a bit: use two variables instead of one array, and add appropriate helper functions that perform the actual highlight/unhighlight? If properly named, these two variables would also make it clearer that they mark the locations of the opening and the closing parenthesis of a block, each of which is supposed to be currently and individually highlighted. highlitOpeningParenthesisLocation and highlitClosingParenthesisLocation sound appropriate.
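
To make the suggestion concrete, here is a rough sketch in plain C rather than Objective-C. It is not based on the actual ProjectCenter source: the struct, the NO_LOCATION sentinel, the helper names, and the marks array standing in for the real text attributes are all hypothetical. Only the two variable names come from the paragraph above.

```c
#include <stdbool.h>
#include <stddef.h>

#define NO_LOCATION ((size_t)-1)   /* hypothetical sentinel: nothing highlighted */

/* Two named locations instead of a two-element array walked by a for-loop. */
typedef struct {
    size_t highlitOpeningParenthesisLocation;
    size_t highlitClosingParenthesisLocation;
} ParenHighlightState;             /* hypothetical type */

/* Stand-in for the real drawing code: marks[location] records whether the
   character at that location is highlighted. Out-of-range or unset
   locations are ignored, so no access can run past the buffer. */
static void set_highlight(bool *marks, size_t length, size_t location, bool on)
{
    if (location != NO_LOCATION && location < length) {
        marks[location] = on;
    }
}

/* Removes whatever is currently highlighted and forgets both locations. */
void unhighlight_parentheses(ParenHighlightState *s, bool *marks, size_t length)
{
    set_highlight(marks, length, s->highlitOpeningParenthesisLocation, false);
    set_highlight(marks, length, s->highlitClosingParenthesisLocation, false);
    s->highlitOpeningParenthesisLocation = NO_LOCATION;
    s->highlitClosingParenthesisLocation = NO_LOCATION;
}

/* Clears the previous pair, then highlights and remembers the new pair. */
void highlight_parentheses(ParenHighlightState *s, bool *marks, size_t length,
                           size_t opening, size_t closing)
{
    unhighlight_parentheses(s, marks, length);
    set_highlight(marks, length, opening, true);
    set_highlight(marks, length, closing, true);
    s->highlitOpeningParenthesisLocation = opening;
    s->highlitClosingParenthesisLocation = closing;
}
```

With no loop and no array index, there is no loop bound to get wrong, and each call site says which of the two parentheses it is touching.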

Readability is as important as brevity.

[1]: http://blogs.msdn.com/b/oldnewthing/archive/2014/06/27/10537746.aspx
[2]: http://blog.regehr.org/archives/759
[3]: http://blog.regehr.org/archives/767
