gnu-arch-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gnu-arch-users] Nit


From: Tom Lord
Subject: Re: [Gnu-arch-users] Nit
Date: Wed, 22 Oct 2003 11:58:08 -0700 (PDT)


    > From: Robin Farine <address@hidden>

    >     Tom>    One thing _missing_ from this thread, other than the
    >     Tom>    intial chat about GError, is disucssion about the
    >     Tom>    _pattern_matching_ aspect of error handling.  GError
    >     Tom>    offers us domain/code which didn't really strike me
    >     Tom>    as compelling.

    > Since the idea is to check for errors at the call site, I think like
    > Andrew that a simple integer is sufficient. The values can be split
    > in two sets, a range from 0 to K-1 for common errors and the remaining
    > K..2^N-1 for library specific errors. Again, since errors are checked
    > at the call site, the caller knows the library the failing function
    > belongs to and thus can translate library specific codes into text
    > messages (e.g. by calling a per-library function that maps codes to
    > messages).


Whoa...slow down.   Couple problems there.

First: I hope it's possible for you and I to separately write
libraries using the same error_t without having to get together and
negotiate who owns what integers in K..2^N-1.

That's why I think that the address of a string constant works out
better than densely-packed integers:  the linker will give each error
(represented as a string constant) it's own address (in each
executable or process).


Second: I don't by this "the caller knows the library the failing
function belongs to."   The caller might have been calling through an
arbitrary function pointer, for example.

With the current error_t proposal, the caller can:

        1) Handle a set of error_t values it knows about, using == 
          to recognize when one of those comes up.

        2) Handle all other error_t values in a single way.

The pattern matching question I have is should there be a third
option?   For example, I might have two specific errors:

        err_landshark_candygram
        err_conehead_beerdrinking

and the caller doesn't know about either of those specifically (so the
caller can't do an == test).

On the other hand, maybe the caller knows about:

        err_lame_saturday_night_live_references

of which both `landshark_candygram' and `conehead_beerdrinking' are
"instances" in some sense.   Should the caller be able to ask:

        if (is_saturday_night_live_error (&err))
          ...

?

If so, what's the right structure there?  Is it a single-inheritence
class hierarchy?   A multiple-inheritence class hierarchy?   Something
entirely different from either?

I don't have any really compelling, only fairly contrived ideas of why
I'd want some "bigger structure" for errors.  On the other hand, some
"bigger structure" is a very popular idea -- so, am I missing
something?




    > This can be combined with an error logging mechanism to provide the
    > ability to follow a sequence of correlated error messages for
    > diagnostic purpose. The UDI project offers a nice solution: a status
    > code consists of a 32 bits value split in a 16 bits correlation
    > number, one bit to differentiate between common errors and UDI
    > meta-language (library|module|domain|...) specific errors, and a 15
    > bits field for the actual error code.

    > The idea is that whenever an error occurs and a message is logged, the
    > logging function receives a reference to the status code and, iff the
    > correlation number is 0, it assigns the next available correlation
    > number to the status code. Then, it logs the message along with the
    > correlation number. Non-logging code is allowed to change the error
    > code but not the correlation number (except when initializing a newly
    > declared status code variable).

That's interesting but the idea of a "correlation code" is really
specific to only some applications, right?  That's the problem I have
with anything other than == for error codes: that any additional
structure seems to be domain specific; functions that want that
additional structure should return it separately from the error code
or as a parameter to the error.

(I really don't follow your example well though, not knowing much of
anything about UDI.)


    > [...]
    > 
    > >From a previous post:
    > 
    >     Tom>  bar (error_t * caller_err, ...)
    >     Tom>         {
    >     Tom>           error_t err = 0;
    > 
    >     Tom>           foo (&err, ...);
    > 
    >     Tom>           if (is_error_i_handle (err))
    >     Tom>             handle it;
    >     Tom>           else if (is_error_i_forward)
    >     Tom>             forward_error (caller_err, err);
    >     Tom>           else
    >     Tom>             invariant (!err);
    >     Tom>         }
    > 
    >     Tom> (With the understanding that `forward_err', if passed a NULL 
first 
    >     Tom> argument, turns into an abort.)

    > If bar() is a generic usage library function then (1) it should not
    > decide by itself to just abort in presence of an error and (2) it
    > should not handle and hide most of the errors (cf. "screen saver" vs
    > "nuclear power plant" or "caching memory allocator" examples). 

I have to disagree with both points in the general case, though not
for specific libraries.

If bar gets some error that indicates its private state has been
corrupted -- it is forced to break promises to its callers -- then an
orderly abort is the only option left.  The "else invariant ()" clause
there indicates that `bar' got an error that violates the supposed
contracts it had with `foo'.   `bar' can no longer be certain that its
internal state has been corrupted.   The `abort' is often the right
thing.   An alternative approach would be:

        bar (...)
        {
          if (is_error_i_handle (...))
            ...
          else if (is_error_i_forward (...))
            ...
          else
            forward_error (caller_err, 
                           wrap_as_fatal_error (err));
        }
        
That's often just as good.   But after I return a fatal error, my
caller better not call back into `bar'.

For _some_ (many, even) instances of bar, `is_error_i_forward (err)'
is defined to always return 1 -- the abort will never happen.  With my
`bar' pseudo-code, I was just trying to show the general case, from
which many simpler specific cases are derived (by things like
dead-code elimination).

    > So this basically leaves us with:

In a subset of possible `bar's:

    >   bar (error_t * caller_err, ...)
    >         {
    >           error_t err = 0;
    > 
    >           foo (&err, ...);
    > 
    >           if (caller_err)
    >             forward_error (caller_err, err);
    >           else
    >             abort();
    >         }


You know, `bar' is sometimes going to want to handle the darn error
itself.  But, anyway:


    > But foo() also respects the error protocol so bar() can just pass
    > caller_err to foo(), 

That's a special case.   It's the special case where
`is_error_i_handle' always returns 0 and `is_error_i_forward' always
returns 1.   And, yes, it's a nice syntactic property of the error_t 
protocol that in that case I can just write:

        bar (error_t * caller_err, ...)
        {
          foo (caller_err, ...);
        }

I do that all the time in simple code.



    > and so on. Such functions only needs to check
    > error returned from nested calls in order to cleanup locally allocated
    > resource and propagate the error upwards. A generic usage pattern
    > would look like
    > 
    >         bool error(error_t * err) { return err && err->code; }
    > 
    >         bar (error_t * err, ...)
    >         {
    >           ...
    >           foo1 (err);
    >           if (error (err))
    >             goto foo1_cleanup;
    >           foo2 (err);
    >           if (error (err)) {
    >             goto foo2_cleanup;
    >             log_error (err, "libzebu::bar() failed to foo2");
    >           }
    >           ret = baz ();
    >           if (ret == -1) {
    >             if (err) {
    >               set_error (err, BAZERR /* not bizarre! */);
    >               log_error (err, "libzebu::bar(): foobar in baz");
    >               goto baz_cleanup;
    >             } else
    >               abort();
    >           }
    >           ...
    >           goto success;
    >         baz_cleanup:
    >           ...
    >         foo2_cleanup:
    >           ...
    >         success:
    >           return ...;
    >         }


I used to write a lot of code like that.  Nowadays I only write a only
a little bit of code like that.  Lemme let you in on the trick
(although perhaps you're just showing the "general case" and already
know this trick :-):

When you can, don't have `foo1_cleanup', `foo2_cleanup',
`baz_cleanup', etc ad nauseum.   That gets too hard to maintain real
fast.  

Instead, just have `cleanup' which all of those gotos can use.

To use the trick, you have to make sure that all of your variables are
initialized and that all of your cleanups are robust no matter what
part of the code they are reached from.

That's why in libarch, for example, if a variable can hold a malloced
value, I always initialize it to 0:

        char * foo = 0;

and why the libhackerlab free routines promise to not freak out if
asked to free NULL.


    > I also think that "is_error_i_handle (err)" would not in practice be
    > very useful. You mentioned violation of abstractions barriers in a
    > previous post. For a caller to be able to know whether it is good for
    > him that a callee handles an error or not implies that the caller
    > knows about the internals of nested functions in the call path. This
    > seems unrealistic and to imply a violation of abstraction barriers.

That's silly.    Consider:


        funky_graph_algorithm (digraph_description)
        {
          forest = topolgically_sort (digraph_description);

          if (!error)
            use_the_fast_funky_algorithm (forest);
          else if (error == contains_cycles)
            use_the_slow_funky_algorithm (digraph_description);
          else
            forward_error_to_caller;
        }


(Where it's understood that computing `contains_cycles' doesn't 
 cost any less than `topolgically_sort'.)

-t





reply via email to

[Prev in Thread] Current Thread [Next in Thread]