gnu-arch-users


Re: [Gnu-arch-users] Nit


From: Robin Farine
Subject: Re: [Gnu-arch-users] Nit
Date: 22 Oct 2003 23:55:18 +0200
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2

>>>>> "Tom" == Tom Lord <address@hidden> writes:

    Tom> First: I hope it's possible for you and I to separately write
    Tom> libraries using the same error_t without having to get together and
    Tom> negotiate who owns what integers in K..2^N-1.

The idea was that each library uses the whole set. If my app calls a
libgloglo function which returns K+2, it assumes that the well
established convention (*) is respected and that this error is defined
in gloglo/errors.h. Never mind, using unique pointers to whatever well
known error objects is probably better.

    Tom> Second: I don't buy this "the caller knows the library the failing
    Tom> function belongs to."   The caller might have been calling through an
    Tom> arbitrary function pointer, for example.

(*) Convention: any callback function passed to a library must return
    common codes or codes from that library. If the library needs to
    pass the callback to another library, wrap the callback in an
    error code translator or rethink your design. 

No, just kidding. I won't fight for the virtues of integer error
codes.

    Tom> With the current error_t proposal, the caller can:

    Tom>        1) Handle a set of error_t values it knows about, using == 
    Tom>           to recognize when one of those comes up.

    Tom>        2) Handle all other error_t values in a single way.

I think that it is a good starting point. If the error API does not
commit the type error_t to a string pointer, but defines a small set
of functions to manipulate errors as a quasi opaque type, then it
keeps the door open for further extensions when a concrete case arises
that justifies it.

Some preprocessor tricks would probably be needed to define error
instances while keeping source code compatibility. Something like
DEFINE_ERROR(<name>, <string>), which would initially just expand to

        error_struct <name>_struct = { .text = <string> };
        error_t <name> = &<name>_struct;
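
A minimal sketch of such a macro in C99, under a few assumptions of
mine: the structure has only the .text field, the handle type is
spelled err_t here so the sketch cannot clash with the error_t that
some C libraries declare in <errno.h>, and err_text() is a hypothetical
accessor, not an existing hackerlab function.

```c
#include <stddef.h>

/* Hypothetical layout: only the .text field is named in this thread. */
struct err_struct {
    const char *text;
};

/* Quasi-opaque handle: clients compare handles with == and otherwise
   go through accessor functions. */
typedef const struct err_struct *err_t;

/* Defines one error object and a unique pointer to it. */
#define DEFINE_ERROR(name, string) \
    const struct err_struct name##_struct = { .text = (string) }; \
    const err_t name = &name##_struct

DEFINE_ERROR(contains_cycles, "graph contains cycles");
DEFINE_ERROR(out_of_memory, "out of memory");

/* Accessor, so that callers never touch the structure directly. */
static const char *err_text(err_t e)
{
    return e != NULL ? e->text : "no error";
}
```

Because callers only see the handle and the accessor, fields can be
added to the structure later without touching client source code.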

    Tom> If so, what's the right structure there? Is it a
    Tom> single-inheritance class hierarchy? A multiple-inheritance
    Tom> class hierarchy? Something entirely different from either?

Multiple inheritance sounds like overkill. Assuming that error_t is
quasi opaque, at least single inheritance can be added later by adding
a new 'parent' field to the error structure, a new
DEFINE_SUB_ERROR(<name>, <string>, <parent>) macro and a function that
tests for the child<->parent relationship.
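
To make this concrete, here is a rough sketch of the extended
structure. The type name err_t (chosen to avoid clashing with any
error_t from <errno.h>), the function err_is_a() and the example error
names are all assumptions for illustration. The sub-error macro takes
the address of the parent's structure rather than the parent handle,
because a const pointer variable is not a constant expression in a C
static initializer.

```c
#include <stddef.h>

struct err_struct {
    const char *text;
    const struct err_struct *parent;   /* NULL for a root error */
};

typedef const struct err_struct *err_t;

#define DEFINE_ERROR(name, string) \
    const struct err_struct name##_struct = \
        { .text = (string), .parent = NULL }; \
    const err_t name = &name##_struct

#define DEFINE_SUB_ERROR(name, string, parent_name) \
    const struct err_struct name##_struct = \
        { .text = (string), .parent = &parent_name##_struct }; \
    const err_t name = &name##_struct

DEFINE_ERROR(io_error, "I/O error");
DEFINE_SUB_ERROR(file_not_found, "file not found", io_error);
DEFINE_ERROR(contains_cycles, "graph contains cycles");

/* Returns 1 if e is ancestor itself or one of its descendants. */
static int err_is_a(err_t e, err_t ancestor)
{
    for (; e != NULL; e = e->parent)
        if (e == ancestor)
            return 1;
    return 0;
}
```

Code that only compares handles with == keeps working unchanged; code
that wants to catch a whole family would call err_is_a(err, io_error).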

    Tom> I don't have any really compelling, only fairly contrived ideas of why
    Tom> I'd want some "bigger structure" for errors.  On the other hand, some
    Tom> "bigger structure" is a very popular idea -- so, am I missing
    Tom> something?

Difficult to tell at this level of abstraction.

    >> The idea is that whenever an error occurs and a message is logged, the
    >> logging function receives a reference to the status code and, iff the
    >> correlation number is 0, it assigns the next available correlation
    >> number to the status code. Then, it logs the message along with the
    >> correlation number.

    Tom> That's interesting but the idea of a "correlation code" is really
    Tom> specific to only some applications, right?

No, it just relies on support provided by the error type and the
log_error() function. If hackerlab defines both, then any client can
benefit from it. It permits a log analyzer tool -- on the logging
server for instance -- to reorder and group logged messages using the
timestamps and the correlation values.

In cases where a call at the application level involving deeply nested
calls fails, if the top level function only produces an error message,
say "configuration not found", the user can hardly know what really
caused it. However, if some of the nested calls also log an error
message, particularly the most nested one, these messages can be
grouped together using the timestamps and correlation value to provide
the user with a kind of user friendly call trace. He would see
something like "cannot open ~/.gnafu" or "too many descriptors"
preceding the "configuration not found" message.

Thus, this does not help the runtime to recover but it does help users
to fix a lot of problems by themselves without diving into the source
code or posting to the mailing-list. In some way, it helps *you*.
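
A small sketch of the mechanism, assuming a status object that carries
the correlation number and a log_error() that writes to stderr. The
struct layout, the counter and the log line format are illustrative
assumptions, and the global counter is not thread-safe as written.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical status object carried back up the call chain. */
struct status {
    const char *text;
    unsigned long correlation;   /* 0 means "not assigned yet" */
};

static unsigned long next_correlation = 1;

/* Assigns a correlation number on first use, then logs the message
   with a timestamp and that number. Returns the number used. */
static unsigned long log_error(struct status *st, const char *message)
{
    if (st->correlation == 0)
        st->correlation = next_correlation++;

    fprintf(stderr, "%ld [corr %lu] %s\n",
            (long)time(NULL), st->correlation, message);
    return st->correlation;
}
```

If the innermost function logs "cannot open ~/.gnafu" and the top level
later logs "configuration not found" through the same status object,
both lines carry the same number and an analyzer can group them.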

    >> If bar() is a generic usage library function then (1) it should not
    >> decide by itself to just abort in presence of an error and (2) it
    >> should not handle and hide most of the errors (cf. "screen saver" vs
    >> "nuclear power plant" or "caching memory allocator" examples). 

    Tom> I have to disagree with both points in the general case, though not
    Tom> for specific libraries.

Yes, I agree that (2) does not apply to the general case; whether a
function handles most, some or no errors at all depends on each
concrete case.

    Tom> If bar gets some error that indicates its private state has been
    Tom> corrupted -- it is forced to break promises to its callers -- then an
    Tom> orderly abort is the only option left.

Not necessarily. I would prefer it to let the callers do what they
need to do during call backtracking, which might be easier than having
a SIGABRT handler that takes care of everything. The application code
is then free to choose between an abort(), exit(), exec() or even a
system reset.

I liked the idea that the caller decides that the callee is allowed to
call abort() by passing a null pointer as error handle. Also, if bar()
is critical, it can ensure that it is not called again once its state
has been corrupted, using preconditions/invariants.
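
As a sketch of that convention: the signature of bar(), the err_t type
(standing in for error_t) and the "negative input corrupts the state"
trigger are all contrived assumptions, there only to exercise the two
paths.

```c
#include <stdio.h>
#include <stdlib.h>

struct err_struct { const char *text; };
typedef const struct err_struct *err_t;

const struct err_struct state_corrupted_struct =
    { "bar: private state corrupted" };
const err_t state_corrupted = &state_corrupted_struct;

static int bar_corrupted = 0;   /* invariant: once set, bar() refuses to run */

/* Returns 0 on success, -1 on failure. A null err means the caller
   allows bar() to abort; otherwise the error is reported through *err. */
static int bar(int input, err_t *err)
{
    if (bar_corrupted || input < 0) {   /* contrived corruption trigger */
        bar_corrupted = 1;
        if (err == NULL) {
            fprintf(stderr, "%s\n", state_corrupted->text);
            abort();
        }
        *err = state_corrupted;
        return -1;
    }
    if (err != NULL)
        *err = NULL;
    return 0;
}
```

A caller that passes &err gets to unwind and then choose between
abort(), exit() or something else; a caller that passes a null pointer
has explicitly delegated that choice to bar().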

    Tom> When you can, don't have `foo1_cleanup', `foo2_cleanup',
    Tom> `baz_cleanup', etc ad nauseam.   That gets too hard to maintain real
    Tom> fast.  

    Tom> Instead, just have `cleanup' which all of those gotos can use.

Sure. As you said, it was just a general illustration which works even
when dealing with aggregate types or with opaque types such as
regex_t.
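
For reference, the single-label version could look like the sketch
below. The function join_copy() is made up for the illustration; the
pattern relies on every resource starting out as NULL and on free(NULL)
being a no-op.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated concatenation of a and b, or NULL on
   allocation failure. The temporaries exist only to give the single
   cleanup block something to release. */
static char *join_copy(const char *a, const char *b)
{
    char *left = NULL, *right = NULL, *result = NULL;
    size_t la = strlen(a), lb = strlen(b);

    left = malloc(la + 1);
    if (left == NULL)
        goto cleanup;
    memcpy(left, a, la + 1);

    right = malloc(lb + 1);
    if (right == NULL)
        goto cleanup;
    memcpy(right, b, lb + 1);

    result = malloc(la + lb + 1);
    if (result == NULL)
        goto cleanup;
    memcpy(result, left, la);
    memcpy(result + la, right, lb + 1);

cleanup:
    /* One exit path for success and for every failure. */
    free(left);
    free(right);
    return result;
}
```

For a type such as regex_t, which has no null value, a separate
"initialized" flag would be needed before calling regfree() in the
cleanup block.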

[some silly stuff of my own]

    Tom> That's silly.    Consider:

Maybe ...

    Tom>        funky_graph_algorithm (digraph_description)
    Tom>         {
    Tom>           forest = topologically_sort (digraph_description);

    Tom>           if (!error)
    Tom>             use_the_fast_funky_algorithm (forest);
    Tom>           else if (error == contains_cycles)
    Tom>             use_the_slow_funky_algorithm (digraph_description);
    Tom>           else
    Tom>             forward_error_to_caller;
    Tom>         }

Ah! I understand now that I mixed up things in your example. The part
that I don't quite agree with is the "is_error_i_forward(err)". I
already mentioned that in a paragraph above.

But just in case, if the caller does not want to deal with errors, it
passes a null error pointer. If it passes a non-null pointer, there
might be a reason for it that the implementor of the callee might not
have predicted.

-- 
rnf



