Re: malloc() patches round 3

From: Thomas Bushnell, BSG
Subject: Re: malloc() patches round 3
Date: 22 Aug 2001 19:10:24 -0700
User-agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/20.7

Igor Khavkine <i_khavki@alcor.concordia.ca> writes:

> If third party programs misbehave when faced with resource shortages
> that's their problem. The important thing to do is to have some sort
> of fixed-resource "way out" implemented by the kernel/servers/native
> utils that can get you out of a sticky situation when things go wrong.
> That's very similar to a statically linked shell for root in case
> the dynamic linker stops working, or the 5% of any partition reserved
> for the super user.

The reason for reserving space for the superuser is *not* for some
kind of safety.  If you want that, you want quotas.  The reason for
that reservation is that the Fast File System algorithms become
markedly inefficient if the disk is over 90% full.  That number was
the subject of careful research.  Ext2 says 95%, but as far as I know,
that number was just pulled out of nowhere.
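
To make the mechanism concrete, here is a minimal sketch of a free-space
reserve check, assuming the reserve is a whole percentage of the total
block count.  The function name and parameters are hypothetical and are
not taken from any real filesystem's source.

```c
#include <stdbool.h>

/* Hypothetical helper: return true if an ordinary (non-root) user may
   still allocate blocks, i.e. if the free space is above the reserve.
   RESERVE_PERCENT would be 10 for the classic FFS figure and 5 for the
   ext2 figure mentioned above. */
bool
ordinary_user_may_allocate (unsigned long long total_blocks,
                            unsigned long long free_blocks,
                            unsigned int reserve_percent)
{
  /* Divide before multiplying so that very large block counts do not
     overflow; the result is rounded down by at most RESERVE_PERCENT
     blocks, which does not matter for a sketch. */
  unsigned long long reserved = total_blocks / 100 * reserve_percent;

  return free_blocks > reserved;
}
```

With 1000 blocks in total and 60 free, this returns true for a 5%
reserve and false for a 10% reserve.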

My systems have a gajillion servers on them.  If any of them
mysteriously stops working, it's a major annoyance, and it's not even
easy to notice that there's a problem.  It's a very common situation
for programs to fail mysteriously when memory is unavailable.

And my *point* here is that this is not a matter of those programs
being sloppy about malloc.  It is *inherent* in the problem that there
is no easy solution to failing gracefully when memory is used up in a
Unix-like system.  The kinds of mechanisms needed to make the system
fail gracefully are global resource management mechanisms.

> My idea is for anything that
> acts in a supporting role (libraries, system calls, servers, etc.) not
> to fail of their own volition unless they know that ABSOLUTELY NOTHING
> else can be done. All other errors should be propagated to the program
> or client that made the library/system/RPC call. This sort of "support
> code" should not take upon itself to handle error conditions that do
> not give users of this "support code" the freedom to handle it the
> way they want.
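
For illustration only, the two policies being contrasted can be
sketched in C roughly as follows.  Both function names are hypothetical
and do not come from the patches under discussion.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Policy A (the one proposed in the quoted text): the support code
   reports the failure and lets its caller decide what to do.
   Returns NULL with errno set to ENOMEM if no memory is available. */
char *
dup_or_fail (const char *s)
{
  size_t len = strlen (s) + 1;
  char *copy = malloc (len);

  if (copy == NULL)
    {
      errno = ENOMEM;
      return NULL;
    }
  memcpy (copy, s, len);
  return copy;
}

/* Policy B: the support code gives up of its own volition, so the
   caller never sees the error at all. */
char *
dup_or_die (const char *s)
{
  char *copy = dup_or_fail (s);

  if (copy == NULL)
    {
      fputs ("memory exhausted\n", stderr);
      abort ();
    }
  return copy;
}
```

A caller of dup_or_fail has to check for NULL at every call site; a
caller of dup_or_die never does, but also never gets the chance to
recover.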

The problem is that I would *rather* have my kernel crash than have
exim or apache start handing out error codes.  You're assuming that
telling the user "memory is full" might fix the problem.  That means
you're assuming that the user even knows which computer's memory is
full!  Not so.

I've had weird exim failures cause my incoming email to get hosed.  It
was a serious problem, and it could have been avoided if the system
had simply rebooted.  It doesn't matter to me whether exim is hosed
because it hands out "memory exhausted" errors to incoming SMTP
connections, or because it responds in some other incomprehensible
way: *BOTH* are just as wrong.
