Re: Part 2: System Structure


From: Bas Wijnen
Subject: Re: Part 2: System Structure
Date: Thu, 18 May 2006 13:24:01 +0200
User-agent: Mutt/1.5.11+cvs20060403

On Wed, May 17, 2006 at 11:46:21AM -0400, Jonathan S. Shapiro wrote:
> On Wed, 2006-05-17 at 17:20 +0200, Bas Wijnen wrote:
> 
> > Well, perhaps it is all much easier if the program cooperates in being
> > debugged.  But with a transparent space bank, it is at least possible to
> > do it with a program that doesn't cooperate.  If I understand you
> > correctly, this is not the case with a malicious program which was
> > originally set up to be debuggable.
> 
> It is not. The authority to *read* a program image does not in any way
> convey the authority to *write* the program image (e.g. to insert BPT),
> nor to halt the process, nor to alter the process registers.

The question was: suppose I download a binary from the internet and want to
examine it.  I put it in a constructor with some glue code to set it up for
debugging, and I start the program.  First the glue code hands me some
capabilities; then the untrusted code starts running.  Is it possible for the
untrusted code to revoke my right to debug it?  I think this would at least be
possible by building a new constructor, placing a bunch of code in it, and
running it.  Of course there are workarounds (after all, I have the code and
can disassemble and change it), but I would prefer a system where this is not
possible at all.
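
To make the property I'm relying on concrete, here is a toy sketch (plain
Python; every name in it is invented for illustration, this is not any
existing Hurd or EROS interface) of what I mean by a transparent space bank:
anything built from storage derived from my bank stays visible to me, no
matter how the untrusted code re-packages itself.

    # Toy model of a transparent space bank.  All names are made up.
    class SpaceBank:
        def __init__(self, owner, parent=None):
            self.owner = owner
            self.parent = parent

        def sub_bank(self, new_owner):
            # The untrusted code may derive sub-banks and hand them to
            # anyone, but the derivation chain is remembered.
            return SpaceBank(new_owner, parent=self)

        def transparent_to(self, principal):
            # Whoever owns a bank on the derivation chain can look in.
            bank = self
            while bank is not None:
                if bank.owner == principal:
                    return True
                bank = bank.parent
            return False

    class Process:
        def __init__(self, image, bank):
            self.image = image      # program text and data
            self.bank = bank        # where its memory comes from

        def rewrap_itself(self):
            # The untrusted code builds a new constructor and re-runs
            # itself, but it can only pay for the new instance out of
            # storage derived from the bank it was given.
            return Process(self.image, self.bank.sub_bank("the-binary"))

    me = "bas"
    child = Process("downloaded-binary", SpaceBank(owner=me))
    rewrapped = child.rewrap_itself()
    assert rewrapped.bank.transparent_to(me)   # I can still debug it

If the system guarantees something like transparent_to() for everything paid
out of my bank, the answer to my question is "no", which is what I want.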

> > Ehm, ok.  Now I don't understand you at all anymore.  I thought the problem
> > of dynamic libraries was that the code can be different when it's run than
> > when it was tested.  That isn't the case for static libraries.  But now
> > the problem is that the code is prepared at all?
> 
> We are definitely talking past each other.
> 
> The problem is that you test the code in one environment, and execute it
> in another. If critical code is isolated in a separate address space,
> then the *only* way it can be invoked is through its intended interface.
> Stray pointers and/or changes to global variables that are intended to
> be private to the implementation cannot happen.

Sure.  But does this mean every piece of critical code should be in its own
address space?  This is about recovery boundaries, I suppose.  What I'm saying
is that the parent and the child fail as a unit (this is unidirectional: if
the parent fails, the child fails with it, but not the other way around).
Putting the startup code for the child in its own address space will only
help the child survive when the parent blows up.  But in that case the child
doesn't have to survive, because it has become useless.

> A direct consequence is that no isolation of faults exists in principle.
> If exec() screws up, you have no idea whether the problem lies in exec()
> or in some weird behavior of the program.

When programming with untrusted helpers, the idea is of course that the parent
is perfect.  That includes exec().  So if the child is doing weird things, we
assume that the problem is with the child.  Errors in exec() will seem to be
errors in the child in some cases.  I agree that this class of bugs may be
hard to find.  I disagree that we should "protect" ourselves from it.  This
class of bugs is still possible with a shielded exec(), and such bugs will
still be hard to find.

> So my argument is: if you want things to be robust, a key step is to
> isolate them from the errors of other parts of the system.

The parent is isolated from errors in the child.  The child doesn't need to be
isolated from errors in the parent, because it is useless without the parent
(so if the parent fails, the child should die).

> > > Your statement that "preparing" is sufficient is completely contrary to
> > > the entire history of system building experience since the beginning of
> > > computer science. The plain fact is that stray pointers exist, and there
> > > is absolutely no amount of preparation that can protect you from this
> > > error if you are in a single address space.
> > 
> > Ah, so you say the problem is that a stray pointer of the parent could
> > mess up the child?
> 
> This isn't a parent/child thing. You started this discussion by saying
> that you were planning to put the constructor functionality into a
> library. The concern here is the complete failure of isolation between
> the application code and the library code/data.

There are lots of things that share an address space even though things would
be more robust (if impractically so) when they didn't.  For things which fail
as a unit, this is not a problem.  If xmms crashes, there's no point at all in
protecting its plugin.  However, if the plugin crashes, it is useful to
protect xmms.  This is exactly how things work in my proposal.

> > If the parent [or any application] has a stray pointer, then I'd say it's
> > failing.
> 
> Of course it is failing. That isn't the question. The question is:
> 
>   You have a million line application with a stray pointer error
>   somewhere that manifests only when the moon is full, it is raining
>   in shanghai, and you tap the power cord with your left foot at just
>   the right frequency.
> 
>   How the @)_&% do you find it?

By debugging it when you see it happen.  Which means the memory must have
been transparent (for reading *and* writing) to you to begin with. ;-)

However, this doesn't answer your question of course. :-)  I'm saying that the
stray pointer problem doesn't go away when you isolate exec() into its own
address space.  It will still be a problem, and it will still need to be
solved.

> >   It's just as likely that it messes up itself, or gets a segmentation
> >   fault.
> 
> The empirical evidence is that this really isn't true. A very large
> number of stray pointers result from uninitialized data, and there is a
> substantial likelihood that the old value at that location is a pointer
> to something valid. In consequence, many stray pointer errors do NOT
> result in SEGV.

Ok, sorry.  It's much more likely that it messes up itself and/or gets a
segmentation fault.  As you say, stray pointers usually point to data which
was in use before.  The program has no reason to point its pointers at the
library code, so that will happen less often than average.

> So I am very suspicious of arguments of the form "X is just as likely as
> Y" without hard data and a good causal story for why the outcomes are
> indeed just as likely.

You are correct, of course.  But there's really no reason that stray pointers
would be pointing at library code/data.  You are in fact solving a problem
that is too rare, IMO.

> > If program A wants to spawn an instance of program B, it invokes a
> > capability to the B-constructor.  The B-constructor then builds B.  When
> > all this has happened, A and B are both children of their own constructor,
> > which are children of the meta-constructor.  In particular, B is not a
> > child of A.  So B has successfully guarded its runtime memory image from A
> > (which with these definitions isn't the parent, but I think it would be in
> > the situation you were envisioning).
> 
> No, because B was built using storage provided by A, so B is transparent
> to A, so B cannot guard its content against A.

No, it wasn't.  And so it can.  This does indeed limit some things: if you
want your program to be opaque to its callers, then you need to pay for the
storage, and that limits your ability to do certain things.  The question is
whether any important things are left out.  I'm still not sure about this.  I
do know that if there are, then I still want transparent storage (at least to
the user, possibly not to the parent process) to be the default.
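
To spell out the pattern I have in mind (again an invented sketch, not a real
constructor interface): transparency follows whoever supplied the storage, so
opacity towards the caller is something the program's side has to pay for.

    # Invented names; a compressed illustration of the instantiation
    # choice discussed above, not a real constructor interface.
    class Instance:
        def __init__(self, image, storage_owner):
            self.image = image
            self.storage_owner = storage_owner

        def readable_by(self, principal):
            # Transparency follows the storage, not the invoker.
            return principal == self.storage_owner

    class Constructor:
        def __init__(self, image, paid_storage_owner=None):
            self.image = image
            # If the program's publisher paid for storage, instances
            # are opaque to whoever merely invokes the constructor.
            self.paid_storage_owner = paid_storage_owner

        def instantiate(self, requester):
            owner = self.paid_storage_owner or requester
            return Instance(self.image, owner)

    # Default in my proposal: B is built on the requesting user's
    # storage, so that user can still inspect (and debug) it.
    b = Constructor("B-image").instantiate(requester="A")
    assert b.readable_by("A")

    # Opaque case: whoever ships B paid for its storage, so A cannot
    # look inside -- but that opacity had to be paid for.
    b2 = Constructor("B-image",
                     paid_storage_owner="B-publisher").instantiate("A")
    assert not b2.readable_by("A")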

> > > > > If opaqueness is the default, and you leave out the "make it
> > > > > visible" option, your program tends to break very quickly and you
> > > > > notice it.
> > > > 
> > > > Not at all.  In my use case above, it took 3 years (which obviously
> > > > was an arbitrary "long time").
> > > 
> > > If you really managed to go 3 years without noticing that you weren't
> > > transparent, then you didn't need to be transparent.
> > 
> > So how do you want me to debug this program?  By just looking at the code,
> > trying to find a place which might be corrupted over time?  That just
> > isn't going to work.  Agreed, with a debugger it might not work either.
> > But chances are much better.
> 
> Two answers:
> 
> In production, if the program must guard its content, the whole point
> is that you MUST NOT be able to debug it without special privilege.

It's not the program that guards.  It's the user.  And he shouldn't guard it
from himself, but from other programs.  It is *very useful* to be able to
debug programs.  It is just important that other programs (or users) cannot do
it.  I think there is no problem if the parent process can, but that's an
implementation detail.

> In development, or for programs that are not sensitive, you add a bit of
> protocol to the program:
> 
>       object->GetDebuggingCapability()
> 
> or you modify the constructor protocol to return this in the first
> place.

As I pointed out, Jack is not doing development.  He's just running my
program, and after 3 years I suddenly want to be able to debug it.  That
should be possible (with Jack's consent, of course).

> > > > We can let the user provide any storage we need, except for parts
> > > > which must not be disclosed to them.  Obviously a buffer for an IPC
> > > > transfer may be disclosed to the user.
> > > 
> > > No, you cannot. Any storage supplied by the user is destroyable by the
> > > user, and the service must then guard itself against the possibility of
> > > memory faults. From a strictly pragmatic perspective this is not really
> > > feasible in the general case. There are a few restricted idioms where
> > > this is survivable, but they are *very* restricted for reasons of
> > > pragmatics.
> > 
> > And how is this different if the user would be able to give out (revocable)
> > opaque memory?
> 
> In the opaque memory case the entire design pattern is different.
> Instead of providing storage to a pre-existing process, the user
> instantiates an entire sub-process using storage provided by the user.
> The user can destroy the storage, but this destroys the entire
> sub-process.
> 
> The problem here isn't the destruction of storage per se. It is the fact
> that the destruction of storage used by the child wasn't "all or
> nothing".

I don't see how this would work in practice.  Say we have a filesystem server
and the user wants to read some file.  You're saying the user will instantiate
the whole filesystem for that?  While possible, things go wrong when writing.
If the operation fails half-way, the disk is corrupt.  If two users both write
to the same file, there are race conditions, possibly corrupting the disk.
With journalling and locking this may be solvable, but I think it's much
easier (and therefore less bug-prone) to have a single filesystem server that
takes requests.

Or is this a very rare example where you do want to protect against partial
memory failure?

> > > I have an enormous number of examples where private storage is
> > > essential. All of them have been rejected here (without, so far as I can
> > > tell, any semblance of serious consideration) on grounds of ideology.
> > 
> > I don't remember it as such, but will look back in the archives.
> 
> If you go back and look at my use cases for confinement, you will find
> that Marcus read all of them with excessive haste, misunderstood them
> completely, and then dismissed them as "not Hurd objectives" without
> (apparently) understanding them.

I found the electronic money and the hospital examples.  As far as I can see,
there was indeed a lot of misunderstanding, and the discussion went so far off
topic that they weren't really answered at all.  I have some thoughts about
them:

Money:  First of all, I didn't read the entire article you referenced
(actually I hardly read it at all), so I'll summarize the important parts that
I expect to be in it.  These may be incorrect, in which case my conclusion
probably doesn't make sense. ;-)

Assumption:
For electronic money, there are three parties involved: a client C, a provider
P, and a bank B.  C wants P to do something, and wants to pay for it.  The
price is an amount A.  B is trusted by both C and P, and holds a protected
purse for each of them.

The thing should work as follows (X->Y: foo means X sends the message foo to
Y):

C->P: I want a service.
P->C: That costs you A.
C->B: Allow P to reserve A after the service has been provided.
C->P: I'll pay after you deliver.  B can tell you I have the money.
P->B: C says he has A.  Please reserve this for me.
B->P: C indeed has A.  This has been reserved.
B->C: A has been reserved for P.

P provides the service.

C->B and P->B: the service has been provided.

B moves the money from the purse of C to the purse of P.

B->C: you have paid.
B->P: C has paid.
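
To check that this sequence does what I claim, here is a toy simulation of it
(plain Python; the class and method names and the numbers are made up, and I
make no claim that this is how the article you referenced does it):

    # Toy simulation of the message sequence above.
    class BankB:
        def __init__(self):
            self.purses = {}     # party -> balance
            self.reserved = {}   # (client, provider) -> reserved amount

        def reserve(self, client, provider, amount):
            # "C says he has A.  Please reserve this for me."
            if self.purses.get(client, 0) < amount:
                return False
            self.purses[client] -= amount
            self.reserved[(client, provider)] = amount
            return True

        def settle(self, client, provider):
            # Both sides report the service as provided; move the money.
            amount = self.reserved.pop((client, provider))
            self.purses[provider] = self.purses.get(provider, 0) + amount

    bank = BankB()
    bank.purses["C"] = 100

    price = 30                            # "That costs you A."
    assert bank.reserve("C", "P", price)  # reservation confirmed to both
    # ... P provides the service ...
    bank.settle("C", "P")                 # B moves A from C's purse to P's
    assert bank.purses == {"C": 70, "P": 30}

Note that the purses and the reservation table live entirely inside B here,
which is exactly the point discussed below: the storage behind the purse is
B's, not C's.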

Ok.  Now the problem you seem to see is that the storage of the purse will be
paid for by B, not C.  Technically, this is correct.  However in practice, I
don't think it is a problem.

First of all, B is a bank.  It provides services for money.  It can easily
charge some money for the service of providing the storage.

Second, while encryption can reduce the problem to a local one, it doesn't
actually move the memory.  The bank will want to keep the purses on its own
hard disks in its own machines, not protected through encryption on the
machines of the customers.  So in the distributed case (which is the only
realistic one, I think), the client simply _can't_ provide the storage (except
by donating a physical hard drive to the bank, but paying money for it is much
easier).

Finally, since this was apparently a revolutionary proposal, I'm likely
missing the point (what I described was pretty old, as far as I know).  I'd
prefer not to have to read the entire article, but please do briefly explain
how it works (or point me to a summary) if this is indeed the case.


The hospital case is very special.  There is a database which runs on system
storage and takes requests from the client.  None of this happens on a
general-purpose computer.  It would be wise to have a dedicated system, but I
agree with you that this is not what will happen.  However, these machines
will also not be used for programming and playing games.  Certain things are
allowed, the programs for those are installed, and that's it.

The problem you have with it is that the client must be the only way to
access the database.  That means the client is in fact the interface provided
by the database.  This in turn means that it must be provided by the system as
a service, not run by the user.

In our proposal, this means that it must run on system storage.  This is
perhaps conceptually wrong, but it is no problem at all in practice.  On these
machines, there is at most one such client running at a time.  Or, in the case
of a multi-user system, at most one per user.  Since user accounts cannot be
created without intervention, this means it runs in constant storage in
practice (where the constant is a function of the number of users ;-) ),
assuming that it doesn't need dynamic buffers, which is doable for the
programmer.

> > I think you're making this bigger than it is.  All we propose is not to
> > implement something that most current systems don't have either.  I don't
> > think we should be forced to implement such a thing.
> 
> I disagree with this technical description.
> 
> In UNIX, it is possible for me to execute a setuid program that I cannot
> debug or inspect.
> 
> So it is apparent that analogous mechanisms exist. This is not as new as
> you want to make it.

However, in UNIX this setuid program runs on durable resources provided by
the system.  The user is not in control of that program.

In the Hurd on Mach, this is implemented by starting the program from the
file system.  This does exactly what I describe, including the fact that it
is debuggable by its parent (the file system).  Also, the instantiator doesn't
pay for the storage (or any other resources, for that matter).

There is in fact one feature in UNIX which does implement this: the ability
to make a program executable, but not readable.  This is not possible for
scripts, for technical reasons.  If it had been used much, a solution would
have been found for that.  The fact is that it is hardly ever used, and when
it is, it is likely a mistake.  No one will miss it.
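
For reference, setting this up is trivial on any POSIX system (shown with
Python's standard os module; the file is a made-up stand-in for an installed
binary):

    import os, stat

    prog = "demo-binary"        # made-up file name
    open(prog, "w").close()     # stand-in for an installed binary

    # mode 0111 (--x--x--x): executable by everyone, readable by no one.
    os.chmod(prog, stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    print(oct(os.stat(prog).st_mode & 0o777))   # -> 0o111

    # A binary with this mode can still be run, because execve() is done
    # by the kernel, which doesn't need the caller to hold read rights.
    # A #! script with the same mode fails: the interpreter runs with
    # the user's credentials and has to read() the script text itself --
    # the "technical reason" mentioned above.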

Thanks,
Bas

-- 
I encourage people to send encrypted e-mail (see http://www.gnupg.org).
If you have problems reading my e-mail, use a better reader.
Please send the central message of e-mails as plain text
   in the message body, not as HTML and definitely not as MS Word.
Please do not use the MS Word format for attachments either.
For more information, see http://129.125.47.90/e-mail.html
