Re: If QNX is successful, why NOT GNU Microkernels

help-hurd

From:	Marcus Brinkmann
Subject:	Re: If QNX is successful, why NOT GNU Microkernels
Date:	Fri, 23 Jan 2004 10:41:51 +0100
User-agent:	Mutt/1.5.4i
On Thu, Jan 22, 2004 at 05:23:20AM +0100, Olivier Galibert wrote:
> On Thu, Jan 22, 2004 at 01:10:43AM +0100, Marcus Brinkmann wrote:
> > You have to put it into the perspective of OS design history.  The success
> > of Mach is to show that splitting up a kernel in a generic core part and a
> > specific server component (or multiple) is possible.  It's only a first
> > step in a new branch of the still short history of OS design.
> 
> Sure.  It was interesting at the time, no doubt about it.  But it
> stopped evolving 10 years ago, and OS engineering didn't.  And, as you
> very well know, having in 2004 a multiserver OS based on a 1990 mach
> version, and 1990 mach concepts, just doesn't work out.

Well, but we don't stop at Mach, which is the important part of my message.
If all you are saying boils down to "Mach is not the answer", then everybody
agrees and there is no argument.  However, I would much rather talk about
solutions than beat on the obvious problems for the n-th time.  There are
subtle problems that are interesting to discuss as well, but they go way
beyond the "Mach sucks" treatment (and here I am using the word as the same
shortcut as you did :).
 
> > If you want to
> > analyze for example the IPC performance, you will find out that a lot of
> > research was necessary to find out exactly _why_ it sucked and what can be
> > done to fix it (and it _can_ be fixed).
> 
> Is the why anything else than the cost of the minimum of two VM
> switches per exchange with a server plus the small amount of inter-VM
> copying needed?  Which is a large, incompressible cost in the ports
> model.  The marshalling/unmarshalling cost on top is just gravy.  What
> were they smoking?

You are still looking at it from today's point of view.  The achievement was
in the abstractions they made, and today we know that those abstractions, in
the form they took, contributed a huge part of the cost.  The achievement of
groups like the L4 group is to show exactly where the problems lie and what
can be done about them.  This is an ongoing process.

> Fixing it must be fun though, I'll have to look up the proposed
> solutions.  Comparing with the linux syscall speeds will be _tough_.

It is fun indeed.  You will find solutions comparable to Linux syscalls.
However, I want to add that Linux syscalls are calls from user land into
static, trusted system code.  That is specifically not what we need in the
typical Hurd scenario, where we have calls from untrusted user clients to
untrusted user servers, which can change very dynamically.  So Linux
syscalls are fast because they cover a very restricted case.

Still, you will find that you can do fast calls from user land into static
system code, for example with small address spaces, which avoids touching
the page tables.  You will also find other scenarios covered by L4, with
some additional cost, but still very efficient.

> > I don't say that the burden is on
> > you to do this.  You can very well lean back and watch if it happens, and
> > until then you can feel comfortable knowing about the superiority of
> > existing traditional solutions.
> 
> The solutions I consider superior are hardly considered traditional.
> The trend for thread/process unification is reasonably recent, and the
> recognition of the value of a clone-like syscall where you create a
> new scheduler object and cherry-pick what you want to share isn't very
> old either.  Or the futex approach to locking.  Or the O(1)
> scheduling.  Or the full memory cache unification.  They appeared a
> while ago in various oses (BSD, plan9...), but they're understood as
> the Right Way[tm] to do things for max 5 years.

None of these strategies really touches the fundamental traditional design
decisions of these systems.  They are even built around the same underlying
assumptions.

> > However, this is not a contest.  We are not in competition with anyone in
> > terms of performance, security, etc.  There is no prize to win if we are
> > better in any category.
> 
> What you win, or nowadays lose, is volunteer developers.  Imagine
> someone new who wants to play with OSes and trying to find out what
> Hurd should eventually have that the others don't have.  The gnu.org
> Hurd page is pitiful in that aspect, all the features it cites are
> covered by current monolithic kernels.

If you want to help in this area, please write your ideas to
web-hurd@gnu.org.  The web page was written by me a long time ago, when I
was mostly ignorant of microkernel issues.  It is still an accurate
description of what the existing Hurd on Mach code base is about.  It is
difficult to write about an existing code base and a projected code base
without easily causing confusion, and in addition to that I hesitate to
write prominently about something that does not really exist yet.
I don't even have as much time for hacking as I want, so I am surely not
going to spend it on making fancy web sites about what I want to hack.

> The Hurd on L4 page boils down
> to "L4 is better than Mach" (well _duh_), and half its "Related item"
> links are dead.

The Hurd on L4 page you probably mean is not something that the people
working on this Hurd-to-L4 port are associated with.  I would like to see it
deleted.  Maybe I should ask its creator, as it seems to cause some confusion.

> If the developer digs deeper, the only thing he'll find is the
> translators, which can be seen as a combination of triggers (soon
> coming as "mount traps" on linux, only missing the persistence at
> filesystem level) and userland filesystems (already existing).  And
> that's it.  Nothing else potentially interesting is ever cited.

They can be seen as such things, but it would be wrong to see them as solely
such.  If you see them as such things you are basically seeing only what you
want to see, and ignore the rest.  Which is an easy mistake to make.

It's actually pretty simple.  You should know that _any_ particular feature
that can be implemented in an operating system A can be implemented in
another operating system B.  We cannot make a random access machine with
limited memory into something other than a random access machine with
limited memory.  In particular, it is pretty easy to implement any limited
feature in a monolithic kernel as an add-on.  One example is the user fs
you mentioned.  Yes, it is a user fs feature, but no, it is not something
you will see going mainstream and being available to all users on all Linux
installations.  So, here is my answer: First of all, making such a
comparison is misleading.  Second, this is not about "we have feature X that
nobody else has" (i.e., ours is bigger than yours), but about exploring
fundamental assumptions behind operating system design, and possibly
changing them.

I also want to add that many new OS ideas like user fs are going to be
implemented on Linux first because people working on such things don't have
many alternatives.  Linux is there, it works, and it relieves them from
thinking about anything but the particular issue they want to be working
on.  However, I have talked to people about this, and most of them actually
wished they had an alternative that doesn't impose the monolithic model and
its problems on them (problems like latency or permission issues).

> So, well, if it's only to redo what already exists and with less
> hardware support, he may not be very motivated...

Well, if that is all he sees, then he is far from being able to contribute
at this stage.  We are working on advocacy issues, but it is very difficult,
as this thread shows, and we only have so much time.  Several people,
like Neal, Wolfgang and me, have attended many conferences and spoken to
many people, and with great success.  However, it always takes a great
effort.  Maybe we should hire a PR firm to bring the complex technical
details down to three-word slogans ;)

As a side note:
> Zero-copy to/from userspace is so not dream land that
> it's already supported in linux.

It's not supported by any multiserver microkernel-based OS as far as I am
aware, and there is a reason for that.  I am mentioning this because it is a
good example of why we do not want to compare ourselves to Linux etc. at this
time.  We are aware of the precedent and the standards these systems set as
far as performance (and some other things) are concerned.  We do not want to
convince people of lesser standards (although we might have to make
trade-offs like "if you use this feature here, you will gain X and lose Y").
Rather, we want to reach these standards and go beyond them.  This is a
tricky and delicate path to take, so we must stay focused.  Criticism
like "we don't support the latest sound blaster card" will be silently
ignored :) as it is rather uninteresting for what we do.

Thanks,
Marcus


-- 
`Rhubarb is no Egyptian god.' GNU      http://www.gnu.org    marcus@gnu.org
Marcus Brinkmann              The Hurd http://www.gnu.org/software/hurd/
Marcus.Brinkmann@ruhr-uni-bochum.de
http://www.marcus-brinkmann.de/