l4-hurd
Re: Persistence Pros and Cons


From: Martin Schoenbeck
Subject: Re: Persistence Pros and Cons
Date: Tue, 08 Nov 2005 23:24:51 +0100
User-agent: Mozilla Thunderbird 1.0.6 (Windows/20050716)

Hi,

Jonathan S. Shapiro wrote:
Yes. I also wanted to respond to this point.

If a program is hung, you should be able to kill it. Rebooting is kind
of like using a nuclear bomb to open the bedroom door when it is humid
and the wood has swelled very slightly.

This, of course, is not true for critical system programs at the level
of things like the storage allocator. In EROS (and, I suspect, in Eumel;
can you comment, Martin?) these programs are extremely simple and VERY
carefully designed. Fortunately there are very few programs where this
degree of care is required.

That's exactly the point. The storage allocator of EUMEL really was very
simple. I read (and could understand) it as an example of good assembly
code, and I learned Z80 assembly programming from it. The L3 version is
written in CDL2; it is not as simple, mainly due to some optimizations
for the garbage collection process.

<snip>

I just ran my lines of code tool on parts of the linux-2.6.12.4 source
tree. The Linux fs/ext3 directory alone contains 10,447 NCLOCS, and this
code does NOT include argument processing. fs/reiserfs is 18,534 lines.
Hell, guys, the entire EROS *microkernel* is only 21,563 NCLOCS
(excluding kernel debugger).

For comparison, the EROS file server implementation contains 498 NCLOCS
(non-comment, non-blank lines of code). The directory implementation is
another 418 NCLOCS. These counts **include** the IPC message processing.
They do no synchronous I/O, and therefore have no need to be
multithreaded. I would bet that the Eumel file system is not very much
different in its complexity.

I'd have to dig too deep to get at the EUMEL sources, but I can tell you
about L3. I think several readers are not familiar with the memory model
of L3, so I'll try to sketch it.

The basic concept is the dataspace, a container of data, in L3 of
exactly 1GB, which may be sparsely allocated. All permanent data is
stored in dataspaces: user data, kernel data such as TCBs, and the
management data for the dataspaces themselves, disk usage tables etc.

Dataspaces are copied lazily: they are simply write-protected, and any
page written afterwards is assigned a new disk block before the write is
allowed. To write a checkpoint, simply (ok, not _so_ simple) the highest
level (the dataspace of dataspaces) is copied lazily. After writing back
all dirty pages, the old copy is set as a valid checkpoint. This works
because all other data lies beyond that dataspace of dataspaces, and any
write to a page (or page table) causes all lower-level elements to be
copied lazily. The persistence therefore comes with the lazy copying
'for nothing'.

Files as seen by the user are named dataspaces. Dataspaces may be sent
by IPC, which does not copy them, but moves them into the space of the
receiver.
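
To make the lazy copying a bit more concrete, here is a minimal sketch
in Python. It is not the L3 code (that is CDL2) and it is much
simplified: the names BlockStore, Dataspace and lazy_copy are
hypothetical, the "disk" is just a dictionary, and the page-table
hierarchy is flattened to a single level, so the table itself is copied
eagerly here instead of lazily.

```python
class BlockStore:
    """Hypothetical stand-in for the disk: maps block id -> page contents."""

    def __init__(self):
        self.blocks = {}
        self.next_id = 0

    def allocate(self, data):
        # Hand out a fresh block and store the data in it.
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = data
        return block_id


class Dataspace:
    """Sparse container of pages; only written pages occupy a block."""

    def __init__(self, store, page_table=None):
        self.store = store
        # page number -> block id (assumption: one flat table per dataspace)
        self.page_table = dict(page_table or {})
        # Pages whose block is owned exclusively, i.e. not write-protected.
        self.writable = set()

    def lazy_copy(self):
        # Share every block with the copy and write-protect all pages
        # of the original; no page contents are copied here.
        self.writable.clear()
        return Dataspace(self.store, self.page_table)

    def read(self, page):
        block_id = self.page_table.get(page)
        return None if block_id is None else self.store.blocks[block_id]

    def write(self, page, data):
        if page in self.writable:
            # Block is already private: write in place.
            self.store.blocks[self.page_table[page]] = data
        else:
            # Write-protected or unallocated page: assign a new block
            # before the write is allowed.
            self.page_table[page] = self.store.allocate(data)
            self.writable.add(page)


store = BlockStore()
live = Dataspace(store)
live.write(0, b"before checkpoint")
checkpoint = live.lazy_copy()          # cheap: nothing is duplicated yet
live.write(0, b"after checkpoint")     # triggers the copy of this one page
print(checkpoint.read(0), live.read(0))
```

In this sketch the checkpoint keeps seeing the old block while the live
dataspace continues on a new one, which is the property the checkpoint
relies on.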

The code implementing the dataspaces is about 20,000 lines of CDL2, but
I have no way to count NCLOCS. It also contains code to write all used
blocks to secondary storage and to read a complete system back from
secondary storage, plus several dump and debugging facilities. Without
that, I guess there are 5,000 to 8,000 NCLOCS, but that's only a guess.

The code implementing the user's view as named files is about 2,500
lines of ELAN, which tends to have no comments but many lines of code
that work as comments. I guess no more than 500 to 1,000 lines actually
generate code.

Antrik: earlier I was a bit impatient with your assumptions about the
complexity of orthogonal persistence, and I probably should apologize.
Now that you see the actual numbers and some of their consequences, does
the argument in favor of persistence at least seem a little clearer? In
my opinion, if this result was the *only* benefit of persistence, it
would probably be worthwhile.

Martin: how are similar things handled in L3/Eumel, and do you see
similar complexity reduction in these very critical pieces of code?

I hope I could shed some light on it above. But I think, especially
after some maintenance on the dataspace management of L3, that there was
room for improvement. While Jochen used the L3 V3 kernel as a sort of
testbed for several L4 features, I'd have appreciated it if he had also
tried excluding this part from the kernel. As you do, Jochen was
convinced that a design which follows 'good' principles leads to good
solutions nearly automatically. The principle of excluding from the
kernel anything that could be done outside was one of them. After he had
done some research with virtual pagers in L3, he decided to leave the
paging mechanism out of L4 (no, I don't believe him when he writes that
L4 was not designed with persistence in mind ;-)). I know (ok, not in a
scientific sense of the word) that having the dataspace implementation,
and therefore the persistence implementation, in user-level tasks, as
proposed in 'Transparent Orthogonal Checkpointing Through User-Level
Pagers' (Espen Skoglund, Christian Ceelen, and Jochen Liedtke), would
have saved us some headaches, thanks to higher flexibility and less
complexity.

Martin