Re: Journalling filesystems

From: Marco Gerards
Subject: Re: Journalling filesystems
Date: Tue, 15 Jun 2004 23:18:35 +0200
User-agent: Gnus/5.1006 (Gnus v5.10.6) Emacs/21.3 (gnu/linux)

Bas Wijnen <b.wijnen@phys.rug.nl> writes:

> Now what I mean is not ext3.  That is only part of it.  The journal of ext3
> makes sure that the filesystem itself cannot become corrupted (as long as
> there is no hardware failure).  What I mean is a system that makes sure the
> database of an application cannot become corrupted.  This can be done using
> the same method as ext3 uses at the lower level.  It is only important that
> the journal and the data are written in the correct order.  Then the
> application can be sure that a write operation has either completely
> succeeded or completely failed.

Even Linux does not have that yet.  Currently only reiser4 supports
that partially (or will support it).

This is a really important feature for a server OS to have, IMHO.
Having transactions in the filesystem (that is what you are talking
about) is quite important; otherwise every application that uses the
filesystem has to implement them itself.  I guess you understand that
this sucks. :)

It is important to have a single interface for this.  We should have a
look at the work namesys did here.

But this is thinking too far ahead.  Having basic journaling support
now is more realistic (although very optimistic).

> For that, I think some extra communication between the filesystem driver and
> the application is required.  It could be implemented with "sync" calls, but
> it would be very inefficient to use them all the time, because three calls
> would be needed for every operation (one for the journal, one for the data,
> another one for the journal.)  Probably it would be much faster anyway to have
> an external journal on a different disk.  Those are details for now anyway.
> So to summarise what I mean, a program should:
> 1 - tell the filesystem it begins a journalled operation
> 2 - do all kinds of disk access
> 3 - tell the filesystem it is finished with its journalled operation
> 4 - get a reply from the filesystem that it worked

Yes.  That is how transactions work in databases.  You will need more
features, though, such as rollback.  Well, that is what I think...  We
had better look at what the namesys (reiserfs) people did first.
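
To make the four steps (plus rollback) a bit more concrete, here is a
rough sketch in Python.  It is only an illustration under my own
assumptions: the names Transaction, durable_write and replay and the
"journal" file are made up, it works on whole files in one directory,
and it is done in user space, which is exactly what a filesystem
interface would spare the application.  The ordering is the one Bas
describes: journal first, then data, then the journal is retired.

```python
import json
import os


def durable_write(path, data):
    """Write bytes to path and force them to stable storage before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def replay(directory):
    """Apply a committed journal, if one exists.  Safe to repeat after a crash."""
    journal = os.path.join(directory, "journal")
    if not os.path.exists(journal):
        return False
    with open(journal, "rb") as f:
        pending = json.loads(f.read().decode("utf-8"))
    for name, text in pending.items():
        # Second sync: the data itself, only after the journal is on disk.
        durable_write(os.path.join(directory, name), text.encode("utf-8"))
    # Third step: retire the journal.  A real implementation would also
    # fsync the directory here; that is omitted to keep the sketch short.
    os.remove(journal)
    return True


class Transaction:
    """Hypothetical begin/commit/rollback wrapper (not an existing API)."""

    def __init__(self, directory):      # step 1: begin the operation
        self.directory = directory
        self.pending = {}               # file name -> new contents

    def write(self, name, text):        # step 2: buffered, files untouched
        self.pending[name] = text

    def rollback(self):                 # forget everything buffered so far
        self.pending.clear()

    def log(self):
        # First sync: describe the whole operation in the journal.  The
        # rename makes the journal appear all at once, so a crash cannot
        # leave a half-written journal under the real name.
        tmp = os.path.join(self.directory, "journal.tmp")
        durable_write(tmp, json.dumps(self.pending).encode("utf-8"))
        os.replace(tmp, os.path.join(self.directory, "journal"))

    def commit(self):                   # step 3: end; step 4: report success
        self.log()
        replay(self.directory)
        self.pending.clear()
        return True
```

A program would call replay() on the directory at startup: if it
crashed after log() but before the data was written, the journal is
still there and the operation is completed; if it crashed earlier,
only journal.tmp exists and none of the data files were touched.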

