Re: [Gluster-devel] Re; Load balancing ...
From: Krishna Srinivas
Subject: Re: [Gluster-devel] Re; Load balancing ...
Date: Tue, 29 Apr 2008 11:29:49 +0530
We did discuss a journaling translator, but implementation-wise it leads
to a lot of complications:
* A journal has to be maintained, which would require a large amount of
disk space.
* Replaying the journal can cause race conditions (consider two or more
clients writing to the same offset).
A better solution would be to maintain a list of dirty blocks and use it
during self-heal.
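As a rough illustration of the dirty-block idea, a translator could record which blocks of a file were written while a replica was unreachable, so self-heal copies only those blocks. This is a hypothetical sketch; the class, function names, and block size are illustrative and not GlusterFS code:

```python
# Hypothetical sketch of per-file dirty-block tracking for self-heal.
# BLOCK_SIZE and all names here are assumptions, not GlusterFS APIs.

BLOCK_SIZE = 128 * 1024  # assumed granularity of change tracking


class DirtyMap:
    """Records which blocks of a file changed while a replica was down."""

    def __init__(self):
        self.dirty = set()  # block indices with unsynced writes

    def mark_write(self, offset, length):
        # Mark every block the write touches, not just the first one.
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        for block in range(first, last + 1):
            self.dirty.add(block)

    def blocks_to_heal(self):
        """Self-heal would copy only these blocks, not the whole file."""
        return sorted(self.dirty)


m = DirtyMap()
m.mark_write(offset=130000, length=4096)  # spans blocks 0 and 1
print(m.blocks_to_heal())  # → [0, 1]
```

The space cost is a small set of integers per modified file, rather than a full data journal, which is what makes it cheaper than the journaling approach above.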
Krishna
On Tue, Apr 29, 2008 at 6:11 AM, Gareth Bult <address@hidden> wrote:
> Hi,
>
> I must say I find the idea of a journal approach quite appealing, although
> the split brain problem is an issue .. that said AFR volumes already have a
> split-brain problem .. unplugging a network lead between two AFR sub-volumes
> is an easy demonstration of this .. both servers will assume the other is
> down and carry on .. would adding a journal make the issue any worse?
>
> (or am I missing something?)
>
> In terms of a real use-case, I've had lots of cluster issues relating to
> single nodes becoming unavailable for short periods. With the exception of
> "heartbeat" screwing up a DRBD setup (which was an internal software failure,
> rather than anything we would be looking to protect against) I've never
> experienced two nodes becoming isolated and potentially suffering from
> split-brain. (I accept it can/does happen, but I'm thinking it's not an
> everyday occurrence)
>
> So ... a journal would not be a perfect solution, however a very limited
> amount of split-brain protection might be considered a "pretty good" solution
> in-context and it would provide excellent recovery metrics in most cases.
>
> ??
>
> In terms of work, I'm guessing each write operation would need to append an
> additional (serial, path, offset, bytes, data) record to the journal volume ..
> each data volume would need to keep track of its most recent serial, then
> mount would need to check the journal and run playbacks for each sub-volume
> whose serial isn't up to the most recent serial in the journal ...
>
> If all this is done in a journal translator .. it doesn't "sound" too
> onerous or that it would involve changing any other code ... ??
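The (serial, path, offset, bytes, data) journal described above could be sketched roughly as follows. All names are hypothetical and the in-memory structures stand in for what would really be a translator writing to a journal volume:

```python
# Illustrative journal record and replay, following the
# (serial, path, offset, bytes, data) layout suggested above.
# Names and structures are hypothetical, not GlusterFS code.
import collections

JournalEntry = collections.namedtuple(
    "JournalEntry", ["serial", "path", "offset", "data"])

journal = []      # append-only; a real journal would be space-bounded
next_serial = 0


def log_write(path, offset, data):
    """Record each write with a monotonically increasing serial."""
    global next_serial
    next_serial += 1
    journal.append(JournalEntry(next_serial, path, offset, data))


def replay(files, last_seen_serial):
    """Apply entries a recovering subvolume has not yet seen."""
    for e in journal:
        if e.serial > last_seen_serial:
            buf = files.setdefault(e.path, bytearray())
            end = e.offset + len(e.data)
            if len(buf) < end:
                buf.extend(b"\0" * (end - len(buf)))
            buf[e.offset:end] = e.data
    return next_serial  # the replica is now current up to this serial


log_write("/a", 0, b"hello")
log_write("/a", 5, b" world")
replica = {}
replay(replica, last_seen_serial=0)
print(bytes(replica["/a"]))  # → b'hello world'
```

The race Krishna raises is visible here: with two clients logging writes to the same offset, replay order decides which write wins, and nothing in the journal alone can arbitrate that.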
>
> Gareth.
>
>
>
> ----- Original Message -----
> From: "Gordan Bobic" <address@hidden>
>
> To: "gluster-devel" <address@hidden>
> Sent: Monday, April 28, 2008 7:56:16 PM GMT
> Subject: Re: [Gluster-devel] Re; Load balancing ...
>
>
>
> Martin Fick wrote:
>
> > May I suggest an alternate approach? The rsync model
> > seems like a nice one when you have no idea what the
> > changes are, but with the glusterfs AFR it is possible
> > to keep track of the changes. What about adding a
> > journaling volume option to the AFR translator?
>
> Sounds like you are effectively describing an extent-based volume, very
> similar to what DRBD does to limit the amount of sync required.
>
> > So if changes cannot be written to Sub B they would
> > be recorded in Journal A. When B comes back up and
> > AFR notices a mismatch between a file on Sub A and Sub
> > B and would normally query Sub A for the file
> > contents, it could query Journal A first to see if the
> > changes to the file are stored there. If so, Journal
> > A could reply with just the changes instead of the
> > whole file and AFR can then apply the changes to Sub
> > B.
>
> Split-brain handling of this would be impossible, and one version would
> always have to win. But other than that, I can see that this would work.
>
> > The journal volume would not actually be required and
> > would be space limited, it would simply drop changes
> > that it can no longer keep track of. If the journal
> > does not have the change logged, everything would
> > proceed as it does today, the subvolume would be
> > queried for the whole file. This would be a little
> like the DRBD model, but more in line with the gluster
> > way of doing things. It would be better than what
> > DRBD does since it would be more granular. When space
> > for changes runs out, whole files might have to be
> synced, but not necessarily the whole filesystem!
>
> I think having an rsync type syncing algorithm that can operate on the
> whole file would be more flexible and potentially provide enough of an
> improvement to make the complication of adding journals/extents not
> worthwhile.
>
> > I realize that this a major enhancement, and would be
> > a lot of work, but then again, so probably would the
> > rsync model implementation, would it not?
>
> I haven't looked at the GlusterFS code (yet), but I would imagine that
> implementing rsync-like file sync would be _much_ less work than
> implementing extents/journals/undo logs.
>
> > The
> > advantage here is that consistency would be assured.
>
> That is arguably fairly academic. Just use a rolling hash for rsync that
> is big enough that the probability of a hash collision on a block is
> around the same as the probability of a media error.
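For reference, the weak rolling checksum rsync uses works roughly as below (an Adler-32-style sketch; real rsync pairs this cheap rolling hash with a strong hash over candidate blocks to rule out the collisions Gordan mentions):

```python
# Sketch of an rsync-style weak rolling checksum (Adler-32 variant).
# The point of "rolling" is that sliding the window one byte costs O(1),
# so every offset of a file can be scanned cheaply for matching blocks.

MOD = 1 << 16


def weak_checksum(block):
    """Compute the two 16-bit components over a whole block."""
    a = sum(block) % MOD
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % MOD
    return (b << 16) | a


def roll(a, b, out_byte, in_byte, block_len):
    """Slide the window one byte without rescanning the block."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - block_len * out_byte + a) % MOD
    return a, b


s1 = weak_checksum(b"abcd")
a, b = s1 & 0xFFFF, s1 >> 16
a2, b2 = roll(a, b, ord("a"), ord("e"), 4)
print(((b2 << 16) | a2) == weak_checksum(b"bcde"))  # → True
```

Because the weak hash is only 32 bits, a strong checksum (MD4 in classic rsync) is recomputed for any block the weak hash matches, which is what keeps the false-match probability down at the level Gordan describes.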
>
> > The tradeoff between the journal and the rsync model
> > is one of disk space for the journal versus CPU time
> > for the rsync model. Certainly both could be
> > implemented, the journal could be queried first, and
> > if that fails, use the rsync method!
> > Thoughts?
>
> In the ideal world - yes. In practice, I think that just adding rsync
> capability for partial syncs would give most of the benefits for
> relatively little effort in terms of implementation.
>
> Gordan
>
>
> _______________________________________________
> Gluster-devel mailing list
> address@hidden
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>
>
>