
Re: [Gluster-devel] Proposal to change locking in data-self-heal


From: Xavier Hernandez
Subject: Re: [Gluster-devel] Proposal to change locking in data-self-heal
Date: Wed, 22 May 2013 12:36:48 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130510 Thunderbird/17.0.6

Maybe a different approach could solve some of these problems and improve responsiveness. It's an architectural change, so I'm not sure if this is the right moment to discuss it, but at least it could be considered for the future. There are a lot of details to consider, so do not take this as a full explanation, only as a high-level overview.

The basic change is to implement a server-side healing helper (HH) xlator living just under the lock xlator. Its purpose is not to heal the file itself but to offer functionality that helps client-side xlators heal a file.

When a client wants to heal a file, it will first ask the HH xlator for healing access. If the file is not being healed by another client, the access will be granted. Once a client has exclusive access to heal the file, a full inode lock will only be needed to heal the metadata at the beginning and at the end of the heal process (just as it's currently done). In between, all locks are released and the data recovery can be done without any lock.
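As a very rough illustration (all names below are made up, not existing gluster APIs), the grant could be little more than a per-inode flag that the HH xlator tests and sets atomically:

#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-inode context kept by the HH xlator. */
struct hh_inode_ctx {
        pthread_mutex_t lock;
        bool            heal_in_progress; /* some client owns the heal */
        void           *healer;           /* opaque id of that client  */
};

/* Called when a client asks for healing access. Returns true only if
 * no other client is currently healing this inode. */
static bool
hh_heal_acquire(struct hh_inode_ctx *ctx, void *client)
{
        bool granted = false;

        pthread_mutex_lock(&ctx->lock);
        if (!ctx->heal_in_progress) {
                ctx->heal_in_progress = true;
                ctx->healer = client;
                granted = true;
        }
        pthread_mutex_unlock(&ctx->lock);

        return granted;
}

/* Called when the healing client finishes (or its connection is cleaned
 * up), so that another client may become the healer later. */
static void
hh_heal_release(struct hh_inode_ctx *ctx, void *client)
{
        pthread_mutex_lock(&ctx->lock);
        if (ctx->heal_in_progress && ctx->healer == client) {
                ctx->heal_in_progress = false;
                ctx->healer = NULL;
        }
        pthread_mutex_unlock(&ctx->lock);
}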

To be able to heal data without locks, the HH xlator needs to keep a list of pending segments to heal. Initially there will be a single segment covering the whole file, from offset 0 to the file size (or something else defined by the client). Since the HH xlator is below the lock xlator, it can only receive one normal write and, possibly, one heal write at any moment. Normal writes will always take precedence and the written range will be removed from the pending segments. Any heal write will be filtered by the pending segments: if a heal write tries to modify an area not covered by the pending segments, that area is not updated.
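Just to make the idea more concrete, here is a minimal sketch of that segment bookkeeping (again, all names are hypothetical and error handling is omitted; in a real xlator this would hang off the inode context and be protected against concurrent updates):

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* A pending segment [start, end) that still has to be healed. */
struct hh_seg {
        uint64_t       start;
        uint64_t       end;
        struct hh_seg *next;
};

/* A normal write always wins: its range is removed from the pending
 * list so that a later heal write can never overwrite the new data. */
static void
hh_remove_range(struct hh_seg **list, uint64_t start, uint64_t end)
{
        struct hh_seg **pp = list;

        while (*pp) {
                struct hh_seg *s = *pp;

                if (end <= s->start || start >= s->end) {
                        pp = &s->next;                   /* no overlap */
                } else if (start <= s->start && end >= s->end) {
                        *pp = s->next;                   /* fully covered */
                        free(s);
                } else if (start > s->start && end < s->end) {
                        struct hh_seg *tail = malloc(sizeof(*tail));
                        tail->start = end;               /* split in two */
                        tail->end   = s->end;
                        tail->next  = s->next;
                        s->end      = start;
                        s->next     = tail;
                        pp = &tail->next;
                } else if (start <= s->start) {
                        s->start = end;                  /* trim the head */
                        pp = &s->next;
                } else {
                        s->end = start;                  /* trim the tail */
                        pp = &s->next;
                }
        }
}

/* A heal write is filtered against the list: only the parts that overlap
 * a pending segment are applied, and those parts are then removed with
 * hh_remove_range() so they are not healed twice. */
static bool
hh_overlaps_pending(struct hh_seg *list, uint64_t start, uint64_t end)
{
        for (; list != NULL; list = list->next)
                if (start < list->end && end > list->start)
                        return true;
        return false;
}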

This strategy allows concurrent write operations with healing.

In this situation it's easy to handle a truncate request: the HH xlator intercepts it and updates the pending segments, dropping any segment that starts at or beyond the truncate offset. If this leaves no pending segments, the HH xlator will tell the healing client that the healing is complete.
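For example, using the same hypothetical segment list as in the sketch above (trimming a segment that crosses the new end of file is my own assumption here):

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct hh_seg {                  /* same shape as in the previous sketch */
        uint64_t       start;
        uint64_t       end;      /* one past the last pending byte */
        struct hh_seg *next;
};

/* Drop every pending segment that lies at or beyond the truncate offset
 * and trim any segment that crosses it. Returns true when nothing is
 * left to heal, so the healing client can be told it is done. */
static bool
hh_truncate_update(struct hh_seg **list, uint64_t offset)
{
        struct hh_seg **pp = list;

        while (*pp) {
                struct hh_seg *s = *pp;

                if (s->start >= offset) {
                        *pp = s->next;           /* beyond the new size */
                        free(s);
                } else {
                        if (s->end > offset)
                                s->end = offset; /* crosses the new size */
                        pp = &s->next;
                }
        }

        return *list == NULL;
}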

On 21/05/13 15:58, Jeff Darcy wrote:
On 05/21/2013 09:30 AM, Stephan von Krawczynski wrote:
I am not quite sure if I understood the issue in full detail. But are you saying that you "split up" the current self-healing file into 128K chunks with locking/unlocking (over the network)? It sounds a bit like the locking takes more (CPU) time than the self-healing of the data itself. I mean this can be a 10 G link where a complete file could be healed in almost no time, even if the file is quite big. Sure, WAN is different, but I really would like to have at least an option to drop the partial locking completely and lock the full file instead.

That's actually how it used to work, which led to many complaints from users who would see stalls accessing large files (most often VM images) over GigE while self-heal was in progress. Many considered it a show-stopper, and the current "granular self-heal" approach was implemented to address it. I'm not sure whether the old behavior is still available as an option. If not (which is what I suspect) then you're correct that it might be worth considering as an enhancement.

