Re: [Gluster-devel] RFC on fix to bug #802414

From:	Anand Avati
Subject:	Re: [Gluster-devel] RFC on fix to bug #802414
Date:	Tue, 22 May 2012 10:47:49 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:10.0.4) Gecko/20120422 Thunderbird/10.0.4

On 05/22/2012 01:44 AM, Raghavendra Gowdappa wrote:



----- Original Message -----

From: "Anand Avati"<address@hidden>
To: "Raghavendra Gowdappa"<address@hidden>
Cc: "Pranith Kumar Karampuri"<address@hidden>, "Vijay Bellur"<address@hidden>, "Amar Tumballi"<address@hidden>, "Krishnan Parthasarathi"<address@hidden>, address@hidden
Sent: Tuesday, May 22, 2012 12:41:36 PM
Subject: Re: RFC on fix to bug #802414

<in continuation from our chat>

The PARENT_DOWN_HANDLED approach will take us backwards from the
current state, where we are resilient to frame losses and other classes
of bugs (i.e., if a frame loss happens on either server or client, it
only prevents graph cleanup, but the graph switch still happens).

The root "cause" here is that we are giving up on a very important and
fundamental principle: immutability of the fd object. The real solution
here is to never modify fd->inode. Instead we must bring about a more
native fd "migration" than just re-opening an existing fd on the new
graph.

Think of the inode migration analogy. The handle coming from FUSE (the
address of the object) is a "hint". Usually the hint is right, that is,
the object at that address belongs to the latest graph. If not, we use
the GFID to resolve a new inode on the latest graph and use that.
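
The inode case described above can be sketched roughly as follows. This
is only a toy model, not the actual GlusterFS code: the ex_* types, the
fixed-size table and the linear GFID search are all assumptions made for
illustration. A real resolver would also issue a lookup on the new graph
when the GFID is not yet present, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EX_GFID_LEN  16
#define EX_TABLE_MAX 8

struct ex_graph;

/* Hypothetical stand-in for an inode: a GFID plus the owning graph. */
typedef struct ex_inode {
    unsigned char    gfid[EX_GFID_LEN];
    struct ex_graph *graph;   /* graph whose inode table owns this inode */
} ex_inode_t;

/* Hypothetical stand-in for a graph with a toy inode table. */
typedef struct ex_graph {
    ex_inode_t *inodes[EX_TABLE_MAX];
    size_t      count;
} ex_graph_t;

/* Linear GFID search in the toy table; returns NULL if absent. */
static ex_inode_t *ex_find_by_gfid(ex_graph_t *g, const unsigned char *gfid)
{
    for (size_t i = 0; i < g->count; i++) {
        if (memcmp(g->inodes[i]->gfid, gfid, EX_GFID_LEN) == 0)
            return g->inodes[i];
    }
    return NULL;
}

/* Treat the handle as a hint: use it directly if it lives on the latest
 * graph, otherwise resolve the same GFID on the latest graph. */
static ex_inode_t *ex_resolve_inode(ex_inode_t *hint, ex_graph_t *latest)
{
    if (hint->graph == latest)
        return hint;                            /* hint was right */
    return ex_find_by_gfid(latest, hint->gfid); /* stale hint */
}
```

The point of the sketch is that the stale object is never modified; it
is only used as a key to find its counterpart on the latest graph.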

In the case of an FD we can do something similar, except there are no
GFIDs (which should not be a problem). We need to make the handle coming
from FUSE (the address of fd_t) just a hint. If
fd->inode->table->xl->graph is the latest, then the hint was a HIT. If
the graph was not the latest, we look for a previous migration
attempt+result in the "base" (original) fd's context. If that does not
exist or is not fresh (on the latest graph), then we do a new fd
creation, open on the new graph, fd_unref the old cached result in the
fd context of the "base fd" and keep a ref to this new result. All this
must happen from fuse_resolve_fd(). The setting of the latest fd and the
updating of the latest fd pointer happen under the scope of the
base_fd->lock(), which gives it a very clear and unambiguous scope that
was missing with the old scheme.
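
A rough sketch of the resolution steps above, under heavy assumptions:
the sketch_* types and helpers are hypothetical stand-ins rather than
the real fd_t API, the `graph` field stands in for
fd->inode->table->xl->graph, the `migrated` field stands in for the
value cached in the base fd's context, and "open on the new graph" is
reduced to an allocation. Reference counting is simplified (the caller
gets a borrowed pointer, where real code would presumably take a ref).

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct sketch_graph { int id; } sketch_graph_t;

typedef struct sketch_fd {
    pthread_mutex_t   lock;      /* plays the role of the base fd's lock */
    int               refcount;
    sketch_graph_t   *graph;     /* graph this fd was opened on */
    struct sketch_fd *migrated;  /* cached migration result (base fd only) */
} sketch_fd_t;

static sketch_fd_t *sketch_fd_new(sketch_graph_t *graph)
{
    sketch_fd_t *fd = calloc(1, sizeof(*fd));
    if (fd == NULL)
        return NULL;
    pthread_mutex_init(&fd->lock, NULL);
    fd->refcount = 1;
    fd->graph = graph;
    return fd;
}

static void sketch_fd_unref(sketch_fd_t *fd)
{
    if (fd != NULL && --fd->refcount == 0) {
        pthread_mutex_destroy(&fd->lock);
        free(fd);
    }
}

/* Resolve the fd to use on `latest`, treating `base` as a hint.
 * The base fd itself is never modified apart from its cached result. */
static sketch_fd_t *sketch_resolve_fd(sketch_fd_t *base, sketch_graph_t *latest)
{
    sketch_fd_t *result;

    if (base->graph == latest)
        return base;                          /* the hint was a HIT */

    pthread_mutex_lock(&base->lock);
    result = base->migrated;
    if (result == NULL || result->graph != latest) {
        /* No cached result, or it is stale: create a fresh fd. */
        result = sketch_fd_new(latest);
        if (result != NULL) {
            sketch_fd_unref(base->migrated);  /* drop the old cached fd */
            base->migrated = result;          /* base keeps the new ref */
        }
    }
    pthread_mutex_unlock(&base->lock);
    return result;
}
```

Because the check, the replacement and the unref of the stale result all
sit inside one critical section on the base fd, at most one migrated fd
is cached per base fd at any time, however many graph switches occur.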


I remember discussing this solution during the initial design, but I am
not sure why we dropped it. So, can I go ahead with the implementation?
Is this fix required post 3.3?

The solution you are probably referring to was dropped because there we
were talking about chaining FDs to the one on the "next graph" as graphs
keep getting changed. The one described above is different because here
there will be one base fd (the original one on which open() by fuse was
performed), and new graphs result in the creation of an internal new fd
directly referred to by the base fd (and naturally unref the previous
"new fd"), thereby keeping things quite trim.


Avati
