Re: What shall the filter do to bottommost translators

From: Sergiu Ivanov
Subject: Re: What shall the filter do to bottommost translators
Date: Wed, 31 Dec 2008 14:42:21 +0200


On Mon, Dec 29, 2008 at 8:25 AM, <olafBuddenhagen@gmx.net> wrote:
> On Mon, Dec 22, 2008 at 07:19:50PM +0200, Sergiu Ivanov wrote:
> > I'm rather inclined to view the node concept introduced in libnetfs
> > as a rather generic concept, and my question arose from such an
> > understanding. Still, I realize that such use would be abusive, so
> > I'd like to ask now how you would suggest implementing this feature.

> Not knowing libnetfs, I can only make rather wild guesses...
>
> In a sense, we are implementing multiple filesystems: one for each
> filesystem in the mirrored directory tree.
>
> The most radical approach would be to actually start a new nsmux
> instance for each filesystem in the mirrored tree. This might in fact
> be easiest to implement, though I'm not sure about other
> consequences... What do you think? Do you see how it could work? Do
> you consider it a good idea?

I'm not really sure about the details of such an implementation, but
when I consider the recursive magic stuff, I'm rather inclined to
conclude that this would be way too much... I'd rather suggest
implementing something thinner than nsmux to proxy filesystems
specifically, or sticking with just a single instance of nsmux.

Probably, this variant could work, but I'll tell you frankly: I cannot
imagine how to do that, although I feel that it's possible. I could
probably dig a bit deeper and come up with more detailed conclusions,
but I'm not sure whether it is worth it. What do you say?
> But let's assume for now we stick with one nsmux instance for the
> whole tree. With trivfs, it's possible for one translator to serve
> multiple filesystems -- I guess netfs can do that too...

Could you please explain in more detail what you mean by this?
Unfortunately, I can only very vaguely imagine what you are trying to
tell me...
> Though it might be tricky, because here we don't attach the
> translator to multiple locations in the "outside" filesystem, but
> rather attach to one outside location, while the other filesystems
> are in turn attached to nodes served by nsmux itself -- I wonder
> whether this recursion would cause any complications.

I'm sorry, I don't get it :-(
> Even if it works, it might be overkill, though. The alternative would
> be to override the default implementations of some of the fsys and
> other RPCs dealing with control ports, so we would only serve one
> filesystem from the library's point of view, but still be able to
> return different control ports.
>
> As we override the standard implementations, it would be up to us how
> we handle things in this case. Easiest probably would be to store a
> control port to the respective real filesystem in the port structure
> of every proxy control port we return to clients.

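[Editor's note: a minimal sketch of the bookkeeping Olaf describes, with made-up names; on the Hurd these fields would be mach_port_t rights, modelled here as plain integers so the sketch stays self-contained.]

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for mach_port_t, purely for illustration. */
typedef unsigned int port_t;

/* Hypothetical port structure for a proxy control port handed out by
   nsmux: besides the right given to the client, it remembers the
   control port of the real filesystem it stands for. */
struct proxy_cntl
{
  port_t proxy_right;  /* right returned to the client */
  port_t real_cntl;    /* control port of the mirrored filesystem */
};

/* Wrap REAL_CNTL in a new proxy control port structure. */
struct proxy_cntl *
make_proxy_cntl (port_t proxy_right, port_t real_cntl)
{
  struct proxy_cntl *p = malloc (sizeof *p);
  if (p)
    {
      p->proxy_right = proxy_right;
      p->real_cntl = real_cntl;
    }
  return p;
}
```

With this, the overridden fsys RPCs could answer for the proxied filesystem by consulting real_cntl, while libnetfs still sees a single filesystem.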
This is the variant I was thinking about: custom implementations of
some RPCs are the fastest way. At least I can imagine quite well what
needs to be done, and I can tell that this variant will probably be
the least resource-consuming of all.

> > > The file_getcontrol()->fsys_getroot() sequence is a very odd one
> > > -- what would you provide as the "dotdot" node for
> > > fsys_getroot()?... I don't think this possibility was really
> > > intended.

> > Hm... I'm not sure I can understand properly what you are talking
> > about. Why would this sequence be odd? I cannot think of any other
> > possibilities to traverse a translator stack than using this
> > sequence.

> Eh? Traversing translator stacks -- just like normal lookup -- uses
> file_get_translator_cntl(), not file_getcontrol(), right?

Yes, yes, that's right... I've mixed up the function names.

> > As for the ``dotdot'' node, nsmux usually knows who is the parent
> > of the current node; if we are talking about a client using nsmux,
> > it is their responsibility to know who is the parent of the current
> > node. OTOH, I am not sure at all about the meaning of this
> > argument, especially since it is normally provided in an
> > unauthenticated version.

> AIUI it is returned when doing a lookup for ".." on the node returned
> by fsys_getroot(). In other words, it normally should be the
> directory in which the translated node resides.

Yep, this is my understanding, too. I guess I have to take a glimpse
into the source to figure out *why* this argument is required...

> As the authentication is always specific to the client, there is no
> point in the translator holding anything but an unauthenticated port
> for

Sorry for the offtopic, but could you please explain what you mean by
authentication here? (I would just like to clear up some issues in my
understanding of Hurd concepts.)

> My point was that if we obtain the control port with
> file_getcontrol(), rather than file_get_translator_cntl() as in a
> normal lookup, we lack the context: we can use file_getcontrol() to
> check what translator serves a given file, but we don't know where
> the translator resides in the directory tree. Thus, calling
> fsys_getroot() and doing further lookups from there doesn't really
> seem to make much sense in this situation... Which is why I think we
> might be able to get away with just ignoring this case.

Aha, I see. Probably, we can indeed forget about this situation.
> > If I understand things correctly, then unionfs does *not* proxy all
> > nodes. Similarly to nsmux, unionfs only proxies directory nodes.

> Oh, I thought you said at some point that unionfs proxies all
> nodes... Guess I mixed it up. If unionfs doesn't, I'm pretty
> confident we do not need to do it either :-)

Aha, ok :-) I've got it :-)
> > > > Hm... Suppose nsmux is asked to fetch 'file,,x,y'.
> > >
> > > You mean "file,,x,,y" I assume?...
> > >
> > This syntax looks new to me :-) I don't remember seeing or using it
> > before.

> That's pretty strange... I indeed can't find myself mentioning this
> syntax anywhere in the list archives. But I'm almost sure I must have
> used it on IRC at least...

I don't really remember, but that's not a problem.

> Anyway, even if I never stated it explicitly, I don't see any reason
> to consider using any *different* syntax. This is really the obvious
> one. We use one suffix to set a dynamic translator, and then another
> suffix to set a second one on top of it... No need to special-case
> this in any way.

I see... This syntax indeed requires a new shadow node for each new
translator in a dynamic translator stack.

> > Well, setting translators in a simple loop is a bit faster, since,
> > for example, you don't have to consider the possibility of an
> > escaped ``,,'' every time

> Totally negligible...

Of course :-) I've got a strange habit of trying to reduce the number
of string operations...

> > (and this is not the only reason).

> What are the others?

The main reason I feel uneasy is that retries actually involve
lookups, and I cannot really figure out for now what should be looked
up when I want to add a new translator to the dynamic translator
stack...

Although I cannot deny that using some smart retry mechanism would be
much more readable and extensible than a loop.
> > Unfortunately, I cannot really see the advantage of using retries
> > in the case of creating dynamic translator stacks.

> Uniformity, for one: you agreed that we should forward the retries
> when crossing translator boundaries on the mirrored filesystem, and
> also when continuing the lookup after starting a dynamic translator.
> Why should we do differently when starting multiple dynamic
> translators, conflating the lookup steps into one, instead of
> forwarding the retries after starting each individual translator?...

Hm, this does sound very reasonable, and it's a pity I've never looked
at the matter from this point of view... I think I've got this thing
sorted out. I'm just hesitating about what should be looked up after
starting an individual dynamic translator. As soon as you give me a
hint, I'll get my hands on that code :-)

> > BTW, could you please expound on the question why we would need
> > extra shadow nodes in a dynamic translator stack? I fail to see how
> > this would make our life simpler :-)

> It's not even about simplicity: it's about correctness.

Well, correctness does make life simpler ;-) At least for me :-)

> It's harder to come up with a good example when dealing only with
> single-node translators; but when considering translators serving
> directories, it really becomes quite obvious: the first dynamic
> translator exports a directory. Now in the original lookup we have
> another magic suffix, so this directory is further translated.
> However, further lookups can be performed on the directory, and such
> further lookups might want to skip the second translator. They can
> actually use another magic suffix to invoke a different second
> translator on top of the same first one. Or someone might even set a
> static translator on the directory provided by the first dynamic
> one...

Yes, I can see now... Thank you for the explanation :-)
> In these cases it is crucial that we do *not* see the second dynamic
> translator resulting from the original lookup while accessing the
> directory provided by the first dynamic translator. In other words,
> the second dynamic translator must sit on a shadow node, just like
> the first one. That's the definition of dynamic translators: they are
> visible only to their own clients, but not from the underlying node.
> (No matter whether this underlying node is served by another dynamic
> translator.)

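[Editor's note: this shadow-node rule can be captured in a toy model, with purely illustrative names -- this is not nsmux code. Setting a dynamic translator wraps the node in a fresh shadow node instead of modifying the node itself, so the translator is visible only through the shadow.]

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a node; the real nsmux node is far richer. */
struct toy_node
{
  struct toy_node *underlying;  /* node the shadow sits on, or NULL */
  const char *dyn_translator;   /* translator visible via this node */
};

/* "Set" a dynamic translator on NODE: NODE itself stays untouched;
   only the returned shadow node shows the translator. */
struct toy_node
make_shadow (struct toy_node *node, const char *translator)
{
  struct toy_node shadow = { node, translator };
  return shadow;
}
```

Lookups starting from the underlying node (or from the first shadow) never see a translator set on a later shadow; only clients holding that shadow do, matching the "visible only to their own clients" rule.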
That seems clear. What makes me wonder, however, is how a filter will
traverse a dynamic translator stack if it is not able to move from a
dynamic translator to the next one in the stack.

> Having said that, I still believe that this also simplifies things,
> as this way we don't need any special-casing when setting more than
> one dynamic translator. We just start the first one and push the ball
> back to the client -- the resulting retry will make us start the
> second one in exactly the same manner as the first.

This is indeed a good strategy, provided that all *_S_dir_lookup
routines do likewise when they encounter a translator.
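[Editor's note: the retry strategy agreed on above can be modelled very roughly as follows, with hypothetical names. Each "lookup" starts at most one dynamic translator and hands the remaining name back to the client as a retry, so stacking several translators needs no special-casing.]

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One toy lookup step: "start" the first translator of NAME and
   return the rest of the name in RETRY ("" once the lookup is done). */
static void
lookup_step (const char *name, char *retry, size_t retry_size)
{
  const char *sep = strstr (name, ",,");
  snprintf (retry, retry_size, "%s", sep ? sep + 2 : "");
}

/* Client-side loop: keep re-sending the lookup until no retry comes
   back.  Returns the number of lookup round trips performed. */
int
lookup_with_retries (const char *name)
{
  char current[256], retry[256];
  int rounds = 0;
  snprintf (current, sizeof current, "%s", name);
  do
    {
      lookup_step (current, retry, sizeof retry);
      rounds++;
      snprintf (current, sizeof current, "%s", retry);
    }
  while (retry[0] != '\0');
  return rounds;
}
```

A lookup of "file,,x,,y" thus takes three round trips, each one starting a single translator in exactly the same way, which is the uniformity Olaf argues for.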
