bug-hurd
Re: What shall the filter do to bottommost translators


From: Sergiu Ivanov
Subject: Re: What shall the filter do to bottommost translators
Date: Mon, 22 Dec 2008 19:19:50 +0200

Hello,

On Thu, Dec 18, 2008 at 4:13 PM, <olafBuddenhagen@gmx.net> wrote:

On Sun, Dec 14, 2008 at 11:34:57PM +0200, Sergiu Ivanov wrote:
> On Mon, Dec 8, 2008 at 9:28 AM, <olafBuddenhagen@gmx.net> wrote:
> I'll put my question differently: are we going to return the control
> ports (that is, ports to the filesystem) inside instances of struct
> node (provided by libnetfs), as happens in the case of magic lookups?

Well, I don't know libnetfs... It *sounds* like it would be serious
abuse (if it's even possible at all) -- but perhaps the "node" concept
in netfs is actually more generic than it sounds...
I'm rather inclined to view the node concept introduced in libnetfs as
a rather generic one, and my question arose from that understanding.
Still, I realize that such use would be abusive, so I'd like to ask
now: how would you suggest implementing this feature?
 
> > Checking to make sure I understand it correctly now: If a client
> > invokes dir_lookup, and the server finds some other translator in
> > the path of the lookup, it will get the translator's root node, and
> > return that to the client, along with the remaining file name
> > components, and with RETRY_REAUTH, right?
[...]
> What I see in the if block starting at line 186 in dir-lookup.c of
> libnetfs is that usually retries are not forwarded to the client when
> translators are encountered

Eh? I did not try to understand every bit of it; but from my reading, a
retry *is* returned to the client...

(It seems I was wrong on one point though: The retry type is not always
RETRY_REAUTH; in fact, in the most ordinary case, it's RETRY_NORMAL I
think...)
Yes, indeed, retries *are* forwarded. I made the mistake of looking
for REAUTH retries only and did not pay proper attention to how things
work in general... And yes, a normal retry is forwarded to the client
when a translator is encountered.
 
> However, if the translator is symlink (line 289) and the link target
> is specified as absolute path, a magic retry is requested (line 322).

Indeed, relative symlinks seem to be an exception -- the only case where
no retry is generated.
Yes, this is true, I can see it clearly now.
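
The retry behaviour agreed on above can be illustrated with a small toy model. This is a sketch in Python, not Hurd code: the tree layout and the names `toy_dir_lookup` and `toy_client_lookup` are hypothetical, and only the three retry kinds mirror the ones named in this thread. It assumes that a translator yields a normal retry, an absolute symlink yields a magic retry, and a relative symlink is resolved by the server without any retry.

```python
# Toy model, not Hurd code: the tree layout and function names are
# hypothetical; only the retry kinds mirror those named in the thread.
RETRY_NONE, RETRY_NORMAL, RETRY_MAGICAL = "none", "normal", "magical"

def toy_dir_lookup(start, path):
    """Server side: return (retry_type, retry_name, result)."""
    parts = [p for p in path.split("/") if p]
    cur = start                       # directory we are currently in
    while parts:
        entry = cur["entries"][parts.pop(0)]
        rest = "/".join(parts)
        kind = entry["kind"]
        if kind == "translated":
            # Hand the translator's root to the client with a normal retry.
            return RETRY_NORMAL, rest, entry["root"]
        if kind == "symlink":
            target = entry["target"]
            if target.startswith("/"):
                # Absolute target: ask the client to restart from its root.
                name = target + ("/" + rest if rest else "")
                return RETRY_MAGICAL, name, None
            # Relative target: splice it in and keep walking, no retry.
            parts = [p for p in target.split("/") if p] + parts
        elif kind == "dir":
            cur = entry
        else:  # plain file
            if parts:
                raise NotADirectoryError(rest)
            return RETRY_NONE, "", entry
    return RETRY_NONE, "", cur

def toy_client_lookup(root, path):
    """Client side: follow retries until the server reports none."""
    start, name = root, path
    while True:
        retry, name, result = toy_dir_lookup(start, name)
        if retry == RETRY_NONE:
            return result
        start = result if retry == RETRY_NORMAL else root

FILE_F = {"kind": "file", "name": "f"}
FILE_G = {"kind": "file", "name": "g"}
ROOT = {"kind": "dir", "entries": {
    "mnt": {"kind": "translated",
            "root": {"kind": "dir", "entries": {"f": FILE_F}}},
    "d": {"kind": "dir", "entries": {"g": FILE_G}},
    "abs": {"kind": "symlink", "target": "/mnt/f"},
    "rel": {"kind": "symlink", "target": "d/g"},
}}
```

In this model, looking up "abs" from `ROOT` takes three server calls (magic retry, normal retry, final hit), while "rel" is resolved in a single call.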
 
> > And what happens when doing a lookup for ".." on the root of a
> > filesystem? I would guess that it returns the filesystem's
> > underlying node, with ".." as the retry name, and RETRY_REAUTH. Is
> > this about right?
>
> I'm not really sure, but as I can see from the libnetfs'
> implementation of netfs_S_dir_lookup (dir-lookup.c, starting at line
> 112) it's not really so. As far as I can guess from the names of
> variables and the scarce comments in the declaration of struct peropen
> in /usr/include/hurd/netfs.h, just the parent node of the translator's
> root node is returned with FS_RETRY_REAUTH.

Well, guessing doesn't help us here :-)

Skimming over the code, it seems though that indeed no matter how deep
the translator stack, the parent directory of the original node is
passed on and on (through the "dotdot" parameter to fsys_getroot()), and
returned whenever doing a lookup for ".." on the translated node.
My understanding is similar :-) This makes me feel better ;-)
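
A toy sketch of this reading, in Python rather than Hurd C: the classes `ToyTranslator` and `ToyRoot` are hypothetical, and `getroot` only stands in for fsys_getroot(). The single point it illustrates is that the same "dotdot" value is forwarded through the whole stack and comes back on a lookup of "..".

```python
# Toy model, not Hurd code: class and method names are hypothetical.
class ToyRoot:
    """Root node handed out by a translator."""
    def __init__(self, owner, dotdot):
        self.owner = owner
        self.dotdot = dotdot

    def lookup(self, name):
        if name == "..":
            return self.dotdot
        raise KeyError(name)

class ToyTranslator:
    def __init__(self, name, stacked_on_top=None):
        self.name = name
        self.stacked_on_top = stacked_on_top

    def getroot(self, dotdot):
        """Stand-in for fsys_getroot(): `dotdot` is passed on unchanged."""
        if self.stacked_on_top is not None:
            return self.stacked_on_top.getroot(dotdot)
        return ToyRoot(self.name, dotdot)

top = ToyTranslator("top")
middle = ToyTranslator("middle", stacked_on_top=top)
bottom = ToyTranslator("bottom", stacked_on_top=middle)
root = bottom.getroot(dotdot="parent-of-original-node")
```

Here `root` belongs to the topmost translator, yet `root.lookup("..")` still yields the parent of the original node.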
 
> > BTW, I realized another rather serious problem: I did suggest that
> > nsmux doesn't need to proxy file nodes, thus saving a lot of
> > overhead. This was based on the assumption that from a file node no
> > further lookups are possible, and thus no need to handle magic.
> > Unfortunately, this assumption was wrong: It's always possible to
> > invoke file_getcontrol and then fsys_getroot, and do lookups from
> > there...
>
> Well, do you suggest proxying file nodes, too?

Actually, I'm not sure about the consequences. The
file_getcontrol()->fsys_getroot() sequence is a very odd one -- what
would you provide as the "dotdot" node for fsys_getroot()?... I don't
think this possibility was really intended. 
 
In view of that, I'm rather tempted to assume that anyone who tries
doing this deserves to fail... Yet it's not very elegant. We should
always try to mimic the normal behaviour as well as possible -- if it
turns out that the assumption was wrong, and the situation does indeed
happen in practice, it may lead to rather hard to track down failures.
Hm... I'm not sure I properly understand what you are talking about.
Why would this sequence be odd? I cannot think of any other way to
traverse a translator stack than using this sequence.

As for the ``dotdot'' node, nsmux usually knows who is the parent of
the current node; if we are talking about a client using nsmux, it is
their responsibility to know who is the parent of the current node.

OTOH, I am not sure at all about the meaning of this argument,
especially since it is normally provided in an unauthenticated
version.
 
Also, the fact that unionfs proxies all nodes, makes me somewhat uneasy
-- surely there must be a reason for that?...
Please correct me if I am wrong: proxying a node means the following:

1. A client requests lookup.
2. nsmux looks up the required file.
3. nsmux creates a new libnetfs node containing a port to the file
  looked up
4. nsmux returns a port to the newly created node to the client

If I understand things correctly, then unionfs does *not* proxy all
nodes. Similarly to nsmux, unionfs only proxies directory nodes.
Perhaps it should be made an option. Would that complicate the code a
lot?
Not at all, just several more lines of code.
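
The four proxying steps and the proposed option can be sketched as a toy model. This is Python, not nsmux code, and every name in it (`RealNode`, `ProxyNode`, `ToyMux`, `proxy_files`) is hypothetical. It assumes the default is to proxy directories only, with a switch to proxy files as well.

```python
# Toy model, not nsmux code: all names here are hypothetical.
class RealNode:
    def __init__(self, name, is_dir, data=b""):
        self.name, self.is_dir, self.data = name, is_dir, data

    def read(self):
        return self.data

class ProxyNode:
    """Stands in for a libnetfs node that holds a port to the real node."""
    def __init__(self, real):
        self.real = real
        self.forwarded_calls = 0

    def read(self):
        # Every call on a proxied node costs one extra hop.
        self.forwarded_calls += 1
        return self.real.read()

class ToyMux:
    def __init__(self, entries, proxy_files=False):
        self.entries = {e.name: e for e in entries}
        self.proxy_files = proxy_files   # the option discussed above

    def lookup(self, name):
        real = self.entries[name]              # step 2: look up the file
        if real.is_dir or self.proxy_files:
            return ProxyNode(real)             # steps 3 and 4
        return real                            # hand out the real port

nodes = [RealNode("dir", True), RealNode("file", False, b"hello")]
dirs_only = ToyMux(nodes)
everything = ToyMux(nodes, proxy_files=True)
```

With `dirs_only`, a file lookup returns the real node and later reads bypass the mux entirely; with `everything`, each read is counted in `forwarded_calls`, which is the per-call overhead of proxying file nodes.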

> I wonder, whether it is very bad that further lookups are possible...
> Actually, right now I fail to see any reasons, why this is so very
> bad. Could you please point some out?

The problem is that lookups would get different results than lookups for
the same file names done in the usual manner (with nsmux in control).

However, as I said, I'm not sure how realistic such a situation really
is.
Well, I think I'll make an option for that, so that nsmux will be
ready to deal with any possible problems in this area in the future.

Do you think that choosing to proxy all nodes revives the need for
a node cache?
 
> What I am also thinking of is the fact that if we proxy file nodes,
> too, many more instances of libnetfs struct node will be created,
> which will slow down commands like ls -R and will, probably, have
> other negative consequences.

This is really the smallest problem. The real problem with proxying all
nodes is that *every* RPC gets overhead added by the proxying --
including possibly performance-critical ones like read()/write().
Yes, definitely... Sounds pretty evil...
 
> However, in view of the information I grasped while peering into
> libnetfs' dir-lookup.c, I won't really be so sure that I know anything
> about it now...

Really? I for my part, after looking at it (and at the lookup functions
in libc), feel that I have a pretty good picture now how it works...
Right now I feel better in lookup matters, but new questions have
popped out, so I still have reasons to return again and again to
netfs_S_dir_lookup.
 
> > I have a premonition that this issue might in fact come up as soon
> > as we start thinking seriously about the recursive magic we
> > postponed for now.
>
> Maybe it's time we started discussing this problem, too? The
> question how the filter should work proved to be tightly connected
> with recursive magic... Shall I start a new thread, what do you say?

I'm not sure. I'm rather reluctant to confuse things even further by
considering this now... OTOH, it *might* help in getting a better
overall picture. It's really hard to tell :-(
As I said on IRC, if we are not going to do any long-term planning,
we should probably forget about recursive magic for now, because we
will definitely get things confused, while getting a better overall
picture is only a possibility.
 
> Right now, when nsmux does not proxy control ports, once the filter is
> provided with the control port to the very first translator in the
> *static* translator stack, nsmux cannot know what the filter is doing.
> This means that when the filter finishes travelling across the static
> translator stack, it has no possibility of traversing the *dynamic*
> translator stack it is a member of.

Well, I thought I made it clear that I consider not proxying the control
ports a bug -- and I'm basing further discussion on the assumption that
it will get fixed :-)
OK, I'll keep that in mind :-) However, I still have no idea how you
suggest proxying the control ports, since we deemed the usage of
libnetfs nodes for this purpose to be an abuse.

> Hm... Suppose nsmux is asked to fetch 'file,,x,y'.

You mean "file,,x,,y" I assume?...
This syntax looks new to me :-) I don't remember seeing or using it
before.
 
> Right now it creates a shadow node, sets translator 'x' on the shadow
> node and then sets translator 'y' immediately on top of translator
> 'x'. Do you mean that nsmux should create a new proxy node for
> translator 'y'?

Yes.

> That is, shall it work in the following way:
>
> 1. Create a shadow node mirroring file 'file'; 2. Set translator 'x'
> on this shadow node; 3. Create a shadow node mirroring the normal port
> of translator 'x'; 4. Set translator 'y' on this *second* shadow node?

Exactly.
Hm, this sounds like news to me :-) I'll modify the code accordingly.
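
The four steps confirmed above can be sketched as a toy model. This is Python, not nsmux code, and the names `ShadowNode`, `TranslatorPort` and `build_dynamic_stack` are hypothetical. It shows only the structure that results: one shadow node per dynamic translator, each mirroring whatever sits directly below it.

```python
# Toy model, not nsmux code: all names here are hypothetical.
class ShadowNode:
    def __init__(self, mirrors):
        self.mirrors = mirrors        # what this shadow node stands for
        self.translator = None

class TranslatorPort:
    """Stands in for the normal port of a running translator."""
    def __init__(self, name, shadow):
        self.name = name
        self.shadow = shadow          # the node the translator sits on

def build_dynamic_stack(file_name, translators):
    port = file_name
    for name in translators:
        shadow = ShadowNode(mirrors=port)     # steps 1 and 3
        shadow.translator = name              # steps 2 and 4
        port = TranslatorPort(name, shadow)
    return port

top = build_dynamic_stack("file", ["x", "y"])
```

In the result, 'y' sits on a second shadow node that mirrors the port of 'x', and 'x' sits on the first shadow node that mirrors 'file'.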

BTW, from several remarks you made at different times, I understand that
right now you process all magic suffixes present in a file name in a
loop?
Yes, it is so.
 
Wouldn't it be simpler and clearer to process only the first suffix, and
pass back any remaining ones as the retry_name, so that they
automatically get handled correctly when the client does the retry?...
(It seems to me that if you had done it that way, it would have been
clearer from the beginning how stacking of dynamic translators is to be
handled.)
Well, setting translators in a simple loop is a bit faster, since, for
example, you don't have to consider the possibility of an escaped
``,,'' every time (and this is not the only reason). OTOH, I'd rather
consider myself a proponent of the following syntax: ``file,,x,y'',
which is also simpler to implement in a loop.

Unfortunately, I cannot really see the advantage of using retries in
the case of creating dynamic translator stacks.
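
For comparison, the two approaches can be sketched side by side. This is a toy model in Python, not nsmux code: the function names are hypothetical, the "file,,x,,y" form is assumed, and escaped ``,,'' sequences are deliberately ignored. A lookup result is modelled as a tuple listing the file followed by the translators stacked on it.

```python
# Toy model, not nsmux code: names are hypothetical and escaping of
# ",," is not handled.
def lookup_in_loop(name):
    """Server handles every suffix itself, in one loop."""
    base, *suffixes = name.split(",,")
    return (base, *suffixes)

def toy_lookup_once(stack, name):
    """Server handles at most one suffix and returns the rest as retry name."""
    base, sep, rest = name.partition(",,")
    if base:
        stack = stack + (base,)
    if not sep:
        return stack, ""
    first, more, remaining = rest.partition(",,")
    stack = stack + (first,)
    return stack, (",," + remaining if more else "")

def lookup_with_retries(name):
    """Client repeats the lookup on the result until no retry name is left."""
    stack, retry_name = (), name
    while True:
        stack, retry_name = toy_lookup_once(stack, retry_name)
        if not retry_name:
            return stack
```

Both variants end up with the same stack for "file,,x,,y". In the retry variant the second request carries the name ",,y", that is, an empty file name plus one suffix, looked up on the node that already has 'x' on it.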

BTW, could you please explain why we would need extra shadow nodes in
a dynamic translator stack? I fail to see how this would make our life
simpler :-)
 
> The problem I was speaking about sounds as follows. Suppose you have a
> port to a file, but you do not know the file name. You cannot reopen
> the same port with different flags.

And I say, you can :-)

Looking at the dir_lookup implementation in netfs, you can clearly see
that the "" case results in a new protid (with new open flags) being
created -- unless I'm missing something *very* big...
Oh yes, you can :-) I'm still wondering how I managed to miss this
thing while playing with this stuff for about an hour in the summer...
 
> > Aside from repeating the whole lookup, it's also possible AIUI to
> > reopen a node (possibly with different mode) by looking up "" on it
> > -- so in a slightly altered variant, the above idea *should* indeed
> > work.
>
> I remember doing something like this and getting *exactly* the copy of
> the same port. Or, let me try to say it in other words: I had an
> instance of mach_port_t in which there was a port number. When I did a
> lookup of "" on this port variable, I got an instance of mach_port_t
> with just the same number as in the initial variable.
>
> I've even checked it now: yes, it is so. file_name_lookup_under does
> nothing good in this case and invoking dir_lookup does not really make
> sense here, since file_name_lookup_under calls dir_lookup anyway.

And here you err :-)

A glance at __hurd_file_name_lookup() (invoked from
__file_name_lookup_under()) shows that doing a lookup on "" is
special-cased, and will *not* result in dir_lookup() ever getting
called!
I see... I'll keep that in mind, thank you :-)
 
However, invoking dir_lookup() directly, you can pass "" very well, and
I'm pretty sure it will result in a new port with different flags, as
the server implementation suggests.
Yes, this works.
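
The difference between the two paths can be sketched with a toy model. This is Python, not Hurd or libc code, and all names are hypothetical. The direct call models what the thread says about the server: an empty name yields a new protid on the same node with the newly requested flags. The wrapper models only the fact that the empty name is special-cased and never reaches the server; what it returns here (the port passed in) is an assumption taken from the observation reported earlier in this thread, not from the libc sources.

```python
# Toy model, not Hurd or libc code: all names here are hypothetical.
class ToyProtid:
    """Stands in for a protid: one open of a node, with its own flags."""
    def __init__(self, node, flags):
        self.node, self.flags = node, flags

server_calls = []

def toy_dir_lookup(protid, name, flags):
    """Direct call to the server."""
    server_calls.append(name)
    if name == "":
        # Same node, fresh protid carrying the newly requested flags.
        return ToyProtid(protid.node, flags)
    raise NotImplementedError("only the empty-name case is modelled")

def toy_lookup_under(protid, name, flags):
    """Library wrapper that special-cases the empty name."""
    if name == "":
        # Assumption: mirrors the behaviour reported in this thread,
        # where the caller got back the very port it passed in.
        return protid
    return toy_dir_lookup(protid, name, flags)

original = ToyProtid(node="some-file", flags="r")
via_wrapper = toy_lookup_under(original, "", "rw")
reopened = toy_dir_lookup(original, "", "rw")
```

Only the direct call shows up in `server_calls`, and only `reopened` carries the new flags.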
 
(In fact, you can do that even with __hurd_file_name_lookup_retry() when
passing RETRY_REAUTH along with a "" retry_name -- though I must admit
that I don't fully understand the purpose of the reauth mechanism, so I
don't know whether doing a reauth here is really a good idea...)
I think I'll try to glance into the matter soon to make things
clearer.
 
In the beginning, I have asked you several times to check how the lookup
functions are implemented in libc, and to try invoking dir_lookup()
directly to get better control. Now I wonder, did you ever actually do
that?...
No, I've never done that. The reason is that, at first, I was very
much confused by the abundance of information whose meaning I had no
inkling of, and then I was so busy getting to know the libnetfs stuff
that I forgot about this request. I'll do that now, though.

I'm not innocent myself though: I see now that if I had taken the
trouble to look at the code myself back then, so I would know what I'm
talking about, it would have saved us both a lot of time...
The positive side of this situation is that going astray gave me the
occasion to learn more :-) I only wish that learning this way had not
slowed down the implementation of namespace-based translator selection
so much.

Regards,
scolobb
