bug-hurd

Re: purpose for xmlfs improvment -GSoC application -


From: olafBuddenhagen
Subject: Re: purpose for xmlfs improvment -GSoC application -
Date: Wed, 30 Apr 2008 01:11:49 +0200
User-agent: Mutt/1.5.17+20080114 (2008-01-14)

Hi,

On Mon, Apr 21, 2008 at 07:27:55PM +0200, Charly Caulet wrote:

> I have wrote some possibilities of xmlfs improvment in that document
> (http://deux-fleurs.net/analyse-xmlfs).

Usually it's better to include the information directly in the mail body
rather than linking to it -- that makes it easier to read and to reply...

So, here we go:

>     Each representation of a node has a name composed with the node's
>     name, a sharp "#", and and Unique ID.

If the ID is a number unique across all elements, it probably makes more
sense to have it first... But we already discussed options for that in
the other subthread :-)

>     As we said before, each node has an unique ID. We must be able to
>     say which node is before which one in the XML file. So in each
>     node, you have a .hierarchy file that will content all the IDs of
>     the current's node children.

I agree with Fredrik that this is too cumbersome. And it has the same
problem as all the schemes proposed in the other subthread using static
numbers. (It *would* be possible to make it reassign numbers on changes,
but that would be gross...)

Also, this scheme doesn't guarantee atomicity: What about nodes that are
already created, but not in the structure; or the other way around?...
I'm not sure whether this is a serious problem, but it should be
considered.

Furthermore, I don't think it can be implemented efficiently, as the
translator would have to re-check the order of all nodes on any change
to the structure file...
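
To make the atomicity concern a bit more concrete, here is a rough
sketch in Python (not xmlfs code; the function name and the "name#ID"
entry format are just assumptions taken from the proposal above). It
compares the entries in a node's directory with the IDs listed in its
.hierarchy file, and reports both kinds of mismatch:

```python
def check_hierarchy(entries, hierarchy_ids):
    """Compare directory entries ("name#ID") with the IDs in .hierarchy.

    Hypothetical helper: returns (orphans, dangling), both sorted.
    """
    # IDs of the nodes that actually exist as directory entries
    present = {name.rsplit("#", 1)[1] for name in entries if "#" in name}
    listed = set(hierarchy_ids)
    orphans = sorted(present - listed)    # created, but not in the structure
    dangling = sorted(listed - present)   # in the structure, but not created
    return orphans, dangling


entries = ["title#1", "para#2", "para#3"]
hierarchy = ["1", "3", "4"]
print(check_hierarchy(entries, hierarchy))
```

Any window between creating the entry and updating .hierarchy (or the
other way around) shows up here as a non-empty list.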

>     When a node contents text, it is placed in a "text#<ID>" file.

As there is no syntactic difference between text nodes and element nodes
in this scheme, you need to use some special character in the name to
avoid ambiguity. ("text" could just as well be an element named
"text"...)
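
A possible way out, sketched in Python: XML element names cannot start
with a ".", so a ".text" prefix can never collide with an element. The
helper names and the ".text#<ID>" convention are only assumptions for
illustration, and the name check is a simplified ASCII-only version of
the XML Name rule:

```python
import re

# Simplified, ASCII-only approximation of the XML "Name" production
# (assumption: the real rule also admits many non-ASCII characters).
XML_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*\Z")


def element_entry(name, node_id):
    """Directory entry for an element node (hypothetical naming scheme)."""
    if not XML_NAME.match(name):
        raise ValueError(f"not a valid element name: {name!r}")
    return f"{name}#{node_id}"


def text_entry(node_id):
    """Entry for a text node; the leading "." cannot begin an element name."""
    return f".text#{node_id}"


def classify(entry):
    """Return (kind, id) for an entry produced by the helpers above."""
    name, _, node_id = entry.rpartition("#")
    return ("text" if name == ".text" else "element"), node_id


print(classify(element_entry("text", 7)))
print(classify(text_entry(7)))
```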

>     It is represented by a direcotry named ".comment#<ID>". This
>     directory has the same properties than an Element. I have chosen
>     to don't represent comments by a simple file because it wouldn't
>     have been easy to comment a part of the xml file.

I'm not convinced this is a good enough reason to deviate from DOM here,
and treat comments differently from everything else...

>     With the explained file system, the path to access XML elements
>     might be close from DOM. 

Not really.

For one, in DOM IIRC the order of the nodes is established by a linked
list, not by any numbering scheme and/or additional structure
descriptions. While these might be equivalent at some very abstract
level, we need to consider the possibility of sticking more closely to
DOM -- using next/previous symlinks in the nodes for example...
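
Just to illustrate the symlink idea, a small Python sketch on an
ordinary filesystem (no translator involved). The link names "next" and
"prev" and both helper functions are assumptions; it also assumes the
name of the first child is known, which a real layout would have to
expose somehow:

```python
import os
import tempfile


def build_children(parent, names):
    """Create one directory per child and chain them with next/prev links."""
    for name in names:
        os.mkdir(os.path.join(parent, name))
    for prev, nxt in zip(names, names[1:]):
        # Relative targets, so the tree stays valid if it is moved.
        os.symlink(os.path.join("..", nxt), os.path.join(parent, prev, "next"))
        os.symlink(os.path.join("..", prev), os.path.join(parent, nxt, "prev"))


def document_order(parent, first):
    """Follow the "next" links, starting at the child named `first`."""
    order, current = [], first
    while True:
        order.append(current)
        link = os.path.join(parent, current, "next")
        if not os.path.islink(link):
            return order
        current = os.path.basename(os.readlink(link))


with tempfile.TemporaryDirectory() as root:
    build_children(root, ["title", "intro", "body"])
    result = document_order(root, "title")
print(result)
```

No numbering is involved: inserting a node only means rewriting the
links of its two neighbours.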

Inserting nodes is another point. It's been a while since I used DOM, so
I'm not sure about the details; but IIRC that is done by invoking methods
like insert_before() or something like that on a node...

Now you may wonder how to invoke methods with a directory structure...
Nothing easier than that: Remember that while the translator exports
something looking like a traditional directory structure, there is
absolutely no reason for it always to behave like a boring old
filesystem -- some magic can spice it up considerably ;-) You could for
example have magic entries like .insert_before, and if something is
written there, a new node automatically pops up in the tree structure...
Or maybe do something strange on the prev/next links. Many possibilities
here.
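
A toy model of such a magic entry in Python, purely in memory. The
class, the write() hook and the exact semantics of .insert_before are
assumptions for the sake of the example, not how xmlfs or libnetfs
actually work:

```python
class ElementNode:
    """In-memory stand-in for a directory exported by the translator."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def append(self, name):
        child = ElementNode(name, self)
        self.children.append(child)
        return child

    def write(self, entry, data):
        """Handle a write to one of this node's entries."""
        if entry == ".insert_before":
            if self.parent is None:
                raise ValueError("the root node has no siblings")
            # The written data names the new element; it is created as a
            # sibling placed directly in front of this node.
            sibling = ElementNode(data.decode().strip(), self.parent)
            siblings = self.parent.children
            siblings.insert(siblings.index(self), sibling)
            return sibling
        raise FileNotFoundError(entry)


root = ElementNode("doc")
root.append("intro")
body = root.append("body")
body.write(".insert_before", b"summary\n")
print([child.name for child in root.children])
```

From the shell this would correspond to something like writing
"summary" into body/.insert_before.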

Of course, it would be somewhat awkward if, instead of just creating the
directory for the new node directly where the node is to live, you have
to do it indirectly through some magic entry in another node. It's a
tradeoff between being closer to DOM, and being closer to a traditional
filesystem; and requires very careful consideration to find the best
compromise. (In some cases, it might be possible to implement two
different methods, so the user can choose...)

> XPath's syntax is different so I don't really know how to translate
> complex requests like
> '~/DirWhereIsSetTheTranslator[@goudi:name="youhou"]' maybe will we
> have to code a layer between the 'mkdir', 'grep', 'cd' (...) commands
> and the user to be able to use "DOM://dompath" and "XPATH://xpath"
> paths.

There is no need for any special layer. The translator can parse the
XPath expressions just fine. The only problem with that is that a single
node can be represented in XPath in many different ways (endless ways
actually); so it's not possible to represent every possible XPath
expression as a static file that shows up on ls. Rather, they need to be
implemented as virtual paths that can be accessed only when explicitly
specified.
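
Roughly what I mean, as a Python sketch. It uses the limited XPath
subset of the standard xml.etree.ElementTree module as a stand-in for a
real XPath engine; readdir(), lookup(), the "tag#index" entry names and
the sample document are all made up for the example:

```python
import xml.etree.ElementTree as ET

DOC = ET.fromstring(
    '<doc><item name="youhou">first</item><item name="other">second</item></doc>'
)


def readdir(node):
    """What ls would show: only the static child entries."""
    return [f"{child.tag}#{i}" for i, child in enumerate(node)]


def lookup(node, name):
    """Resolve a name: static entries first, otherwise try it as a path."""
    static = dict(zip(readdir(node), node))
    if name in static:
        return [static[name]]
    try:
        # Virtual path: evaluated on demand, never listed by readdir().
        return node.findall(name)
    except SyntaxError:
        return []


print(readdir(DOC))
print([e.text for e in lookup(DOC, "item[@name='youhou']")])
```

The expression resolves fine, although it never appears in the
directory listing.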

It could be possible though to pick one simple unambiguous XPath syntax
and use it for the static representation of the document contents...
That's why the task suggested considering the tree representation with a
view on both DOM *and* XPath. I'm not sure what is possible here (I know
very little about XPath), but it's certainly worth looking into.
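
One candidate for such a syntax, only as a sketch: child steps with a
per-name position, like "./p[2]". The Python below (canonical_paths()
is a made-up name, and ElementTree's XPath subset again stands in for
the real thing) generates exactly one such path per element:

```python
import xml.etree.ElementTree as ET


def canonical_paths(root):
    """Yield (path, element) pairs using only positional child steps."""

    def walk(node, path):
        yield path, node
        seen = {}  # how many children with each tag name so far
        for child in node:
            seen[child.tag] = seen.get(child.tag, 0) + 1
            yield from walk(child, f"{path}/{child.tag}[{seen[child.tag]}]")

    yield from walk(root, ".")


doc = ET.fromstring("<doc><p>a</p><note/><p>b</p></doc>")
paths = [path for path, _ in canonical_paths(doc)]
print(paths)
```

Note that the positions are counted per element name, so this form
alone does not say whether "note[1]" comes before or after "p[2]"; the
ordering question remains.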

(BTW, there is also XQuery, which AIUI is based on XPath but also allows
altering the document... Probably worth looking into as well.)

>     I think a fsck.xmlfs that uses a DTD and an xmlfs would be useful
>     to check xmlfs validity.

Basically that would just open the underlying XML file in non-translated
mode and run a normal XML validator on it... Not sure whether fsck is
really a good abstraction; but the functionality might be convenient
indeed :-)

-antrik-
