monotone-devel

Re: [Monotone-devel] basic_io inventory


From: Thomas Keller
Subject: Re: [Monotone-devel] basic_io inventory
Date: Sat, 28 Apr 2007 22:48:07 +0200
User-agent: Thunderbird 2.0.0.0 (Macintosh/20070326)

Stephen Leake wrote:
>> The problem is that grep'ing the output like this makes you assume
>> that lines in the stanza are output exactly the way they are, i.e.
>> aren't reordered, don't get additional spaces, etc. Since basic_io
>> parsers normally make absolutely _no_ assumptions about whether their
>> stanzas are properly indented, your test code shouldn't either.
> 
> Ah. That makes sense.
> 
> So if the output changes format slightly, but within the bounds a
> standard parser will tolerate, the test does not have to be updated.

Correct.
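
As a minimal sketch of such a format-tolerant check (in Python rather than the test suite's own language): each line is tokenized and the stanza is compared as a set of lines, so reordering and extra spaces do not matter. The helper names are hypothetical, the sample keys and values are made up, and `shlex` is only a stand-in tokenizer, not basic_io's real lexer.

```python
import shlex

def stanza_as_set(stanza_text):
    """Return the stanza's lines as a set of token tuples.

    shlex is a stand-in tokenizer (an assumption, not basic_io's lexer):
    it drops the double quotes, honours backslash escapes inside them,
    and keeps whitespace inside quoted values intact.
    """
    return {
        tuple(shlex.split(line))
        for line in stanza_text.splitlines()
        if line.strip()
    }

def stanza_has(actual_text, expected_text):
    """True if every expected line occurs in the actual stanza,
    regardless of line order or indentation."""
    return stanza_as_set(expected_text) <= stanza_as_set(actual_text)
```

A test would then call something like `stanza_has(output, 'path "foo"')` instead of grep'ing for the literal line.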

> Is there some documentation on these requirements for basic_io
> parsers? I found basic_io.hh and luaext_parse_basic_io.cc; not many
> comments :). I searched the wiki for "basic_io" and "basic io"; I
> found BasicIoFormalization, which doesn't say much, but hints at
> "formal docs" somewhere.

Not to my knowledge. My own basic_io parser is at least whitespace
tolerant (excluding \n) and copes with simple reordering.

> It does point to an IRC session, which says that stanzas are not
> actually meaningful. Which is a scary concept; it implies that in the
> 'automate inventory' output, the 'path' and 'old_node' lines are
> _not_ related to each other. Which is of course nonsense!

Well, a stanza in my personal definition is a set of lines, like

foo "bar"
bla [baz]

and as soon as such a line is followed by more than one single \n (i.e.
by a blank line), the next stanza starts:

stanza "first"
foo "bar"

stanza "second"
foo "baz"

All lines within such a stanza belong logically together somehow, but
there is no grammar that defines _how_ they belong together or in what
order they appear. That is up to each command which outputs these stanzas.
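
A minimal sketch of a parser following this definition, in Python: blank lines separate stanzas, surrounding whitespace is ignored, and each line becomes a (key, values) pair with no assumption about which keys appear or in what order. The function name and token rules are assumptions taken from the examples above, and it assumes that quoted strings do not span lines.

```python
import re

# One token: a "quoted string" (with backslash escapes), a [bracketed]
# value, or a bare word. This token set is an assumption based on the
# examples in this mail, not a formal basic_io grammar.
TOKEN = re.compile(r'"((?:[^"\\]|\\.)*)"|\[([^\]]*)\]|(\S+)')

def parse_stanzas(text):
    """Split text into stanzas; each stanza is a list of (key, values)."""
    stanzas = []
    current = []
    for line in text.split("\n"):
        if not line.strip():
            # A blank line ends the current stanza, if there is one.
            if current:
                stanzas.append(current)
                current = []
            continue
        tokens = []
        for match in TOKEN.finditer(line):
            quoted, bracketed, bare = match.groups()
            if quoted is not None:
                # Undo backslash escapes such as \" and \\.
                tokens.append(re.sub(r"\\(.)", r"\1", quoted))
            elif bracketed is not None:
                tokens.append(bracketed)
            else:
                tokens.append(bare)
        current.append((tokens[0], tokens[1:]))
    if current:
        stanzas.append(current)
    return stanzas
```

Because every stanza comes back as a plain list of pairs, a caller can turn it into a dict and look lines up by key, which is what makes 'path' and 'old_node' from the same stanza belong together.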

Of course, most commands write their stanzas so that each one starts
with a line identifying whatever is described there: "file" and "dir"
for the manifest format; "add_file", "add_dir" and "patch" for the
revision format. Again, this is not enforced by the basic_io format at all.

> I also browsed monotone.info for 'basic_io'. The definition of
> 'parse_basic_io' also makes it clear that stanzas are meaningless. You
> said earlier that "lines can be reordered". How then do I associate
> 'path' with 'old_node'? the "closest line"? the "previous line"?

See my definition of "stanza" above.

> One IRC comment mentioned "the basic_io grammar in the manual's
> appendix"; I don't see that in monotone.info.

Hrm... I don't see that either.

>>> What do you think about outputting 'none' explicitly, rather than
>>> leaving out 'old_node' and 'new_node'? It would make the parser more
>>> regular. But maybe there's a standard basic_io style?
>> Well, "basic_io style" is: leave out what is already implicit =)
> 
> Ok. Does that extend to "leave out what the user has requested be
> ignored" ? :)

If speed is a problem for you here, do it. I certainly care about
speed here as well, but I plan to implement this differently. While your
proposal might make reading in your current test workspaces faster,
even bigger workspaces would still take too long to scan in one go.

So my proposal, or at least my planned implementation for guitone, is
to use inotify together with restrictions. Since 4.2, Qt has provided a
nice interface for this, named QFileSystemWatcher. I would have one or
more threads in the background which watch for updates in the current
workspace and trigger mtn automate inventory for new information on the
fly. When a user hits commit, I already have all the information I need
to actually commit all files which have changed.
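
A rough sketch of the "restrictions" half of this idea, in Python rather than Qt/C++: given the workspace-relative paths a file watcher has reported as changed, work out the smallest set of directories to rescan. The function names are hypothetical, and it is an assumption here that mtn automate inventory takes those directories as path arguments to restrict its scan.

```python
import posixpath

def restriction_dirs(changed_files):
    """Smallest set of directories covering all changed files.

    changed_files: workspace-relative paths using '/' separators
    (an assumption about what the watcher reports).
    """
    dirs = {posixpath.dirname(f) or "." for f in changed_files}
    if "." in dirs:
        # A change in the workspace root means everything is rescanned.
        return ["."]

    def has_ancestor_in_dirs(d):
        parent = posixpath.dirname(d)
        while parent:
            if parent in dirs:
                return True
            parent = posixpath.dirname(parent)
        return False

    # Drop directories that a watched ancestor already covers.
    return sorted(d for d in dirs if not has_ancestor_in_dirs(d))

def inventory_argv(changed_files):
    """Command line for a restricted inventory run (assumed syntax)."""
    return ["mtn", "automate", "inventory"] + restriction_dirs(changed_files)
```

The background thread would batch up change notifications for a moment, build one such command, and merge the result into what it already knows about the workspace.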

>> Because these node numbers are completely internal. You should not rely
>> on them having specific values, because maybe they change if you
>> change the platform (32bit <> 64bit) (this is just a guess). Anyway,
>> why should you even bother? They're completely uninteresting =)
> 
> If they are truly uninteresting, they don't need to be in the output
> in the first place. 

Their only purpose here is rename matching (and currently determination
of adds and drops). Their actual value for a single node itself is
meaningless.

> The simplest way to test for that, _if_ the node ids are repeatable,
> is to check for the explicit value.

A node's id is basically its internal identification number which is
AFAIR used to track the whole lifetime of this node. So, if you test for
a value here, you should test that there are no two distinct nodes that

a) have the same node id in old_node
b) have the same node id in new_node
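
As a minimal sketch of that check, in Python: assuming each inventory stanza has already been parsed into a dict mapping keys to values (a hypothetical input shape), the test only looks for node ids that occur more than once under the same key and never at their concrete values.

```python
from collections import Counter

def duplicate_node_ids(stanzas, key):
    """Return the node ids that appear under `key` in more than one stanza.

    stanzas: list of dicts, one per parsed stanza (hypothetical shape).
    key: "old_node" or "new_node"; stanzas lacking the key are skipped,
    since the line is left out when it does not apply.
    """
    counts = Counter(s[key] for s in stanzas if key in s)
    return sorted(node_id for node_id, n in counts.items() if n > 1)
```

A test would assert that both `duplicate_node_ids(stanzas, "old_node")` and `duplicate_node_ids(stanzas, "new_node")` are empty.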

> A more complex way would be to use the node_id to retrieve some
> information about the file, and check that it matches.

There are no other interfaces which take a node_id as argument.

> I suggest we check for the explicit value now, with a comment in the
> test about how we are not sure that is truly portable, and suggestions
> on what to do if it turns out not to be. There's enough work to do
> without looking for trouble.

Ok.

Thomas.

-- 
ICQ: 85945241 | SIP: 1-747-027-0392 | http://www.thomaskeller.biz
> Guitone, a frontend for monotone: http://guitone.thomaskeller.biz
> Music lyrics and more: http://musicmademe.com



