[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Monotone-devel] Merging suspend branch
From: |
William Uther |
Subject: |
Re: [Monotone-devel] Merging suspend branch |
Date: |
Sat, 14 Jul 2007 21:03:42 -0700 |
Zack Weinberg wrote:
On 7/14/07, William Uther <address@hidden> wrote:
I've been playing with the branch, and it does slow down listing
branches quite a bit. Listing branches isn't too common an
operation, but I switched things up to try to help this:
- The main 'suspend' cert change is now in
project_t::get_branch_heads().
- I've added a new lua function: get_branch_heads() which gets the
branch heads :)
- I've added a default ignore_branch() lua hook which checks if
there are any branch heads and
ignores the branch if there are not.
The result is basically the same as before, except that you can
choose to switch off the checking for heads behaviour when listing
branches (by changing the lua hook), which makes makes quite a speed
difference (if you do this operation a lot).
I gotta say I'm not real enthusiastic about this hook. If the only
reason anyone would ever change it is to speed up an operation at the
expense of clarity in the results, I'd argue that we should fix the
performance problem instead... Can you think of any other reason
someone would change it?
The ignore_branch() hook already existed. It just didn't have a default
implementation. I moved some of my old patch into a new default
implementation.
I'm not particularly enthusiastic about the get_branch_heads() lua
function though. If there
are any errors in there, then it'll pop out as a lua failure but all
the debug output gets lost.
I then started looking at why finding branch heads is so slow. The
answer is simple: project_t::get_branch_heads() first gets every
revision in the branch, then erases ancestors. Getting the heads of
every branch means considering every revision in the DB. It might be
nice some time in the future to cache the heads of each branch in
the db.
The information is already kinda-sorta in the database ... there's a
"revision_ancestry" table that just contains (parent, child) pairs.
You can pull out all revisions that are not parents of any other
revision with one SQL query:
SELECT child FROM revision_ancestry WHERE child NOT IN (SELECT
parent FROM revision_ancestry)
-- on my local database, that executes in 0.2 seconds (including time
to format the results neatly and write them to /dev/null).
OOOoooo
That's a much better solution :)
I'll change things back the way they were with that in there...
hrm. We'll need to be a little trickier. If a head of a branch has
a child in another branch then your query not return it, but it could
still be a head.
I've never done a lot of SQL, but I think we want something like
(and if someone were to clean this up, that would be good):
SELECT revision_certs.id FROM revision_certs,
revision_ancestry // get the id of revs
WHERE revision_certs.id =
revision_ancestry.child // that are
children in the ancestry table
AND revision_certs.name = ? AND revision_certs.value
= ? // and are in the branch we're looking for
AND revision_ancestry.child NOT
IN // and are not
(SELECT revision_ancestry.parent FROM revision_ancestry,
revision_certs WHERE // parents in the ancestry table
revision_ancestry.child = revision_certs.id
AND // where the corresponding child's
branch
revision_certs.name = ? AND revision_certs.value
= ? // is the branch we're looking for
)
Hrm. If you have a single revision in your DB, does it get entered in
the ancestry table? Or do we need to extend this further.
And even that wont work, because you're not checking the validity of
the certs. We may not be trusting certs by specific individuals, but
we will still have removed revisions from the list of heads if an
untrusted person has committed a child.
After the revs in a branch have been gathered, this is the function
that does the work finding the heads: erase_ancestors_and_failures()
And it has this comment:
// this call is equivalent to running:
// remove_if(candidates.begin(), candidates.end(), p);
// erase_ancestors(candidates, app);
// however, by interleaving the two operations, it can in common
cases make
// many fewer calls to the predicate, which can be a significant
speed win.
//
// FIXME: once we have heights, we should use them here. The
strategy will
// be, to calculate the minimum height of all the candidates, and teach
// accumulate_ancestors to stop at a certain height, and teach the
code that
// removes items from the candidate set to occasionally rescan the
candidate
// set to get a new minimum height (perhaps, whenever we remove the
minimum
// height candidate).
I don't understand heights. Should I leave things the way they were
and wait for that FIXME to be fixed?
Cheers,
Will :-}