gnu-arch-users

Re: cvs2arch (was Re: [Gnu-arch-users] an hack.. one night long)


From: Tom Lord
Subject: Re: cvs2arch (was Re: [Gnu-arch-users] an hack.. one night long)
Date: Sat, 23 Aug 2003 13:24:48 -0700 (PDT)


    > From: wave++ <address@hidden>

    > > Very interesting how the tla commit time closely tracks the additional
    > > size of the {arch} directory tree. I guess the only change there should
    > > be the addition of one logfile, which should be very quick for diff.

    > In fact, tla should only diff the pristine tree against the local
    > sources and produce a patch. (Am I missing something?).  This should be
    > almost constant over time considering that the size of the sources isn't
    > changing much.

The number of disk blocks occupied by the trees increases at almost
exactly the same rate as `commit' time.  The cumulative size of the
source files increases at a smaller rate.
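
One rough way to check this on a given tree is to compare the bytes the
files contain with the blocks the file system allocates for them. The
sketch below is not part of tla; `tree_usage` is a made-up name, and it
assumes a Unix-like system where `st_blocks` counts 512-byte units.

```python
import os

def tree_usage(root):
    """Return (file_count, apparent_bytes, allocated_bytes) for a tree.

    apparent_bytes  -- sum of the sizes of the regular files' contents
    allocated_bytes -- disk space the file system reserved for files
                       and directories (st_blocks is in 512-byte units)
    """
    file_count = 0
    apparent = 0
    allocated = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Directories occupy blocks too, and a tree with many small
        # files tends to have many directories.
        allocated += os.lstat(dirpath).st_blocks * 512
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            file_count += 1
            apparent += st.st_size
            allocated += st.st_blocks * 512
    return file_count, apparent, allocated

if __name__ == "__main__":
    import sys
    count, apparent, allocated = tree_usage(sys.argv[1] if len(sys.argv) > 1 else ".")
    print("files:", count)
    print("apparent bytes:", apparent)
    print("allocated bytes:", allocated)
```

Running it on the tree at a few different revisions would show whether
allocated space grows faster than the file contents do.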

The near perfect correlation of disk-space-for-one-tree with revision
number is a peculiarity of your particular project (though many
projects seem to exhibit a similar correlation for a bounded period of
their history).

So: you'll be wanting the inode signature optimization -- to spare tla
from reading so many files.  As I mentioned, there's even a first cut
of it in the patch queue.
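
For readers unfamiliar with the idea, here is a minimal sketch of an
inode signature check. It is an illustration only, not the code in the
patch queue: the function names and the choice of stat fields are
assumptions. A signature is recorded for every file when the reference
tree is known to match, and later only the files whose signature differs
have to be opened and compared.

```python
import os

def inode_signature(path):
    """Cheap fingerprint of a file taken from stat data alone."""
    st = os.lstat(path)
    return (st.st_dev, st.st_ino, st.st_size, st.st_mtime_ns, st.st_ctime_ns)

def snapshot(root):
    """Map each file's path (relative to root) to its inode signature."""
    sigs = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            sigs[os.path.relpath(full, root)] = inode_signature(full)
    return sigs

def files_needing_diff(root, saved):
    """Compare the tree against saved signatures without reading contents.

    Returns (changed_or_new, removed) as sorted lists of relative paths.
    Only the files in changed_or_new need to be read and diffed.
    """
    current = snapshot(root)
    changed = sorted(p for p, sig in current.items() if saved.get(p) != sig)
    removed = sorted(p for p in saved if p not in current)
    return changed, removed
```

The cost of the check is one stat call per file rather than a full read,
so it scales with the number of files, not with their contents. The
trade-off is that it trusts stat data: a change that leaves every field
in the signature untouched would go unnoticed.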

You _might_ be wanting a better file system -- one that imposes less
overhead, not in disk space, but in the amount of I/O needed to read
the tree.  Such file systems have been around for a long time in the
BSD world.  I'm not sure why they haven't been a higher priority in
the Linux world.


    > >> arch @ patch 280 is roughly 8 times slower than arch @ patch 5 in this
    > >> case.

    > > Notice how it closely tracks the size of the complete archive, while the
    > > size of the actual source tree isn't changing much. Either you do not
    > > have a pristine tree and tla has to check out the last head revision on
    > > every commit, or something fishy is going on in the code that tries to
    > > diff the logs for the working directory with those of the pristine tree.
    > > Only one file should be added and that can not be all that expensive for
    > > diff although I guess that tla is reading all the data in both trees
    > > twice, once by tla because of a bad call to safe_stat in libarch/diffs.c
    > > and the second time by diff because timestamps are not trusted.

    > Seems like it tries to do something with old patches (like regenerating
    > the head every time).

Why do you believe that?



    > I'll now try to profile it and see what happens. I already converted
    > almost all my cvs repositories without problems.

    > Meanwhile, anyone aware about a recent "tla tag" bug/issue?

Is it in the bug database?

-t



