[patch #4442] Defer `cvs add' of directories until first file committed

From: Derek Robert Price
Subject: [patch #4442] Defer `cvs add' of directories until first file committed or run adds of directories through commitinfo
Date: Mon, 19 Sep 2005 18:30:22 +0000
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6


                 Summary: Defer `cvs add' of directories until first file
committed or run adds of directories through commitinfo
                 Project: Concurrent Versions System
            Submitted by: dprice
            Submitted on: Mon 09/19/05 at 18:30
                Category: None
                Priority: 5 - Normal
                  Status: None
                 Privacy: Public
             Assigned to: None
        Originator Email: 
             Open/Closed: Open
           Fixed Release: None
   Fixed Feature Release: None



 Change 'cvs add' in one of two ways: either run it through a
    commitinfo trigger for new directories (a new directory also
    needs to get a CVS/Template populated when it is created; that
    it does not seems to be a minor bug), or go with Greg A Woods'
    <woods@weird.com> suggestion to defer the connection to
    the server until 'cvs commit' has been issued (i.e.,
    make the client/server semantics of cvs add symmetric
    with those of cvs rm). The thread is here:
    There is a long-standing issue here:

From https://ccvs.cvshome.org/issues/show_bug.cgi?id=2:

Directories, administration directories, and maybe administration files are
created in the temporary directory on the server for many operations, whether
or not it is determined that any files will be created.  It should be possible
to delay this until it is deemed necessary.

------- Additional comments from dprice@h68.sny.collab.net Tue Jul 17
06:48:37 -0700 2001 -------

From Brian Behlendorf:

Subject: Re: A big thanks!
   Date: Tue, 17 Jul 2001 03:20:00 -0700 (PDT)
   From: Brian Behlendorf <brian@collab.net>
     To: <Joseph_Kesselman@lotus.com>
     CC: <infrastructure@apache.org>, <dprice@collab.net>

On Mon, 16 Jul 2001 Joseph_Kesselman@lotus.com wrote:
> Seconded, and seconded -- it's a definite improvement!


So, tonight I looked at an interesting issue.  It gets back to how
much of a dog CVS really is, and now that it's isolated from
everything else, its dog-nature really comes out.

CVS checkouts and updates for large repositories are completely bound
in time by the speed of creating directories.  At least over pserver
(I'm not sure about CVS over SSH via CVS_RSH), a cvs update on a
repository module causes a dir in /tmp to be created that creates a
directory structure corresponding to the entire directory structure of
that module.  So for example, on jakarta-turbine where there's 583
different directories, add to that copying the CVS dir as well (which
it does), and an Entries, Repository, and Root file for every one of
those CVS dirs, and it's thousands of write operations, along with
corresponding read operations, and of course a read on every ,v file
in the repos.  This happens *even when there are no deltas to send at
all*.  And of course, that temp tree is deleted when the transfer is
completed.  That's why cvs updates appear to stall for a long time
right at the end.

So tonight, CVS checkouts seemed to be as slow as they were on
daedalus, so I decided to run "vmstat 5" on icarus.  da0, the disk
with /home on it, was pegged at 180-250 ops/sec.  That is a pretty
good ops/sec rate, I've been told, for 6ms drives.  Yes, I have
soft-updates turned on. =)

I turned off cvs completely, and killed all the current cvs processes
(sorry Sam).  ops/sec plummeted to zero.  I let one CVS process in,
to do a CVS update on xml-site (my test).  It pegged the ops/sec at
180-250 for the entire amount of time it took to update - about 30
seconds.  I tried it again, letting in 3 more CVS clients.  With four
clients banging away at doing their own cvs updates, it took roughly 2
minutes.  So any time we have concurrency, we start paying a hefty

There's got to be a way to avoid that build-the-directory-structure
step at the beginning.  I've cc'd Derek Price, who is a CollabNet
employee and CVS developer, who has been looking at a couple other
performance issues for us, seeing if there are easy wins for us to
tackle before Subversion's ready.  Derek, has there been any
discussion of the cost of this directory building, or ways to avoid
it?  What's the justification?
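
A back-of-the-envelope sketch of the write counts described above for
jakarta-turbine. It assumes one write operation per directory or file
created, and one more per entry when the temporary tree is deleted; the
function name and that cost model are illustrative assumptions, not
measurements from the CVS server.

```python
def temp_tree_ops(module_dirs, admin_files=("Entries", "Repository", "Root")):
    """Estimate filesystem writes for mirroring a module into a temp tree.

    Assumed cost model (hypothetical): 1 op per mkdir, 1 op per file
    create, and 1 op per entry removed when the tree is torn down.
    """
    # Each module directory is created, plus one CVS/ subdirectory inside it.
    mkdirs = module_dirs * 2
    # Each CVS/ subdirectory receives the administrative files.
    file_creates = module_dirs * len(admin_files)
    creates = mkdirs + file_creates
    return {
        "mkdirs": mkdirs,
        "file_creates": file_creates,
        "creates": creates,
        # The temp tree is deleted afterwards, so every entry is touched again.
        "creates_plus_deletes": creates * 2,
    }


# 583 directories, as quoted for jakarta-turbine.
estimate = temp_tree_ops(583)
print(estimate["creates"])               # 2915
print(estimate["creates_plus_deletes"])  # 5830
```

Under these assumptions the count lands in the "thousands of write
operations" range the message mentions, before counting any reads.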


------- Additional comments from dprice@h68.sny.collab.net Tue Jul 17
07:00:31 -0700 2001 -------

From Larry Jones:

Subject: Re: creating and removing and creating and removing....
   Date: Mon, 2 Jul 2001 14:51:41 -0400 (EDT)
   From: larry.jones@sdrc.com (Larry Jones)
     To: perry@wasabisystems.com (Perry E. Metzger)
     CC: bug-cvs@gnu.org

Perry E. Metzger writes:
> How hard would it be to make -d only create directories that need
> creating while the tree is being walked (and not ones that will end
> empty) and have -P remove directories as the tree is being walked so
> it doesn't have to walk the tree yet again?

Very.  There really isn't any way to know in advance whether the
directory is going to end up empty or not, so you'd either have to
duplicate most of the work of checkout without really creating
anything, or you'd have to arrange to defer all of the directory
creation until the first file.

-Larry Jones
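
The second option in the reply above, deferring all directory creation
until the first file, can be sketched roughly as follows. This is a
minimal illustration in Python rather than CVS's own C code; the
DeferredTree class and its method names are hypothetical, and it ignores
the CVS administrative files entirely.

```python
import os
import tempfile


class DeferredTree:
    """Create directories only when a file is about to be written in them."""

    def __init__(self, root):
        self.root = root
        self.made = set()  # parent directories already created

    def enter_dir(self, rel_dir):
        # Walking into a directory creates nothing on disk.
        pass

    def write_file(self, rel_path, data):
        full = os.path.join(self.root, rel_path)
        parent = os.path.dirname(full)
        if parent not in self.made:
            # First file for this directory: create it and any missing parents.
            os.makedirs(parent, exist_ok=True)
            self.made.add(parent)
        with open(full, "w") as fh:
            fh.write(data)


def demo():
    with tempfile.TemporaryDirectory() as root:
        tree = DeferredTree(root)
        tree.enter_dir("src/empty")   # never receives a file
        tree.enter_dir("src/lib")
        before = sorted(os.listdir(root))
        tree.write_file("src/lib/foo.c", "int x;\n")
        after = sorted(os.listdir(os.path.join(root, "src")))
        return before, after


print(demo())  # ([], ['lib'])
```

Directories that never receive a file are never created, so nothing has
to be pruned afterwards. The sketch does not show how hard this would be
to retrofit into the existing tree walk, which is the difficulty the
reply points to.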


Reply to this item at:


  Message sent via/by Savannah
