info-cvs



From: Bulgrien, Kevin
Subject: RE: Large file actual performance report; cvs use of ,v header is sometimes non-optimal.
Date: Fri, 18 Jan 2008 09:12:52 -0600

> Also considering the age of his copy of CVS

It and older versions have served us well for many years.  It is
in use here because commercial packages in past years trashed
repositories.  CVS has sustained no losses that were not
human induced.  This was not a criticism of CVS in the
least.

> I would have to 
> wonder what kind of specs does the machine have?
> 
> He does give some of them...
> "* Server is old (Dell PowerEdge 2300; Dual Pentium II 400 
> MHz; 1 GB RAM,
>      disks are SCSI RAID) but server idles with very little 
> CPU usage."
> 
> Guess:
> 3+ year old machine, working for a big contractor...
> http://www.dell.com/content/topics/global.aspx/corp/pressoffice/en/1999/1999_02_26_rr_003?c=us&l=en&s=corp

Your guess is on par.  The machine is an IT throwaway.  This design
group runs their own servers.  If you think that one is bad, you
should see the backup system if this one goes down.  We have never
had to control files this "big" before, and the system has been fine
for years, with no complaints on speed.

The system has 5-6 regular users, none of whom use X.

> Data on a RAID array capable of sustained 20MB/s

Right.  Onboard RAID controller configured with OS on one mirror array
and data on the other array.  /tmp is on the OS array.

> /tmp on the boot drive, capable of ~10MB/s, and with very 
> little space to 
> spare (i.e., less than 150% the size of the WHOLE repository)

/tmp = 4x total repository size.

> Possibly working across a 100Mb/s (12MB/s theoretical) network
> using NFS as the sandbox directory.

Local sandbox directory.  Remote sandboxes were also used, and I did
try working locally to see if it made a noticeable difference.  It did
not, though I wasn't measuring: the long operations were left to run
overnight either way.  The time was consumed on the server itself even
for remote ops, as the sandbox was not on a mounted directory.

> The purpose of this Guess was not to denigrate, it was to 
> point out OTHER 
> things that can affect the speed at which CVS accomplishes it's work.
> Connection method can make a difference too.

Right, and quite helpful I might add, instead of pointing out that best
practices weren't followed, which BTW was fully disclosed in the OP.
Note there was nothing in the OP that denigrated CVS.  It was an FYI
e-mail with the developer being blamed for all but the failure to
traverse the ,v file header.

> And a bit about the troubling file
> "the repository size of the file is on the order of 315 MB."

Yes, indeed, top showed that.

> which means that there is less than 650MB of ram left 
> (assuming nothing else 
> like X is running on the machine, and tmp is not a ram file 
> system) to process 
> the file (and  IIRC CVS copies the file to tmp first so a bit 
> more is gone). 
> How much swap was being used before, during and after the operation???

X was not active except that the console does have a graphical login
manager running.

Swap wasn't significantly affected.  Normal swap usage is < 3MB.  During
the operations I looked at it and do not recall exact figures, but did
note that, AFAICT, it was not significant: less than 10MB, if above
normal at all.

> Does this server serve anything else
>   NFS, SAMBA, IMAP, HTTP, PostgreSQL

NFS
  No.

Samba
  Yes.  Load is intermittent, but rarely significant.  Some developers use
  samba shares as their working directories, but this usage was handled
  well enough by the second processor (except, of course, that the disk
  accesses would have been on the same controller as the repo).

IMAP
  No.

HTTP
  Yes. Very light internal use.  Normal is max of two idling servers with
  no active connections.

PostgreSQL
  Yes. Very light.  Supports a very lightly used application.

> It might not be eating the processors, but each of these requires RAM.

Most assuredly.

> Then there is the troubling bit about
> {so the next commit was destined to fail since the working directory
> CVS/Entries file is now erroneous and marks the working 
> directory file as
> a new deleted revision 1.25.  See the prior post "FYI: cvs can break a
> checked out working directory" for details.  }
> Along with:
> {The commit of the new 1.23 should have been quite fast (on 
> the order of
> minutes) because it is took place at the top of HEAD, but 
> instead it takes
> perhaps 8-12 hours, and, in fact, fails with an error saying 
> 1.25 can not
> be found.  This is the situation where the title "cvs use of 
> ,v header is
> sometimes non-optimal" comes to play.}
 
> 1) I suspect this could be solved with either an sandbox 
> update or new 
> checkout.  The manual should probably indicate something 
> along the lines of 
> "If you are dangerous enough to mess with the dragons in the 
> `cvs admin` 
> command, then (before and) afterwards you need to tell all 
> users 'Hey! You out 
> of the pool! do a new checkout before attempting any new work.'"....
> OH wait!! Quoth the book of cederqvist
> file:///tmp/cederqvist-1.11.21.html/cvs_16.html#IDX236
> last paragraph:
> "Make sure that no-one has checked out a copy of the revision 
> you outdate. 
> Strange things will happen if he starts to edit it and tries 
> to check it back 
> in. For this reason, this option is not a good way to take 
> back a bogus commit".

Well, sure.  The sandbox was borked, but this wasn't realized
until after the operation.  And of course, per the reference
article, update after admin -o would have prevented the problem.
We are all human, and we all make occasional oversights.  All of
this was fully disclosed in the OP and in referenced posts.
It is amazing how a poster can admit to something and people still
hammer on it anyway, as if that makes anyone want to post to the
list.  At least this comment has facts and manual references, so it
comes across as less personal.
 
> 2) I tend to agree that CVS should have at least checked the 
> repository for 
> 'is 1.25 still the head/does it still exist?' on commit 
> before working on 
> anything else, because it always checks to see if you need to 
> do an update 
> before commit.  Is this already the case in newer CVS's?
> And before someone says 'but he messed up the repo', which I 
> agree... it 
> should be trivial while checking 'if [current repo HEAD rev > 
> current Entries 
> rev]' also check 'if [current repo HEAD rev < current Entries 
> rev]; {throw 
> error/check to see if Entries rev still exists}'... at least 
> for someone who 
> knows where the first test is taking place now in the code.
> [Arthur, if you look this up in CVSNT, please drop a note to 
> what file it is 
> in for CVSNT, and the 10 lines around it... If I have to make 
> a patch that 
> would maybe help me find it in CVS.]

I did not even go so far as to say CVS should have done anything.
For all I know, the traversal of the diffs may have been
intentional, even though I could not discern that.
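
For what it is worth, the two-sided check proposed in the quoted text
above can be sketched roughly as follows.  This is only an illustration
of the idea, not CVS code: the function names and return strings are
made up for the example, and it assumes plain dotted trunk revisions
such as 1.22 or 1.25 (no branch or tag handling).

```python
# Hypothetical sketch of the proposed pre-commit check; not taken from CVS.
# Assumes simple dotted trunk revision numbers like "1.22" or "1.25".

def parse_revision(rev: str) -> tuple[int, ...]:
    """Turn a dotted revision such as '1.25' into the tuple (1, 25)."""
    return tuple(int(part) for part in rev.split("."))


def check_sandbox_revision(entries_rev: str, head_rev: str) -> str:
    """Compare the revision recorded in CVS/Entries with the repository HEAD.

    Comparing integer tuples avoids the string-comparison trap where
    "1.9" would sort after "1.10".
    """
    entries = parse_revision(entries_rev)
    head = parse_revision(head_rev)
    if head > entries:
        # The existing kind of check: the sandbox is behind the repository.
        return "needs-update"
    if head < entries:
        # The proposed extra check: the sandbox claims a revision that no
        # longer exists, e.g. after revisions were outdated with admin -o.
        return "entries-ahead-of-head"
    return "up-to-date"
```

With the numbers from this thread, a sandbox whose CVS/Entries still
says 1.25 against a repository whose HEAD is back at 1.22 would be
flagged as "entries-ahead-of-head" before any lengthy work starts.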

Kevin R. Bulgrien




