
From: Eric Siegerman
Subject: Re: Bidirectional repository synchronization with CVSup - how?
Date: Sat, 22 Sep 2001 01:18:59 -0400
User-agent: Mutt/1.2.5i

On Fri, Sep 21, 2001 at 03:40:23PM -0500, Art Eschenlauer wrote:
> We need to discover or learn about a method whereby changes checked
> into one repository may be pulled into copies of that repository
> located at our multiple other locations; in effect, each repository
> must "become the master" for changes submitted to it so that those
> changes can be replicated to other repositories.

As Greg says in another message, this isn't going to work too well.

> By the way, 
>  - we are using compression, but it doesn't eliminate our problem
>  - we have found that the performance is too low to have everyone
>    work from or commit to a single master repository

If the reason you want replication is simply to offload a
saturated server and its Internet connection, here's a kludgy
sort of hybrid scheme you might try.  I've never done this, but I
don't see why it wouldn't work.  The idea is to distribute the
repo -- and thus the workload -- but without replicating it.

Put some of your project's modules in a CVS repo on Server 1, and
others on Server 2 (generalize to some appropriate N servers).
The servers don't all have to be in one place; all you need is to
make sure each user has SSH access to them all (or whatever
connection method you like, but SSH is by far the safest).

When you create a new sandbox, you have to cobble it together
with a series of "cvs -d <wherever> co" commands to the various
repositories (I'd write a script to automate it).  But once the
sandbox has been set up, normal CVS operations (commit, update,
log, diff) will transparently fan out to the various servers
without further user intervention (except typing passwords; see
below).
If there's significant locality of reference as to what people
work on, you can optimize things by putting each module on a
repository close (network-topology-wise) to its major users.
E.g.  if people at location 1 do most of the work on module A,
while people at location 2 only touch it occasionally, put module
A's CVS repository on a machine at location 1.  Then, the people
who use it most will access it at LAN speeds, and only the
occasional users will have to suffer Internet-speed access.

If people at location 1 are the *only* ones to change module A,
and everyone else just reads it, the other sites can optimize
still further: they can keep one checked-out working copy of that
module in a central, read-only location, with the users'
sandboxes linked to it (via symlinks, Makefile vpath's, or
whatever suits you).  Or you could use CVSup to propagate changes
*uni*directionally from each module's master repo to slave repos
at each remote site (use file permissions to enforce that the
slave repos are read-only, otherwise things will get completely
bollixed up!).  Either way, a given location only has to pay the
price of updating each remote module once, rather than once per
user, and each server has to serve correspondingly fewer updates.
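Enforcing the read-only rule can be as crude as stripping write
permission from the whole slave tree; here's a sketch (the repository
path is an invented example):

```shell
#!/bin/sh
# Hypothetical sketch: strip write permission from a slave repository
# so that accidental commits against it fail immediately.

make_slave_readonly() {
    # $1 = path to the slave repository tree.
    # Remove write permission for everyone, recursively; read and
    # execute (directory-search) bits are left alone.
    # Note: CVS takes read locks by creating lock files in the repo,
    # so you may need to point LockDir (in CVSROOT/config) at a
    # writable directory for read operations to keep working.
    chmod -R a-w "$1"
}

# Example (path is an assumption):
# make_slave_readonly /var/cvs/slave-repo
```

Remember to re-run this after each CVSup pass if your propagation
mechanism recreates files with write permission.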

One downside to this scheme is that every time a CVS operation
touches a new repo, the user will have to type their passphrase
again.  That's annoying enough when there's only one repo!  But
if you choose SSH as your connection mechanism, people can use
ssh-agent (part of the SSH package) to get around the problem.
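The ssh-agent dance is roughly the following; the key path is an
assumption, so substitute whichever identity file you actually use.
You type the passphrase once per login session, and the agent answers
all later challenges from the various CVS servers.

```shell
#!/bin/sh
# Sketch: run once at the start of a login session so CVS-over-SSH
# stops asking for the passphrase on every operation.

# Start an agent if one isn't already running in this session.
if [ -z "$SSH_AUTH_SOCK" ]; then
    eval "$(ssh-agent -s)"
fi

# Load the key (path is an assumption); prompts for the passphrase
# once, then the agent holds the decrypted key in memory.
if [ -f "$HOME/.ssh/id_rsa" ]; then
    ssh-add "$HOME/.ssh/id_rsa"
fi
```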


|  | /\
|-_|/  >   Eric Siegerman, Toronto, Ont.        address@hidden
|  |  /
The world has been attacked.  The world must respond ... [but] we must
be guided by a commitment to do what works in the long run, not by what
makes us feel better in the short run.
        - Jean Chrétien, Prime Minister of Canada
