bug-cvs

Re: cvs server/client protocol inefficiency and privacy violations


From: Mark D. Baushke
Subject: Re: cvs server/client protocol inefficiency and privacy violations
Date: Sun, 28 Jan 2007 20:32:47 -0800

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bruno Haible <bruno@clisp.org> writes:

> Hi Mark,
> 
> > > If you wish to play with it, you may download a copy here:
> > 
> >   cvs -z3 -d:pserver:anonymous@cvs.savannah.nongnu.org:/sources/cvs co ccvs
> 
> Compiles fine on Linux/x86. But the first checkout (against savannah.gnu.org,
> which runs cvs 1.12.9) leads to an error:
> 
> $ cvs update ChangeLog 
> P ChangeLog
> cvs [update aborted]: No signature for `ChangeLog'.

Yes, the current change in behavior, where the client asks any CVS
server for the OpenPGP signatures of files by default, may be a
controversial default. Derek and I have chatted about it more than once,
but other folks who have feedback, please feel free to add your support
to one camp or the other.

> I need to use the --no-verify option to get it going.

Yes, that would be needful if you are connecting to an older CVS server
which does not support the new protocol elements.

> (Btw, the documentation of the CVS_VERIFY_CHECKOUTS environment variable
> in appendix D does not say what are the possible values for this variable.
> Should I do
>    CVS_VERIFY_CHECKOUTS=
> or
>    CVS_VERIFY_CHECKOUTS=no
> or
>    CVS_VERIFY_CHECKOUTS=--no-verify
> or what?

For myself, I use CVS_VERIFY_CHECKOUTS=warn so that I get warnings when
using a CVS server which does not support OpenPGP signatures. I also
tend to believe this should be the default behavior for now, but I
didn't write the new feature, so Derek gets to choose the defaults.

> IMO the doc of each environment variable should state
>   1. what the variable is good for (general description),

The CVS_VERIFY_CHECKOUTS environment variable is there to provide a way
to select, through the environment, which of the
--verify=[off|warn|fatal] command-line switches is applied by default to
all invocations of the new CVS client.

>   2. what are the possible values for the variable, and their respective
>      effects,
>   3. what is the default value.)

The default is --verify=fatal when the environment variable is NOT set.

As to possible values, here you go...

Looking at the sources, it seems that the following values are
available:

    VALUE           MEANING
    warn            generate a warning if checkout verification does not pass
                    (--verify=warn)

    fatal           generate a failure if checkout verification does not pass
    on              generate a failure if checkout verification does not pass
    yes             generate a failure if checkout verification does not pass
    true            generate a failure if checkout verification does not pass
    1               generate a failure if checkout verification does not pass
                    (--verify=fatal)

    off             do not try to verify checkouts (--no-verify | --verify=off)
    no              do not try to verify checkouts (--no-verify | --verify=off)
    false           do not try to verify checkouts (--no-verify | --verify=off)
    0               do not try to verify checkouts (--no-verify | --verify=off)
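
For readers who find code easier to scan than a table, here is a minimal
Python sketch of the mapping listed above. It is not taken from the CVS
sources. The function name verify_mode is hypothetical, matching is
assumed to be exact (case handling is not stated here), and raising an
error for an unlisted value is an assumption, since this message does
not say what the client does in that case.

```python
import os

# Values copied from the table above. Exact matching is an assumption;
# the real client's case handling is not described in this message.
_MODES = {
    "warn": "warn",
    "fatal": "fatal", "on": "fatal", "yes": "fatal", "true": "fatal", "1": "fatal",
    "off": "off", "no": "off", "false": "off", "0": "off",
}


def verify_mode(environ=None):
    """Return 'off', 'warn' or 'fatal' for the given environment mapping."""
    env = os.environ if environ is None else environ
    value = env.get("CVS_VERIFY_CHECKOUTS")
    if value is None:
        # Default when the variable is NOT set, as stated in the message.
        return "fatal"
    try:
        return _MODES[value]
    except KeyError:
        # Assumption: the behavior for unlisted values is not documented here.
        raise ValueError(f"unrecognised CVS_VERIFY_CHECKOUTS value: {value!r}")
```

Calling verify_mode({"CVS_VERIFY_CHECKOUTS": "warn"}) returns "warn",
which corresponds to --verify=warn.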

The output of 'cvs --help-options' gives a hint of these possible values
here:

    --verify[=(off | warn | fatal)] | --no-verify
                 Force (or forbid) OpenPGP signature verification
                 on checkout (default warns on failure).

However, I agree that it is not as explicit as it needs to be.

The lack of documentation for the CVS_VERIFY_CHECKOUTS environment
variable needs to be addressed. Thank you for the report. It would be
good to get this fixed before we release cvs 1.12.14.

> > I would be interested in learning the differences in timings you may
> > find.
> 
> There are none. Still the same ratio of 6 sec. versus 29 sec.:
> 
...elided...

I would expect no changes unless you connect to a server that
understands the new Base-* protocol operations, which have only been
introduced in the CVS trunk (1.12.13.1). So, to run a really useful test
you would need both a client and a server built from the trunk.
Otherwise, you are only going to get the backward-compatible behaviors.

> Look at this too: ChangeLog.modified is not in the CVS, but the cvs program
> needs 29 seconds to detect this:
> 
> $ time cvs --no-verify update ChangeLog.modified 
> cvs update: use `cvs add' to create an entry for `ChangeLog.modified'
> 
> real    0m29.219s
> ...
> $ time cvs --no-verify status ChangeLog.modified
> cvs status: use `cvs add' to create an entry for `ChangeLog.modified'
> ===================================================================
> File: ChangeLog.modified        Status: Unknown
> 
>    Working revision:    No entry for ChangeLog.modified
>    Repository revision: No revision control file
> 
> 
> real    0m29.202s
> ...
> 
> This means that before the cvs client asks the cvs server whether the file
> exists at all on the server side, it sends the complete locally created
> file to the server!!

Yes, that does appear to be the case. To be honest, I was not aware of
that side-effect of the 'cvs status' command.

If you wish to see what is actually being transmitted to the server, set
the CVS_CLIENT_LOG environment variable to a name like /tmp/my-cvs-test.
After the cvs command has run, two files will have been created:
/tmp/my-cvs-test.in (data sent to the server) and /tmp/my-cvs-test.out
(data returned from the server).
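
As a rough illustration of the above, the following Python sketch runs a
cvs command with CVS_CLIENT_LOG set and then reports how many bytes ended
up in each log file. The helper names run_with_client_log and log_sizes
are hypothetical, and the sketch assumes a cvs binary on your PATH and a
writable log prefix.

```python
import os
import subprocess


def run_with_client_log(cvs_args, prefix):
    """Run `cvs <cvs_args>` with CVS_CLIENT_LOG set to `prefix`."""
    env = dict(os.environ, CVS_CLIENT_LOG=prefix)
    return subprocess.run(["cvs", *cvs_args], env=env).returncode


def log_sizes(prefix):
    """Return (bytes sent to the server, bytes returned from the server)."""
    return os.path.getsize(prefix + ".in"), os.path.getsize(prefix + ".out")


# Example use (not executed here, since it needs a working copy and a server):
#   run_with_client_log(["--no-verify", "status", "ChangeLog.modified"],
#                       "/tmp/my-cvs-test")
#   sent, received = log_sizes("/tmp/my-cvs-test")
#   print(f"sent {sent} bytes, received {received} bytes")
```

If the .in file is about as large as the local file, that suggests the
file contents were uploaded.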

> > Yes, CVS 1.11.x and versions through CVS 1.12.13 will upload the
> > entire locally modified file to the server and do the diff there.
> 
> And what about privacy? Did I give "cvs" permissions to send files from
> my hard disk over the network? 

This seems to be what the 'cvs status' command assumes. The 'cvs update'
command does not send ChangeLog.modified to the server, nor does the
'cvs commit' command unless you have done a 'cvs add ChangeLog.modified'
first (which does not actually send the contents of the file to the
server).

> No, I asked cvs to *pull* updates from the server to my hard disk.

I thought you used the 'cvs status' command? The 'cvs update' command
will not have sent ChangeLog.modified to the server.

> There is no relation of trust between me and the admins of those
> machines from which I checked out a copy of some source code
> (especially anonymous checkouts).

Feel free to open a bug report against the 'cvs status' command.

> > It assumes that the CVS server has better provisioned resources for doing
> > the merge.
> 
> Hah! For two years the CVS operations with sourceforge.net have been
> becoming painfully slow. I was assuming that the hardware at
> sourceforge.net was not sufficiently dimensioned. This may certainly
> be a factor. But the other factor is that the "cvs" program
> voluntarily moves the load to the server!!

True.

> Sadly, my conclusion for today is that the cvs client/server protocol
> needs a complete redesign if it wants to
>   - match today's expectations regarding privacy,

Yes, changes to the 'cvs status' command are desirable.

I don't think any other cvs subcommand violates your expectations, but
you should feel free to open a bug tracker report on
savannah.nongnu.org/projects/cvs/

>   - minimize network bandwidth, esp. uploads,

Some people might like to specify which direction needs to be minimized,
for example folks operating cvs servers at the end of an ADSL pipe from
their home. However, CVS still uses a lot less bandwidth than some of
the 'newer' protocols out there. It was designed at a time when a 56K
link was a big deal and a T1 was ruinously expensive.

>   - minimize server load.

Yes, I have seen this request more than once from sites that choose to
host thousands of CVS repositories or single repositories with more than
a few hundred thousand files and more than a few million revisions.

You are not the only person who has expressed disappointment in CVS.

In fact, more than one open-source SCM system/project exists to replace
CVS (and/or CVSNT).

   - GNU arch
   - OpenCVS (the OpenBSD project to avoid the security concerns
     and GPL issues they do not like when using CVS).
   - git
   - monotone
   - svn

Each of them has good points and bad points. Some may be better for
small projects than for large projects, and the converse is true for
others.

For now, there are a lot of legacy users using CVS and it is desirable
to do what we can to improve it. Sadly, the nature of maintaining a
large legacy installed base in a compatible and portable manner means
that a complete redesign is not as likely as trying to introduce gradual
incremental improvements.

If you are interested in contributing suggestions for how to improve the
CVS client/server protocol, I would love to see them.

I hope you found the information above to be useful.

        -- Mark
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (FreeBSD)

iD8DBQFFvXjvCg7APGsDnFERAt/LAKCz4ImdyUVb6YXfmw+ZYs7A9Ur5ogCdG5Iv
n8Qi+R+PgoblWv6I4UHS4wM=
=Br99
-----END PGP SIGNATURE-----



