info-cvs

invalid change text again


From: Benjamin Dodge
Subject: invalid change text again
Date: Mon, 26 Feb 2001 16:09:27 -0800
User-agent: Mutt/1.2i

Hi,

I've experienced a couple of corrupted RCS files resulting in 'invalid
change text' errors. I've looked around the web and in this archive.
I've not really seen good explanations of why it happens, so I did some
testing and investigation. Here's what I've found.

A. No access method is safe.
  1. People like to blame this on locking in NFS. I've found that
  corruption with NFS is worse than other kinds of corruption. However,
  I don't think NFS needs to be labeled unsafe (more on this later). The
  basic failure happens when two or more users attempt to change a file
  at the same time. Both think they have write access to the file due to
  the flawed locking mechanism, and the file ends up with invalid data.

  2. rsh (ext) and pserver are actually not much better than NFS. Two
  users can STILL get a lock on an RCS file. I have a test script that
  can recreate this by having one machine constantly commit a file while
  other machines try to tag the latest rev of the file (they update
  before tagging). Both operations (commit and tag) require exclusive
  write access to the RCS file. In this case, one server is modifying the
  file so the corruption seems to just drop a revision or two. That's
  not too bad.
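
  The "dropped revision" outcome can be illustrated with a toy model. This
  is only a sketch, not RCS or cvs code: it assumes each writer rewrites the
  whole file from its own earlier snapshot, and the function names, the
  revision strings, and the "file,v" name are made up for illustration.

```python
import os
import tempfile


def read_revisions(path):
    """Return the revision lines currently stored in the file."""
    with open(path) as f:
        return f.read().splitlines()


def write_revisions(path, revisions):
    """Rewrite the whole file from the caller's view of its contents."""
    with open(path, "w") as f:
        f.write("\n".join(revisions) + "\n")


def simulate_lost_update(path):
    """Interleave two writers that both believe they hold the lock."""
    write_revisions(path, ["1.1"])
    committer_view = read_revisions(path)  # committer takes a snapshot
    tagger_view = read_revisions(path)     # tagger takes the same snapshot
    write_revisions(path, committer_view + ["1.2"])   # the commit lands
    write_revisions(path, tagger_view + ["tag:REL"])  # the tagger overwrites it
    return read_revisions(path)


# Hypothetical usage: the file name is illustrative only.
with tempfile.TemporaryDirectory() as d:
    print(simulate_lost_update(os.path.join(d, "file,v")))
```

  In this interleaving the second write is based on a stale snapshot, so
  revision 1.2 never appears in the final file even though the commit
  "succeeded".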

B. The problem seems to be in the cvs code that does file locking.
  1. cvs uses 'SIG_beginCrSect' and 'SIG_endCrSect' to make lock files.
  These functions block signals (INT, TERM, ...) between the begin and
  end calls. This does not give mutual exclusion. The basic algorithm
  is:
  SIG_beginCrSect()
  create_some_dir_or_file
  SIG_endCrSect()

  However, the OS can still context switch the process out, especially during
  the file i/o operations. Blocking signals in this sort of critical section
  does not seem to constitute true mutual exclusion between processes.

  fcntl seems to be a better way to do locks. Why is it not used? It is NFS
  safe (when lockd is running on the clients).

  2. I've decided to use a wrapper script as the CVS_SERVER. This
  script uses fcntl to set a system-wide lock (safe on NFS too) covering all
  repositories. The performance is bad, but at least we should not
  experience corruption again. With a lot more work (snooping the
  client/server commands), you could lock only the repository or file being
  operated on and even use RD and WR locks. However, doing a global WR lock
  is easiest and safest.

  3. Using pserver without spawning multiple cvs processes should avoid
  corruption by only having one cvs command run at a time. This creates
  performance issues.
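
  The global-lock wrapper from B.2 could look roughly like the sketch below.
  It is not the actual script; LOCK_PATH, REAL_CVS and run_locked are
  hypothetical names chosen for illustration. It takes one exclusive fcntl
  lock on a single lock file, runs the real server while holding it, and
  lets the lock go when the descriptor is closed.

```python
import fcntl
import os
import subprocess
import sys

# Hypothetical paths: adjust for the local installation.
LOCK_PATH = "/var/lock/cvs-global.lock"
REAL_CVS = "/usr/bin/cvs"


def run_locked(lock_path, argv):
    """Run argv while holding an exclusive fcntl lock on lock_path."""
    # Open (or create) the lock file; the lock is owned by this process.
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o666)
    try:
        # Blocks until no other process holds a conflicting lock.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        # The child inherits stdin/stdout, so the client/server
        # protocol passes straight through the wrapper.
        return subprocess.call(argv)
    finally:
        # Closing the descriptor releases the lock.
        os.close(fd)


# A wrapper installed as CVS_SERVER would end with something like:
#     sys.exit(run_locked(LOCK_PATH, [REAL_CVS] + sys.argv[1:]))
```

  Because every invocation queues on the same lock file, only one cvs
  server process runs at a time, which is where the performance cost comes
  from. Finer-grained locking would need one lock file per repository or
  per RCS file, plus shared locks (fcntl.LOCK_SH) for read-only operations.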


Am I wrong? Is it possible to run CVS in client/server without ever getting
RCS file corruption?

Comments welcome,
Benjamin


