invalid change text again
From: Benjamin Dodge
Subject: invalid change text again
Date: Mon, 26 Feb 2001 16:09:27 -0800
User-agent: Mutt/1.2i
Hi,
I've experienced a couple of corrupted RCS files resulting in 'invalid
change text' errors. I've looked around the web and in this archive.
I haven't really seen good explanations of why it happens, so I did some
testing and investigation. Here's what I've found.
A. No access method is safe.
1. People like to blame this on locking in NFS. I've found that
corruption with NFS is worse than other kinds of corruption. However,
I don't think NFS needs to be labeled unsafe (more on this later). The
basic failure happens when two or more users are attempting to change
a file. Both think they have write access to the file due to the
flawed locking mechanism, and the file ends up with invalid data.
2. rsh (ext) and pserver are actually not much better than NFS. Two
users can STILL get a lock on an RCS file. I have a test script that
can recreate this by having one machine constantly commit a file while
other machines try to tag the latest rev of the file (they update
before tagging). Both operations (commit and tag) require exclusive
write access to the RCS file. In this case, one server is modifying the
file, so the corruption seems to just drop a revision or two. That's
not too bad.
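
A stress test of the kind described in A.2 could be sketched roughly as
follows. This is not the test script mentioned above, only an illustration
under assumptions: the function names, commit messages and tag names are
made up, and each loop is assumed to run from a checked-out working
directory on its own machine. Only the standard cvs commit, update and tag
subcommands are used.

```python
import subprocess


def commit_loop(path, rounds, run=subprocess.call):
    """Writer machine: append a line to the file and commit it, repeatedly."""
    for i in range(rounds):
        with open(path, "a") as f:
            f.write("change %d\n" % i)
        # Commit needs exclusive write access to the RCS file.
        run(["cvs", "commit", "-m", "stress commit %d" % i, path])


def tag_loop(path, rounds, run=subprocess.call):
    """Tagger machine: update to the latest revision, then tag it."""
    for i in range(rounds):
        run(["cvs", "update", path])
        # A fresh (hypothetical) tag name each round; tagging also
        # rewrites the RCS file, so it competes with the commits.
        run(["cvs", "tag", "stress_tag_%d" % i, path])
```

Running commit_loop on one machine and tag_loop on several others against
the same file should exercise the race; the `run` parameter exists only so
the loops can be dry-run without a repository.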
B. The problem seems to be in the cvs code that does file locking.
1. cvs uses 'SIG_beginCrSect' and 'SIG_endCrSect' when making lock files.
These functions block signals (INT, TERM, ...) between the begin and
end calls. This does not work as a lock. The basic algorithm is:
    SIG_beginCrSect()
    create_some_dir_or_file
    SIG_endCrSect()
However, the OS can still context switch the process out, especially during
file I/O operations. Blocking signals like this only protects a process from
being interrupted by a signal; it does not seem to constitute true mutual
exclusion between processes.
fcntl seems to be a better way to do locks. Why is it not used? It is NFS
safe (when lockd is running on the clients).
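
For comparison, an fcntl-style lock might look like the sketch below. It is
written in Python rather than in the C that cvs uses, and the helper name
exclusive_lock and the lock file path are assumptions for illustration, not
anything in the cvs source. The point is that the kernel (and lockd over
NFS) arbitrates between processes, so a context switch in the middle of the
protected work does not let a second process in.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive fcntl (POSIX record) lock on lock_path."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o666)
    try:
        # Blocks until no other process holds a lock on the file.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

Anything done inside `with exclusive_lock(path):` is then serialized
against every other process that takes the same lock. These locks are
advisory, so this only helps if every writer goes through the same call.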
2. I've decided to use a wrapper script as the CVS_SERVER. This
script uses fcntl to set a system-wide lock (safe on NFS too) covering all
repositories. The performance is bad, but at least we should not
experience corruption again. With a lot more work (snooping the
client/server commands), you could lock only the repository or file being
operated on and even use read and write locks. However, doing a global
write lock is easiest and safest.
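
A wrapper along the lines of B.2 might look like the sketch below. It is an
illustration under assumptions, not the actual wrapper: the lock file path,
the location of the real cvs binary and the name run_locked are all
hypothetical. It takes one global exclusive fcntl lock, runs the real
server while holding it, and releases the lock when the server exits.

```python
import fcntl
import os
import subprocess

LOCK_PATH = "/var/lock/cvs-global.lock"  # hypothetical lock file
REAL_CVS = "/usr/bin/cvs"                # hypothetical path to the real cvs


def run_locked(argv, lock_path=LOCK_PATH, real_cvs=REAL_CVS):
    """Run the real cvs with argv while holding a global exclusive lock."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o666)
    try:
        # Every wrapper instance blocks here until the previous one is done.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        # stdin/stdout are inherited, so the client talks to cvs as usual.
        return subprocess.call([real_cvs] + list(argv))
    finally:
        # Closing the descriptor also drops the lock.
        os.close(fd)
```

The script would end by calling `sys.exit(run_locked(sys.argv[1:]))`, and
clients would point CVS_SERVER at it. Because every command waits on the
same lock, only one cvs server runs at a time, which is where the
performance cost comes from.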
3. Using pserver without spawning multiple cvs processes should avoid
corruption by only having one cvs command run at a time. This creates
performance issues.
Am I wrong? Is it possible to run CVS in client/server without ever getting
RCS file corruption?
Comments,
Benjamin