info-cvs

Re: Locking (was Re: CVS Transaction)


From: Paul Sander
Subject: Re: Locking (was Re: CVS Transaction)
Date: Wed, 7 Sep 2005 18:40:07 -0700

Following up to my own post...

On Sep 7, 2005, at 5:47 PM, Paul Sander wrote:


On Sep 7, 2005, at 4:33 PM, Pierre Asselin wrote:

Mateusz [PEYN] Adamus <address@hidden> wrote:

Is there in CVS something like DB transactions?

For example:
1. login to cvs
2. start transaction
3. create branch
4. do something on this branch
5. if 4 = OK then commit transaction (branch stays)
    else rollback tran (delete branch)

I don't see how you could implement that without taking a write
lock on the entire module ... which would defeat the "C" in "CVS".

When I need to be careful about races, I generally find it sufficient
to plant temporary tags from a working sandbox ("cvs tag", not "cvs
rtag").  The tag operation is not atomic, but it produces well-defined
results because the revision numbers for the tags are taken from
the sandbox.

It can be done with intent mode locks and some simple manipulation of the RCS files. Intent mode locks allow concurrent access by readers while copies of the RCS files are updated. (RCS does this anyway, so all you really need to do is create a hard link and use stock RCS on the link.) After all the RCS files have been updated, promote the intent mode lock to a write lock and wait for all readers to finish. Finally, rename the updated copies of the RCS files on top of the originals and clear the lock.
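The lock modes described above can be sketched roughly as follows. This is only an illustration in Python, not anything that exists in CVS: the names `Mode`, `BranchLock`, `try_acquire` and `try_promote` are hypothetical, and the compatibility table is an assumption (readers coexist with each other and with a single intent holder, and a write lock excludes everyone). The sketch also assumes that new readers are refused once promotion has been requested, so that the wait for readers eventually ends.

```python
from enum import Enum


class Mode(Enum):
    READ = "read"
    INTENT = "intent"
    WRITE = "write"


# Assumed compatibility: (requested, already held) pairs that may coexist.
COMPATIBLE = {
    (Mode.READ, Mode.READ),
    (Mode.READ, Mode.INTENT),
    (Mode.INTENT, Mode.READ),
}


class BranchLock:
    """Hypothetical in-memory lock for one module/branch pair."""

    def __init__(self):
        self.held = []          # list of (owner, mode)
        self.promoting = False  # True once the intent holder asked for write

    def try_acquire(self, owner, mode):
        if self.promoting:
            return False  # no new holders while a promotion is pending
        if all((mode, other) in COMPATIBLE for _, other in self.held):
            self.held.append((owner, mode))
            return True
        return False

    def release(self, owner):
        self.held = [(o, m) for o, m in self.held if o != owner]
        if all(m is Mode.READ for _, m in self.held):
            self.promoting = False  # the intent/write holder is gone

    def try_promote(self, owner):
        """Turn owner's intent lock into a write lock once readers are gone."""
        if (owner, Mode.INTENT) not in self.held:
            raise ValueError("owner holds no intent lock")
        self.promoting = True
        if len(self.held) > 1:
            return False  # readers still active; caller retries later
        self.held = [(owner, Mode.WRITE)]
        return True
```

A writer would call `try_acquire(owner, Mode.INTENT)`, update its copies of the RCS files, then poll `try_promote(owner)` until it succeeds before doing the renames.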

Actually, you might not need to wait for the readers to finish if the revision numbers are known in advance, as they are when version numbers are read from the sandbox. It also works with branch/timestamp pairs if the timestamp is earlier than the creation time of the intent lock (see the discussion of race conditions below). Slipping the timestamp parameter back to just before the lock's creation time enforces this restriction, lets the user continue without waiting, and still delivers a correct configuration.
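The timestamp adjustment can be illustrated with a small sketch. The function name `effective_timestamp` and the one-second `margin` are assumptions made for the example; nothing here corresponds to an existing CVS option.

```python
from datetime import datetime, timedelta


def effective_timestamp(requested, intent_lock_created,
                        margin=timedelta(seconds=1)):
    """Return the timestamp a reader should actually use.

    If an intent lock exists and the requested time is not safely before
    its creation time, slip the request back to just before the lock.
    `margin` is an arbitrary illustrative value.
    """
    if intent_lock_created is None:
        return requested  # no writer in progress; use the request as given
    cutoff = intent_lock_created - margin
    return min(requested, cutoff)
```

A reader asking for "now" while a writer holds an intent lock would then see the configuration as it stood just before the writer started.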

When updating tags, readers must wait for writers because the version numbers are not known in advance. In other words, users running "cvs update -r sometag" must wait for "cvs tag sometag" or "cvs rtag sometag" to complete.

This actually increases concurrency over the current directory-based locking because the exclusive lock exists only while the RCS files are renamed, not while they're updated. That is a much shorter window, and usually fewer files are affected.

It also enables a transaction model because you can make multiple changes to the RCS files while holding the intent mode locks, and you can abort the entire change by removing the updated RCS files (rather than renaming them) and clearing the intent mode lock (rather than promoting it to a write lock).
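A minimal sketch of that commit/abort behaviour, in Python. It simplifies the scheme: it stages a plain copy of each RCS file instead of a hard link updated by stock RCS, the class name `RcsTransaction` and the `.txn` suffix are hypothetical, and the locking itself is assumed to be handled by the caller.

```python
import os
import shutil


class RcsTransaction:
    """Stage changes to RCS files, then commit or abort them as a group."""

    def __init__(self):
        self.staged = {}  # original path -> staged copy path

    def stage(self, path):
        """Return a private copy to modify; the original stays readable."""
        if path not in self.staged:
            copy = path + ".txn"  # hypothetical naming convention
            shutil.copy2(path, copy)
            self.staged[path] = copy
        return self.staged[path]

    def commit(self):
        # Caller is assumed to hold the write lock at this point.
        for path, copy in self.staged.items():
            os.replace(copy, path)  # rename the update over the original
        self.staged.clear()

    def abort(self):
        # Caller is assumed to hold only the intent lock at this point.
        for copy in self.staged.values():
            os.remove(copy)  # discard the update; the original is untouched
        self.staged.clear()
```

Every change made between `stage()` and `commit()` touches only the staged copies, which is why an abort needs nothing more than deleting them.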

This approach still has a race condition when fetching versions by timestamp. You can get a different set of files by issuing the same command twice, if one of the commands is given while an intent lock or write lock exists, using a timestamp that's newer than the lock. In this situation, the second command would get a subset of the changes.

On the other hand, it can be partially fixed by recording the intent mode lock's creation time in the modified RCS files, rather than the current time. So although a race condition remains, at least you never get back a partial commit.

The bad news is that to do this, the existing locking system must be replaced. The read/intent/write locking would then be done on module/branch pairs (or module/branch/tag triples if tags are updated) at a global scale and would be relatively quick. The hard-linked RCS files are a kind of exclusive lock, which could cause wait conditions when some RCS files are updated concurrently on different branches. But they're fast in the absence of conflicts, and the order doesn't matter as long as the branch and tag locks are correct (which means that processing can continue while waiting for a retry).

Actually, deadlock avoidance or recovery is required. Several methods are possible, but ordering is probably easiest.

Although there may be a need for multiple transactions to update a single RCS file, the changes would be on different branches. (An intent mode lock on a branch is a global resource and prevents concurrent changes to the same branch globally.) That means that a weaker locking mechanism can be used as long as aborted changes don't end up in the final copy. This is tricky, though doable, but probably at a cost that's not worth paying.

Such a locking method also opens up possibilities for additional capabilities. For example, logging the hard links enables a method for crash recovery. It works like this: If the write lock exists, complete the renames of the files listed in the log, clear the lock, and remove the log. If the intent lock exists, remove the hard links, clear the lock, and remove the log. If read locks exist, clear them.
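The recovery procedure can be sketched as below. The log format (a list of `(staged, original)` path pairs), the string lock states, and the function name `recover` are all assumptions for the example; clearing the lock and removing the log are left to the caller.

```python
import os


def recover(lock_state, log_entries):
    """Repair the repository after a crash, based on the surviving lock.

    lock_state: "write", "intent" or "read" (assumed representation).
    log_entries: list of (staged_path, original_path) pairs from the log.
    Returns a list of the actions taken, for inspection.
    """
    actions = []
    if lock_state == "write":
        # The commit had started: finish the renames that are still pending.
        for staged, original in log_entries:
            if os.path.exists(staged):
                os.replace(staged, original)
                actions.append(("renamed", original))
    elif lock_state == "intent":
        # The commit had not started: throw away the staged updates.
        for staged, _original in log_entries:
            if os.path.exists(staged):
                os.remove(staged)
                actions.append(("removed", staged))
    # For "read" there is nothing on disk to repair.
    return actions
```

Checking whether each staged file still exists makes the recovery safe to run again if it is itself interrupted.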

This locking system is a scalable implementation that eliminates bottlenecks of the current directory-based mechanism. It allows the simultaneous and inexpensive locking of files that are widespread but sparse in the repository. Such a quality of the locking system is an enabling factor for other features that are also important to some users, which I won't go into at this time.




