Re: To get some information regarding CVS
From: Mark D. Baushke
Subject: Re: To get some information regarding CVS
Date: Mon, 22 Sep 2008 08:53:48 -0700
Arthur Barrett <address@hidden> writes:
> > If there are good grammar definitions for mergepoint1, bugid,
> > permissions and username, then it would be great if they could be
> > shared.
>
> Do you have an example of how these are usually expressed?
Use the command 'man rcsfile'
For commitid, I used
    { commitid id; }
and the id itself used in CVS is base62-encoded text. It would have
been more efficient to use a base64-encoded value for the id and put it
into a string like @id@, but CVSNT was already using it as a simple id,
so that is what we had to do in CVS to be interoperable.
I have appended the rcsfile man page for you after my .signature as an
example. This is based on the final patch submitted in February 2006
(the first patch I had submitted was in September 2005).
The CVS executable generates a slightly longer commitid than the 16 bytes
used by CVSNT. CVS concatenates the current binary time with some random
bytes to guard against simultaneous commits on the same cluster happening
at the same time; on a heavily loaded system it is possible for the
smaller commitid to be a duplicate. By using a timestamp plus a source of
random bits, this is much less likely for CVS. I will grant, though, that
it makes the commitid harder to use as a label that users type.
Details:
This is the code in src/main.c we use to generate the global session ID
used for the commitid in RCS transactions for this session:
/* ... */
enum {RANDOM_BYTES = 8};
enum {COMMITID_RAW_SIZE = (sizeof(time_t) + RANDOM_BYTES)};
static char const alphabet[62] =
  "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
/* Divide BUF by D, returning the remainder.  Replace BUF by the
   quotient.  BUF[0] is the most significant part of BUF.
   D must not exceed UINT_MAX >> CHAR_BIT.  */
static unsigned int
divide_by (unsigned char buf[COMMITID_RAW_SIZE], unsigned int d)
{
  unsigned int carry = 0;
  int i;
  for (i = 0; i < COMMITID_RAW_SIZE; i++)
    {
      unsigned int byte = buf[i];
      unsigned int dividend = (carry << CHAR_BIT) + byte;
      buf[i] = dividend / d;
      carry = dividend % d;
    }
  return carry;
}
static void
convert (char const input[COMMITID_RAW_SIZE], char *output)
{
  static char const zero[COMMITID_RAW_SIZE] = { 0, };
  unsigned char buf[COMMITID_RAW_SIZE];
  size_t o = 0;
  memcpy (buf, input, COMMITID_RAW_SIZE);
  while (memcmp (buf, zero, COMMITID_RAW_SIZE) != 0)
    output[o++] = alphabet[divide_by (buf, sizeof alphabet)];
  if (! o)
    output[o++] = '0';
  output[o] = '\0';
}
/* ... */
    /* Calculate the cvs global session ID */
    {
        char buf[COMMITID_RAW_SIZE] = { 0, };
        char out[COMMITID_RAW_SIZE * 2];
        ssize_t len = 0;
        time_t rightnow = time (NULL);
        char *startrand = buf + sizeof (time_t);
        unsigned char *p = (unsigned char *) startrand;
        size_t randbytes = RANDOM_BYTES;
        int flags = O_RDONLY;
        int fd;
#ifdef O_NOCTTY
        flags |= O_NOCTTY;
#endif
        if (rightnow != (time_t)-1)
            while (rightnow > 0) {
                *--p = rightnow % (UCHAR_MAX + 1);
                rightnow /= UCHAR_MAX + 1;
            }
        else {
            /* try to use more random data */
            randbytes = COMMITID_RAW_SIZE;
            startrand = buf;
        }
        fd = open ("/dev/urandom", flags);
        if (fd >= 0) {
            len = read (fd, startrand, randbytes);
            close (fd);
        }
        if (len <= 0) {
            /* no random data was available so use pid */
            long int pid = (long int)getpid ();
            p = (unsigned char *) (startrand + sizeof (pid));
            while (pid > 0) {
                *--p = pid % (UCHAR_MAX + 1);
                pid /= UCHAR_MAX + 1;
            }
        }
        convert(buf, out);
        global_session_id = xstrdup (out);
    }
/* ... */
The CHAR_BIT macro is presumed to be set by <limits.h> and is often 8.
The UCHAR_MAX macro is presumed to be set by <limits.h> and is often 255.
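One consequence of convert() above is that the base62 digits come out least significant first, because each call to divide_by() yields the lowest remaining digit. Going the other way, for example to look at the timestamp bytes of an existing commitid, could be sketched roughly as follows. This is not code from CVS: RAW_SIZE, digit_value() and base62_decode() are names made up for illustration, and the sketch assumes 8-bit bytes and a 16-byte raw buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: 8-byte time_t plus 8 random bytes, and CHAR_BIT == 8. */
enum { RAW_SIZE = 16 };

/* Map one character to its value in the posted alphabet (0-9, a-z, A-Z),
   or return -1 if the character is not part of that alphabet.  */
static int
digit_value (char c)
{
  static char const alphabet[] =
    "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
  char const *p = (c != '\0') ? strchr (alphabet, c) : NULL;
  return p ? (int) (p - alphabet) : -1;
}

/* Decode ID into RAW, most significant byte first.
   Return 0 on success, -1 on a bad character or on overflow.  */
static int
base62_decode (char const *id, unsigned char raw[RAW_SIZE])
{
  size_t n;

  assert (id != NULL && raw != NULL);
  n = strlen (id);
  memset (raw, 0, RAW_SIZE);

  /* The encoder emits the least significant digit first, so walk the
     string backwards and fold in one digit at a time:
     raw = raw * 62 + digit.  */
  while (n > 0)
    {
      int d = digit_value (id[--n]);
      unsigned int carry;
      size_t i;

      if (d < 0)
        return -1;
      carry = (unsigned int) d;
      for (i = RAW_SIZE; i > 0; i--)
        {
          unsigned int v = raw[i - 1] * 62u + carry;
          raw[i - 1] = (unsigned char) (v & 0xFFu);
          carry = v >> 8;
        }
      if (carry != 0)
        return -1;      /* value does not fit in RAW_SIZE bytes */
    }
  return 0;
}
```

With this layout the first sizeof(time_t) bytes of the decoded buffer hold the time, right-aligned and most significant byte first, and the remaining bytes hold the random data (or the pid fallback).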
Enjoy!
-- Mark
$ man rcsfile | ul -tdumb
RCSFILE(5) RCSFILE(5)
NAME
rcsfile - format of RCS file
DESCRIPTION
An RCS file's contents are described by the grammar below.
The text is free format: space, backspace, tab, newline, vertical tab,
form feed, and carriage return (collectively, white space) have no
significance except in strings. However, white space cannot appear within
an id, num, or sym, and an RCS file must end with a newline.
Strings are enclosed by @. If a string contains a @, it must be
doubled; otherwise, strings can contain arbitrary binary data.
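The quoting rule for strings can be sketched in C as below. This is only an illustration of the rule just stated, not code taken from RCS or CVS; rcs_quote() and rcs_quote_cstr() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write the RCS string form of the LEN bytes at SRC into DST:
   an opening @, the data with every @ doubled, and a closing @.
   DST must have room for 2 * LEN + 3 bytes.  Returns the number of
   bytes written, not counting the terminating NUL.  */
static size_t
rcs_quote (char const *src, size_t len, char *dst)
{
  size_t o = 0;
  size_t i;

  assert (src != NULL && dst != NULL);
  dst[o++] = '@';
  for (i = 0; i < len; i++)
    {
      if (src[i] == '@')
        dst[o++] = '@';         /* double each embedded @ */
      dst[o++] = src[i];
    }
  dst[o++] = '@';
  dst[o] = '\0';
  return o;
}

/* Convenience wrapper for NUL-terminated input.  */
static size_t
rcs_quote_cstr (char const *src, char *dst)
{
  return rcs_quote (src, strlen (src), dst);
}
```

Taking an explicit length keeps the sketch usable for binary data, which the format allows inside strings.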
The meta syntax uses the following conventions: `|' (bar) separates
alternatives; `{' and `}' enclose optional phrases; `{' and `}*'
enclose phrases that can be repeated zero or more times; `{' and '}+'
enclose phrases that must appear at least once and can be repeated;
Terminal symbols are in boldface; nonterminal symbols are in italics.
rcstext   ::=  admin {delta}* desc {deltatext}*
admin     ::=  head       {num};
               { branch   {num}; }
               access     {id}*;
               symbols    {sym : num}*;
               locks      {id : num}*;  {strict  ;}
               { comment  {string}; }
               { expand   {string}; }
               { newphrase }*
delta     ::=  num
               date       num;
               author     id;
               state      {id};
               branches   {num}*;
               next       {num};
               { commitid id; }
               { newphrase }*
desc      ::=  desc       string
deltatext ::=  num
               log        string
               { newphrase }*
               text       string
num       ::=  {digit | .}+
digit     ::=  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
id        ::=  {num} idchar {idchar | num}*
sym       ::=  {digit}* idchar {idchar | digit}*
idchar    ::=  any visible graphic character except special
special   ::=  $ | , | . | : | ; | @
string    ::=  @{any character, with @ doubled}*@
newphrase ::=  id word* ;
word      ::=  id | num | string | :
Identifiers are case sensitive. Keywords are in lower case only. The
sets of keywords and identifiers can overlap. In most environments RCS
uses the ISO 8859/1 encoding: visible graphic characters are codes
041-176 and 240-377, and white space characters are codes 010-015 and
040.
Dates, which appear after the date keyword, are of the form
Y.mm.dd.hh.mm.ss, where Y is the year, mm the month (01-12), dd the day
(01-31), hh the hour (00-23), mm the minute (00-59), and ss the second
(00-60). Y contains just the last two digits of the year for years
from 1900 through 1999, and all the digits of years thereafter. Dates
use the Gregorian calendar; times use UTC.
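The year rule is easy to get wrong, so here is a rough C sketch of formatting a date under the description above. It is not taken from RCS or CVS; rcs_format_date() is a hypothetical name, and the caller is assumed to pass a broken-down time that is already in UTC (for instance from gmtime()).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Format TM, a broken-down UTC time, as an RCS date in BUF.
   Years 1900 through 1999 are written with two digits; later years
   are written in full.  Returns 0 on success, -1 if the year is
   before 1900 or BUF is too small.  */
static int
rcs_format_date (struct tm const *tm, char *buf, size_t size)
{
  int year;
  int n;

  assert (tm != NULL && buf != NULL);
  year = tm->tm_year + 1900;
  if (year < 1900)
    return -1;
  if (year < 2000)
    year -= 1900;               /* two-digit form, e.g. 95 for 1995 */

  n = snprintf (buf, size, "%02d.%02d.%02d.%02d.%02d.%02d",
                year, tm->tm_mon + 1, tm->tm_mday,
                tm->tm_hour, tm->tm_min, tm->tm_sec);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

For example, a tm describing 22 Sep 2008 15:53:48 UTC would be written as 2008.09.22.15.53.48, while 5 Jun 1995 00:00:00 UTC would be written as 95.06.05.00.00.00.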
The commitid is followed by an id token. This token is intended to be
unique across multiple files and is used to help group files as being a
part of the same logical commit. This token must uniquely identify the
commit operation that was applied to a set of RCS files. In particular,
it must be unique among all the commitids in this file.
The newphrase productions in the grammar are reserved for future
extensions to the format of RCS files. No newphrase will begin with any
keyword already in use.
The delta nodes form a tree. All nodes whose numbers consist of a
single pair (e.g., 2.3, 2.1, 1.3, etc.) are on the trunk, and are linked
through the next field in order of decreasing numbers. The head field
in the admin node points to the head of that sequence (i.e., contains
the highest pair). The branch node in the admin node indicates the
default branch (or revision) for most RCS operations. If empty, the
default branch is the highest branch on the trunk.
All delta nodes whose numbers consist of 2n fields (n>=2) (e.g.,
3.1.1.1, 2.1.2.2, etc.) are linked as follows. All nodes whose first
2n-1 number fields are identical are linked through the next field in
order of increasing numbers. For each such sequence, the delta node
whose number is identical to the first 2n-2 number fields of the deltas
on that sequence is called the branchpoint. The branches field of a
node contains a list of the numbers of the first nodes of all sequences
for which it is a branchpoint. This list is ordered in increasing
numbers.
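The numbering rules above can be sketched as a pair of small C helpers. These are illustrative only and not part of RCS or CVS; rev_fields(), rev_on_trunk() and rev_branchpoint() are hypothetical names, and revision numbers are assumed to be well-formed dotted strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the dot-separated fields in REV ("1.2.2.1" has four).  */
static size_t
rev_fields (char const *rev)
{
  size_t n = 1;

  assert (rev != NULL);
  for (; *rev != '\0'; rev++)
    if (*rev == '.')
      n++;
  return n;
}

/* A revision is on the trunk when its number is a single pair.  */
static int
rev_on_trunk (char const *rev)
{
  return rev_fields (rev) == 2;
}

/* Copy the branchpoint of REV, i.e. its first 2n-2 fields, into OUT.
   Returns 0 on success, -1 if REV is on the trunk, has an odd number
   of fields, or does not fit in SIZE bytes.  */
static int
rev_branchpoint (char const *rev, char *out, size_t size)
{
  size_t fields = rev_fields (rev);
  size_t len = strlen (rev);
  size_t dots = 0;

  if (fields < 4 || fields % 2 != 0)
    return -1;
  /* Drop the last two fields by scanning back over two dots.  */
  while (len > 0 && dots < 2)
    if (rev[--len] == '.')
      dots++;
  if (len + 1 > size)
    return -1;
  memcpy (out, rev, len);
  out[len] = '\0';
  return 0;
}
```

For instance, rev_branchpoint() maps 1.2.2.1.1.1 to 1.2.2.1 and 1.2.1.3 to 1.2, which matches the arrows in the diagram that follows.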
The following diagram shows an example of an RCS file's organization.
                           Head
                             |
                             |
                             v                        / \
                         ---------                   /   \
   / \          / \      |       |      / \         /     \
  /   \        /   \     |  2.1  |     /   \       /       \
 /     \      /     \    |       |    /     \     /         \
/1.2.1.3\    /1.3.1.1\   |       |   /1.2.2.2\   /1.2.2.1.1.1\
---------    ---------   ---------   ---------   -------------
    ^            ^           |           ^             ^
    |            |           |           |             |
    |            |           v           |             |
   / \           |       ---------      / \            |
  /   \          |       \  1.3  /     /   \           |
 /     \         ---------\     /     /     \-----------
/1.2.1.1\                  \   /     /1.2.2.1\
---------                   \ /      ---------
    ^                        |           ^
    |                        |           |
    |                        v           |
    |                    ---------       |
    |                    \  1.2  /       |
    ----------------------\     /---------
                           \   /
                            \ /
                             |
                             |
                             v
                         ---------
                         \  1.1  /
                          \     /
                           \   /
                            \ /
IDENTIFICATION
Author: Walter F. Tichy, Purdue University, West Lafayette, IN, 47907.
Manual Page Revision: 5.6; Release Date: 1995/06/05.
Copyright (C) 1982, 1988, 1989 Walter F. Tichy.
Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995 Paul Eggert.
SEE ALSO
rcsintro(1), ci(1), co(1), ident(1), rcs(1), rcsclean(1), rcsdiff(1),
rcsmerge(1), rlog(1)
Walter F. Tichy, RCS--A System for Version Control, Software--Practice
& Experience 15, 7 (July 1985), 637-654.
GNU 1995/06/05 RCSFILE(5)
$