info-cvs

Re: To get some information regarding CVS


From: Mark D. Baushke
Subject: Re: To get some information regarding CVS
Date: Mon, 22 Sep 2008 08:53:48 -0700

Arthur Barrett <address@hidden> writes:

> > If there are good grammar definitions for mergepoint1, bugid,
> > permissions and username, then it would be great if they could be
> > shared.
> 
> Do you have an example of how these are usually expressed?

Use the command 'man rcsfile'

For commitid, I used

                      { commitid id; }

The id itself, as used in CVS, is base62-encoded text. It would have
been more efficient to use a base64-encoded value for the id and put it
into a string like @id@, but CVSNT was already using it as a simple id,
so that is what we had to do in CVS to be interoperable.

I have appended the rcsfile man page for you after my .signature as an
example. This is based on the final patch submitted in February 2006
(the first patch I had submitted was in September 2005).

The CVS executable generates a slightly longer commitid than the 16
bytes used by CVSNT. CVS uses the current binary time concatenated with
some random bytes to get around simultaneous commits on the same cluster
happening at the same time. On a heavily loaded system it is possible
for the smaller commitid to be a duplicate; by using a timestamp plus a
source of random bits, this is much less likely for CVS. I will grant,
though, that it makes it harder to use the commitid as a label that
users type.

Details:

This is the code in src/main.c we use to generate the global session ID
used for the commitid in RCS transactions for this session:

/* ... */
enum {RANDOM_BYTES = 8};
enum {COMMITID_RAW_SIZE = (sizeof(time_t) + RANDOM_BYTES)};

static char const alphabet[62] =
  "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

/* Divide BUF by D, returning the remainder.  Replace BUF by the
   quotient.  BUF[0] is the most significant part of BUF.
   D must not exceed UINT_MAX >> CHAR_BIT.  */
static unsigned int
divide_by (unsigned char buf[COMMITID_RAW_SIZE], unsigned int d)
{
    unsigned int carry = 0;
    int i;
    for (i = 0; i < COMMITID_RAW_SIZE; i++)
    {
        unsigned int byte = buf[i];
        unsigned int dividend = (carry << CHAR_BIT) + byte;
        buf[i] = dividend / d;
        carry = dividend % d;
    }
    return carry;
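}

/* Encode INPUT as base62 text in OUTPUT by repeated division.  Digits
   are emitted least significant first; an all-zero INPUT yields "0".  */
static void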
convert (char const input[COMMITID_RAW_SIZE], char *output)
{
    static char const zero[COMMITID_RAW_SIZE] = { 0, };
    unsigned char buf[COMMITID_RAW_SIZE];
    size_t o = 0;
    memcpy (buf, input, COMMITID_RAW_SIZE);
    while (memcmp (buf, zero, COMMITID_RAW_SIZE) != 0)
        output[o++] = alphabet[divide_by (buf, sizeof alphabet)];
    if (! o)
        output[o++] = '0';
    output[o] = '\0';
}
/* ... */
    /* Calculate the cvs global session ID */

    {
        char buf[COMMITID_RAW_SIZE] = { 0, };
        char out[COMMITID_RAW_SIZE * 2];
        ssize_t len = 0;
        time_t rightnow = time (NULL);
        char *startrand = buf + sizeof (time_t);
        unsigned char *p = (unsigned char *) startrand;
        size_t randbytes = RANDOM_BYTES;
        int flags = O_RDONLY;
        int fd;
#ifdef O_NOCTTY
        flags |= O_NOCTTY;
#endif
        if (rightnow != (time_t)-1) {
            /* store the time, most significant byte first, in the
               bytes just before startrand */
            while (rightnow > 0) {
                *--p = rightnow % (UCHAR_MAX + 1);
                rightnow /= UCHAR_MAX + 1;
            }
        }
        else {
            /* try to use more random data */
            randbytes = COMMITID_RAW_SIZE;
            startrand = buf;
        }
        fd = open ("/dev/urandom", flags);
        if (fd >= 0) {
            len = read (fd, startrand, randbytes);
            close (fd);
        }
        if (len <= 0) {
            /* no random data was available so use pid */
            long int pid = (long int)getpid ();
            p = (unsigned char *) (startrand + sizeof (pid));
            while (pid > 0) {
                *--p = pid % (UCHAR_MAX + 1);
                pid /= UCHAR_MAX + 1;
            }
        }
        convert(buf, out);
        global_session_id = xstrdup (out);
    }
/* ... */

The CHAR_BIT macro is presumed to be set by <limits.h> and is often 8.
The UCHAR_MAX macro is presumed to be set by <limits.h> and is often 255.
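
For anyone who wants to try the encoding outside the CVS tree, here is a
minimal self-contained sketch of the same repeated-division idea. The
names RAW_SIZE, divide_in_place and base62_encode are made up for this
example and are not from CVS, and the raw size is fixed at 16 bytes
purely as an assumption (CVS uses sizeof(time_t) + 8).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

enum { RAW_SIZE = 16 };  /* assumed raw id size for this sketch */

static const char digits62[] =
    "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

/* Long division of the big-endian number in buf by d.  buf becomes the
   quotient; the remainder is returned. */
static unsigned int divide_in_place(unsigned char buf[RAW_SIZE], unsigned int d)
{
    unsigned int carry = 0;
    for (size_t i = 0; i < RAW_SIZE; i++) {
        unsigned int dividend = (carry << CHAR_BIT) + buf[i];
        buf[i] = (unsigned char)(dividend / d);
        carry = dividend % d;
    }
    return carry;
}

/* Write the base62 digits of raw, least significant first, into out and
   NUL-terminate.  out must have room for 2 * RAW_SIZE bytes.
   Returns the number of digits written. */
size_t base62_encode(const unsigned char raw[RAW_SIZE], char *out)
{
    unsigned char buf[RAW_SIZE];
    size_t n = 0;
    int nonzero;

    assert(out != NULL);
    memcpy(buf, raw, RAW_SIZE);
    do {
        out[n++] = digits62[divide_in_place(buf, 62)];
        nonzero = 0;
        for (size_t i = 0; i < RAW_SIZE; i++)
            if (buf[i] != 0)
                nonzero = 1;
    } while (nonzero);
    out[n] = '\0';
    return n;
}
```

A 16-byte input never needs more than 22 base62 digits, so a buffer of
2 * RAW_SIZE bytes leaves room to spare.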

        Enjoy!
        -- Mark

$ man rcsfile | ul -tdumb
RCSFILE(5)                                                          RCSFILE(5)



NAME
       rcsfile - format of RCS file

DESCRIPTION
       An RCS file's contents are described by the grammar below.

       The  text is free format: space, backspace, tab, newline, vertical tab,
       form feed, and carriage return (collectively, white space) have no sig-
       nificance except in strings.  However, white space cannot appear within
       an id, num, or sym, and an RCS file must end with a newline.

       Strings are enclosed by @.  If a string contains a @, it must  be  dou-
       bled; otherwise, strings can contain arbitrary binary data.
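
(Not part of the man page.) The quoting rule above can be sketched in C
roughly as follows. rcs_quote and rcs_quote_cstr are hypothetical helper
names invented for this illustration; neither comes from RCS or CVS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated RCS string for the len bytes at data: the
   bytes are wrapped in @ and every embedded @ is doubled.  The result is
   NUL-terminated for convenience; its real length (which matters for
   binary data) is stored in *out_len when out_len is not NULL.
   Returns NULL if allocation fails.  The caller frees the result. */
char *rcs_quote(const char *data, size_t len, size_t *out_len)
{
    size_t at_count = 0;
    size_t o = 0;
    char *out;

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        if (data[i] == '@')
            at_count++;

    out = malloc(len + at_count + 3);  /* two delimiters plus NUL */
    if (out == NULL)
        return NULL;

    out[o++] = '@';
    for (size_t i = 0; i < len; i++) {
        if (data[i] == '@')
            out[o++] = '@';            /* double the @ */
        out[o++] = data[i];
    }
    out[o++] = '@';
    out[o] = '\0';
    if (out_len != NULL)
        *out_len = o;
    return out;
}

/* Convenience wrapper for ordinary NUL-terminated text. */
char *rcs_quote_cstr(const char *s)
{
    return rcs_quote(s, strlen(s), NULL);
}
```

For example, the three bytes a@b would be written into the file as
@a@@b@, and an empty log message as @@.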

       The  meta  syntax  uses  the following conventions: `|' (bar) separates
       alternatives; `{' and  `}'  enclose  optional  phrases;  `{'  and  `}*'
       enclose  phrases  that can be repeated zero or more times; `{' and '}+'
       enclose phrases that must appear at least once  and  can  be  repeated;
       Terminal symbols are in boldface; nonterminal symbols are in italics.

       rcstext   ::=  admin {delta}* desc {deltatext}*

       admin     ::=  head       {num};
                      { branch   {num}; }
                      access     {id}*;
                      symbols    {sym : num}*;
                      locks      {id : num}*;  {strict  ;}
                      { comment  {string}; }
                      { expand   {string}; }
                      { newphrase }*

       delta     ::=  num
                      date       num;
                      author     id;
                      state      {id};
                      branches   {num}*;
                      next       {num};
                      { commitid id; }
                      { newphrase }*

       desc      ::=  desc       string

       deltatext ::=  num
                      log        string
                      { newphrase }*
                      text       string

       num       ::=  {digit | .}+

       digit     ::=  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

       id        ::=  {num} idchar {idchar | num}*

       sym       ::=  {digit}* idchar {idchar | digit}*

       idchar    ::=  any visible graphic character except special

       special   ::=  $ | , | . | : | ; | @

       string    ::=  @{any character, with @ doubled}*@

       newphrase ::=  id word* ;

       word      ::=  id | num | string | :

       Identifiers  are case sensitive.  Keywords are in lower case only.  The
       sets of keywords and identifiers can overlap.  In most environments RCS
       uses  the  ISO  8859/1  encoding:  visible graphic characters are codes
       041-176 and 240-377, and white space characters are codes  010-015  and
       040.
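
(Not part of the man page.) A rough C check for whether a token is
usable as an id, for instance a commitid, might look like the sketch
below. It assumes the ISO 8859/1 code ranges quoted above. It also
assumes, as an interpretation of the grammar rather than something the
man page states, that an id must contain at least one character that is
neither a digit nor a dot, so that it cannot be mistaken for a num. The
function names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Visible graphic characters: octal 041-176 and 240-377. */
static bool is_visible_graphic(unsigned char c)
{
    return (c >= 0x21 && c <= 0x7E) || c >= 0xA0;
}

/* The special characters from the grammar: $ , . : ; @ */
static bool is_special(unsigned char c)
{
    return c == '$' || c == ',' || c == '.' || c == ':' || c == ';' || c == '@';
}

/* True if s looks like an id: only idchars, digits and dots, with at
   least one idchar that is not a digit (assumption, see lead-in). */
bool is_rcs_id(const char *s)
{
    bool has_non_numeric = false;

    assert(s != NULL);
    if (*s == '\0')
        return false;
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; p++) {
        if (*p == '.')
            continue;                  /* allowed as part of a num */
        if (!is_visible_graphic(*p) || is_special(*p))
            return false;
        if (*p < '0' || *p > '9')
            has_non_numeric = true;
    }
    return has_non_numeric;
}
```

Every character of the base62 alphabet is an idchar, so a commitid made
of those characters passes as long as it is not all digits.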

       Dates,   which   appear  after  the  date  keyword,  are  of  the  form
       Y.mm.dd.hh.mm.ss, where Y is the year, mm the month (01-12), dd the day
       (01-31),  hh the hour (00-23), mm the minute (00-59), and ss the second
       (00-60).  Y contains just the last two digits of  the  year  for  years
       from  1900 through 1999, and all the digits of years thereafter.  Dates
       use the Gregorian calendar; times use UTC.
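
(Not part of the man page.) The date rule above, including the two-digit
year for 1900 through 1999, can be sketched in C as follows.
rcs_format_date is a hypothetical name used only here. The sketch relies
on gmtime(), which is not thread-safe, and it does not try to produce
the leap second value 60.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Format t (taken as UTC) as an RCS date, Y.mm.dd.hh.mm.ss, into buf.
   Returns the number of characters written (not counting the NUL), or
   -1 if the time cannot be converted or buf is too small. */
int rcs_format_date(time_t t, char *buf, size_t size)
{
    struct tm *utc;
    int year;
    int n;

    assert(buf != NULL);
    utc = gmtime(&t);
    if (utc == NULL)
        return -1;

    year = utc->tm_year + 1900;
    if (year >= 1900 && year <= 1999)
        year -= 1900;                  /* only the last two digits */

    n = snprintf(buf, size, "%02d.%02d.%02d.%02d.%02d.%02d",
                 year, utc->tm_mon + 1, utc->tm_mday,
                 utc->tm_hour, utc->tm_min, utc->tm_sec);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

On a system where time_t counts seconds since 1970-01-01 00:00:00 UTC,
the timestamp of this message (1222098828) would come out as
2008.09.22.15.53.48, and a time in 1999 would start with 99.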

       The commitid is followed by an id token. This token is intended  to  be
       unique across multiple files and is used to help group files as being a
       part of the same logical commit.  This token must uniquely identify the
       commit  operation  that was applied to a set of RCS files.  In particu-
       lar, it must be unique among all the commitids in this file.

       The newphrase productions in the grammar are reserved for future exten-
       sions  to  the  format  of RCS files.  No newphrase will begin with any
       keyword already in use.

       The delta nodes form a tree.  All nodes whose numbers consist of a sin-
       gle  pair (e.g., 2.3, 2.1, 1.3, etc.)  are on the trunk, and are linked
       through the next field in order of decreasing numbers.  The head  field
       in  the  admin node points to the head of that sequence (i.e., contains
       the highest pair).  The branch node in the  admin  node  indicates  the
       default  branch  (or  revision) for most RCS operations.  If empty, the
       default branch is the highest branch on the trunk.

       All delta nodes whose  numbers  consist  of  2n  fields  (n>=2)  (e.g.,
       3.1.1.1,  2.1.2.2, etc.)  are linked as follows.  All nodes whose first
       2n-1 number fields are identical are linked through the next  field  in
       order  of  increasing  numbers.  For each such sequence, the delta node
       whose number is identical to the first 2n-2 number fields of the deltas
       on  that  sequence  is called the branchpoint.  The branches field of a
       node contains a list of the numbers of the first nodes of all sequences
       for which it is a branchpoint.  This list is ordered in increasing num-
       bers.
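
(Not part of the man page.) The numbering rules above can be turned into
a small C sketch: a revision with two fields is on the trunk, and for a
branch revision with 2n fields (n>=2) the branchpoint is its first 2n-2
fields. The function names are invented for this illustration, and the
code assumes the revision string is already a well-formed num.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Number of dot-separated fields in rev; 0 for an empty string. */
size_t rev_field_count(const char *rev)
{
    size_t n;

    assert(rev != NULL);
    if (*rev == '\0')
        return 0;
    n = 1;
    for (const char *p = rev; *p != '\0'; p++)
        if (*p == '.')
            n++;
    return n;
}

/* Trunk revisions consist of a single pair, e.g. 1.3. */
bool rev_on_trunk(const char *rev)
{
    return rev_field_count(rev) == 2;
}

/* Copy the branchpoint of a branch revision (its first 2n-2 fields)
   into out.  Returns 0 on success, or -1 if rev is not a branch
   revision with 2n fields (n>=2) or out is too small. */
int rev_branchpoint(const char *rev, char *out, size_t size)
{
    size_t fields = rev_field_count(rev);
    size_t keep, seen = 0, len = 0;

    if (fields < 4 || fields % 2 != 0)
        return -1;
    keep = fields - 2;
    /* stop at the dot that follows field number `keep` */
    while (rev[len] != '\0') {
        if (rev[len] == '.' && ++seen == keep)
            break;
        len++;
    }
    if (len + 1 > size)
        return -1;
    memcpy(out, rev, len);
    out[len] = '\0';
    return 0;
}
```

With the revisions from the diagram that follows, 1.2.2.1 has
branchpoint 1.2 and 1.2.2.1.1.1 has branchpoint 1.2.2.1.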

       The following diagram shows an example of an RCS file's organization.

                                  Head
                                    |
                                    |
                                    v                        / \
                                ---------                   /   \
          / \          / \      |       |      / \         /     \
         /   \        /   \     |  2.1  |     /   \       /       \
        /     \      /     \    |       |    /     \     /         \
       /1.2.1.3\    /1.3.1.1\   |       |   /1.2.2.2\   /1.2.2.1.1.1\
       ---------    ---------   ---------   ---------   -------------
           ^            ^           |           ^             ^
           |            |           |           |             |
           |            |           v           |             |
          / \           |       ---------      / \            |
         /   \          |       \  1.3  /     /   \           |
        /     \         ---------\     /     /     \-----------
       /1.2.1.1\                  \   /     /1.2.2.1\
       ---------                   \ /      ---------
           ^                        |           ^
           |                        |           |
           |                        v           |
           |                    ---------       |
           |                    \  1.2  /       |
           ----------------------\     /---------
                                  \   /
                                   \ /
                                    |
                                    |
                                    v
                                ---------
                                \  1.1  /
                                 \     /
                                  \   /
                                   \ /



IDENTIFICATION
       Author: Walter F. Tichy, Purdue University, West Lafayette, IN,  47907.
       Manual Page Revision: 5.6; Release Date: 1995/06/05.
       Copyright (C) 1982, 1988, 1989 Walter F. Tichy.
       Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995 Paul Eggert.

SEE ALSO
       rcsintro(1),  ci(1),  co(1), ident(1), rcs(1), rcsclean(1), rcsdiff(1),
       rcsmerge(1), rlog(1)
       Walter F. Tichy, RCS--A System for Version Control,  Software--Practice
       & Experience 15, 7 (July 1985), 637-654.



GNU                               1995/06/05                        RCSFILE(5)
$ 


