monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] Re: nvm.ssh-agent review


From: hendrik
Subject: Re: [Monotone-devel] Re: nvm.ssh-agent review
Date: Wed, 14 Feb 2007 09:34:36 -0500
User-agent: Mutt/1.5.9i

On Tue, Feb 13, 2007 at 08:42:20PM -0800, Justin Patrin wrote:
>
> Well, the problem is that the packet construction is recursive. You
> need the length of the entire key, for example, before inserting it
> into the packet data, then the length of all of that data before
> inserting it into the packet itself. It's possible that stringstreams
> could be better but what is there works now. I'd suggest if we want to
> rewrite this we do it after this branch lands.

I once used a framing protocol that prefixed items with their lengths.  
Items could contain other items (but a higher level protocol was needed 
to determine whether they actually did).

Marshalling the items for writing was a two-pass process.  But the 
first pass did not need to do process the text to be except to determine 
its length.  *All* the first pass did was generate a tree of 
lengths, so that the second pass could do the actual writing.

To make things interesting, the length encoding was itself recursive, to 
permit unbounded lengths -- you might have to reimplement when your 
implementation limits on storage capacity or word-length increased, but 
you wouldn't have to recode any existing data.

This scheme we called VLE, for Varying Length Encoding, and was designed 
as a compromise between the encoding of short and long strings.  
single ASCII characters, for example could be encoded as themselves, 
without extra overhead.

In case you're interested here's how nonnegative integers were encoded:

Up to seven bits:
        0 x x x x x x x

Up to 13 or 21 bits:
        1 0 L x x x x x  x x x x x x x x  ...
        L=0: two bytes containing 13 bits
        L=1: three bytes containing 21 bits

More bytes:
        1 1 0 L L x x x  x x x x x x x x ...
        LL + 4 bytes containing 8 * (L + 4) - 5 bits

Still more bytes:
        1 1 1 0 L L L L  x x x x x x x x ...
        LLLL + 22 bytes containing (L - 1) * 8 bits

Recursively many bytes:

        1 1 1 1 0 0 0 0  <LLLLL...., encoded in VLE>  <LLLLL.... bytes of bits>

We also used 1 1 1 1 E E E E with bit combinations other than 0 0 0 0 
as escape codes for higher-level protocols.

Byte streams were encoded with the shortest string that could accomodate 
them (with the suppression of leading zero bits in the 
first byte).  There was some convention for the case of the first byte 
being all zero.

We found this hierarchical framing mechanism quite useful.

Its big drawback for these latter days is that texually-oriented 
revision-control systems tend to break the string-lengths during 
merging, rendering the data quite useless.  If a VLE string of length L 
has had a single bytes inserted in different places in two revisions, 
merging the revisions tends to result in a length code of L+1, though 
the string is now of length L+2.

-- hendrik





reply via email to

[Prev in Thread] Current Thread [Next in Thread]