help-cfengine

Re: cfengine 1.x protocol inefficiencies


From: Mark Burgess
Subject: Re: cfengine 1.x protocol inefficiencies
Date: Wed, 25 Oct 2000 22:33:54 +0200 (MET DST)

On 25 Oct, Gregory P. Smith wrote:
>> > (Side note: the cfengine protocol is not designed well for doing lots
>> > quickly; it sends lots of null padding in the protocol in both
>> > directions and is latency sensitive by always waiting for the response
>> > from one server operation before issuing another request.  Despite all
>> > this, it does get the job done, just not nearly as well as it could)
>> > 
>> > Greg
>> 
>> cfd 1.6.0 is extremely stable and the claims
>> of its inefficiency are slightly exaggerated I think...
>> 
>> Mark
> 
> For most people's use, Mark is right and the inefficiencies are hardly
> noticed.
> 
> You'll notice performance issues when syncing large directory trees.
> It makes it not worth using cfengine itself to keep big trees in sync
> across a large number of hosts other than to copy and extract an
> updated tar file [a very good trick] or spawn an rsync process.
> 
> the padding problem:  Every message to the server is padded to 4kb
> with zeros.  So if you're asking for the status of 2000 files, you're
> sending 8megs of data to the cfd server to do it, only ~80kb of which
> contained useful data (the rest was zeros).  Now multiply this by a
> good number of clients (let's say 400 in this example) doing this once
> an hour and you've got 3.2gigs of data being sent to your cfd server
> just to -ask- it about file timestamps.  In addition to this, all
> server responses are padded the same way so the server has to send out
> at least that much data in response.  This just about -saturates- a
> full duplex 10mbit/sec network connection on the server for such a
> small task.  Add to this traffic for actually getting updated files
> when some or many of them have changed and you can say goodbye to your
> network.
> 
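The padding arithmetic above can be checked with a few lines of Python. This is only a sketch of the estimate, not a measurement: it assumes "4kb" means exactly 4096 bytes and that each request carries about 40 bytes of useful data (the ~80kb quoted above divided by 2000 requests). All names are illustrative.

```python
# Rough check of the padding estimate. Assumptions (not from cfengine itself):
# a padded message is exactly 4096 bytes, and a stat request carries about
# 40 useful bytes (~80 kB / 2000 requests).
PAD_BYTES = 4096
USEFUL_BYTES = 40
FILES = 2000
CLIENTS = 400


def padded_traffic(files, clients, pad=PAD_BYTES):
    """Bytes sent to the server when every request is padded to `pad` bytes."""
    return files * clients * pad


def avg_mbit_per_s(total_bytes, seconds=3600):
    """Average one-way rate if `total_bytes` is spread evenly over `seconds`."""
    return total_bytes * 8 / seconds / 1e6


per_client = padded_traffic(FILES, 1)       # one client, one run
per_hour = padded_traffic(FILES, CLIENTS)   # all clients, once an hour
useful_fraction = USEFUL_BYTES / PAD_BYTES  # share of each message that is data

print(f"per client: {per_client / 1e6:.2f} MB")
print(f"per hour:   {per_hour / 1e9:.2f} GB")
print(f"avg rate:   {avg_mbit_per_s(per_hour):.2f} Mbit/s in each direction")
print(f"useful:     {useful_fraction:.1%} of each request")
```

Under these assumptions the requests alone come to roughly 8.2 MB per client and 3.3 GB per hour, an average of about 7.3 Mbit/s in one direction, with about 1% of each message carrying data.
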
> the non-pipelining/latency problem:  On a 10mbit/sec network the
> latency is about 8-9ms to send 4kb to a server and get a 4kb response
> from the server (it's 1-2ms for a tiny amount of data).  The current
> client sends one request and waits for its response before sending the
> next one.  To stat 2000 files this way would take about 17 seconds
> with the padding and 3 seconds without.  If the requests were
> pipelined (i.e. all stat requests were just sent in a row without
> waiting for the previous ones' responses) the time for the stat would
> be bounded by the bandwidth between the client and server rather than
> the latency as well as taking advantage of full duplex communications
> and halving the stat time.  The stat time is not dramatic with only
> 2000 files, but multiply that by 10 and you see the problem...
> 
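The difference between the two approaches can be put into a small model. This is an idealized sketch: it takes the midpoints of the round-trip figures quoted above (8.5 ms padded, 1.5 ms unpadded), assumes 4096-byte padded messages and 40-byte unpadded ones, and ignores TCP/IP overhead and server processing time. The function names are made up for illustration.

```python
# Idealized timing model; all constants are assumptions taken from or
# derived from the figures in the message above.
def stop_and_wait_seconds(requests, round_trip_s):
    """One request at a time: total time is latency-bound."""
    return requests * round_trip_s


def pipelined_seconds(requests, bytes_per_message, link_bits_per_s, round_trip_s):
    """All requests sent back to back: time to push the bytes through the
    link in one direction, plus one round trip for the final response."""
    return requests * bytes_per_message * 8 / link_bits_per_s + round_trip_s


LINK = 10e6  # 10 Mbit/s

padded_serial = stop_and_wait_seconds(2000, 0.0085)
unpadded_serial = stop_and_wait_seconds(2000, 0.0015)
padded_pipelined = pipelined_seconds(2000, 4096, LINK, 0.0085)
unpadded_pipelined = pipelined_seconds(2000, 40, LINK, 0.0015)

for label, seconds in [
    ("padded, one at a time", padded_serial),
    ("unpadded, one at a time", unpadded_serial),
    ("padded, pipelined", padded_pipelined),
    ("unpadded, pipelined", unpadded_pipelined),
]:
    print(f"{label:26s} {seconds:7.3f} s")
```

The serial cases reproduce the 17 second and 3 second figures. In this model, pipelining padded messages is limited by the time needed to move about 8 MB over the link, while pipelining unpadded messages takes well under a second.
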
> These days many if not most sites have fast 100mbit/sec ethernet
> between the majority of their hosts.  Even so, cfengine uses a
> disproportionate amount of bandwidth and becomes much less useful over
> leased-lines/WANs (where latency is much higher) if a large number of
> file stats/copies are desired.
> 
> I hope that any work on cfengine 2.x will include a more efficient
> protocol.  If anything is done, at least stop padding messages and use
> much simpler length prefixed ones or newline terminated ones like the
> rest of the protocols in the world.  (I'm in favor of length prefixing
> everything myself; newline termination is asking for disaster when
> someone forgets to length check the data anywhere in the code)
> 
> Greg
> 
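For illustration, length-prefixed framing of the kind suggested above might look like the following. This is a generic sketch, not the cfengine wire format: the 4-byte big-endian length header, the MAX_FRAME limit, and the "stat ..." request strings are all assumptions chosen for the example.

```python
import struct

# Hypothetical framing: 4-byte unsigned big-endian length, then the payload.
HEADER = struct.Struct("!I")
MAX_FRAME = 64 * 1024  # assumed upper bound; reject anything larger


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length instead of padding it."""
    if len(payload) > MAX_FRAME:
        raise ValueError("payload too large")
    return HEADER.pack(len(payload)) + payload


def decode_frames(buffer: bytes):
    """Return (complete_frames, leftover_bytes) from a receive buffer."""
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        if length > MAX_FRAME:
            # Check the declared length before trusting it.
            raise ValueError("declared frame length exceeds limit")
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # frame not fully received yet
        frames.append(buffer[offset + HEADER.size:end])
        offset = end
    return frames, buffer[offset:]


# Two made-up requests sent back to back, followed by a partial header.
stream = encode_frame(b"stat /etc/passwd") + encode_frame(b"stat /etc/hosts")
frames, rest = decode_frames(stream + b"\x00\x00")
print(frames, rest)
```

Each request here costs its own length plus 4 bytes on the wire, and the receiver can split a stream of back-to-back requests without waiting for padding to arrive.
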

I have just been brushing up on the details of the protocol
used in current versions and as far as I can see there is
only one remaining source of inefficiency: when stat-ing files.

Of course, this is an operation which is performed for every file,
regardless of whether it needs to be copied. I shall look into
ways in which this can be improved without breaking the
other important things about the protocol. I don't suppose that it is
difficult; it has just not been a priority in my (these days)
impossible schedule.

Another thing I intend to do is allow drop-in replacements
for the actual transfers of data, using rsync, ssh or whatever.
The stat part will probably have to remain in cfengine.

Mark

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Work: +47 22453272            Email:  Mark.Burgess@iu.hio.no
Fax : +47 22453205            WWW  :  http://www.iu.hio.no/~mark
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~




