bug-gnulib

Re: memxor


From: Nikos Mavrogiannopoulos
Subject: Re: memxor
Date: Tue, 12 Apr 2011 09:53:33 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.14) Gecko/20110223 Thunderbird/3.1.8

On 04/12/2011 02:29 AM, Bruno Haible wrote:

>> As an unrelated suggestion for memxor.c I had a discussion with 
>> another project about that and we concluded on a gmp-based version
>> that is orders of magnitude faster. It would be nice if gnulib
>> included the optimized code by default.
> If you mean gmp's mpn_xor function, written in assembly language: It
> would be a bad idea to include assembly language code in gnulib, 
> because the maintenance cost of such code is very high. Assembly
> language source code depends on

Hello,
I meant the linked memxor implementation [0]. It is plain C code, and it
XORs one CPU word at a time rather than one byte at a time.

[0].
http://cvs.lysator.liu.se/viewcvs/viewcvs.cgi/lsh/nettle/memxor.c?rev=1.4&root=lsh&view=auto

There are two functions there. memxor() is the optimized version
of your memxor(), and memxor3() XORs data from two source addresses
and stores the result at a third one.
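
To illustrate the word-at-a-time idea, here is a minimal sketch. It is
not the nettle code linked above: the names memxor_words() and
memxor3_words() are hypothetical, and it assumes that going through
memcpy() for each word is acceptable, which keeps unaligned buffers
well-defined in portable C instead of handling alignment by hand.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, not the nettle or gnulib implementation.
   XORs n bytes of src into dest, one unsigned long at a time. */
void *memxor_words(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Word loop: memcpy makes access through unaligned pointers safe;
       compilers typically turn these copies into plain loads/stores. */
    while (n >= sizeof(unsigned long)) {
        unsigned long a, b;
        memcpy(&a, d, sizeof a);
        memcpy(&b, s, sizeof b);
        a ^= b;
        memcpy(d, &a, sizeof a);
        d += sizeof a;
        s += sizeof a;
        n -= sizeof a;
    }

    /* Tail: fewer than sizeof(unsigned long) bytes remain. */
    while (n > 0) {
        *d++ ^= *s++;
        n--;
    }
    return dest;
}

/* Hypothetical three-operand variant: dest = a XOR b over n bytes.
   Assumes dest does not partially overlap a or b. */
void *memxor3_words(void *dest, const void *a, const void *b, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    while (n >= sizeof(unsigned long)) {
        unsigned long x, y;
        memcpy(&x, pa, sizeof x);
        memcpy(&y, pb, sizeof y);
        x ^= y;
        memcpy(d, &x, sizeof x);
        d += sizeof x;
        pa += sizeof x;
        pb += sizeof x;
        n -= sizeof x;
    }

    while (n > 0) {
        *d++ = *pa++ ^ *pb++;
        n--;
    }
    return dest;
}
```

A call such as memxor_words(buf + 1, key + 3, len) works for any
offsets, since no pointer is ever dereferenced as an unsigned long.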

[...]
> - various alignments.
It handles any misalignment as well.

> Additionally, memxor is not speed critical, AFAIK. Can you tell me
> one program which spends more than 20% of its runtime in memxor?

I saw a 10% speed-up in a web server that used gnutls with the optimized
version of memxor. That is because CBC encryption mode uses XOR heavily.
10% is an enormous speed-up considering that memxor is a very small part
of the encryption process.
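
As a rough illustration of why CBC leans on XOR: every plaintext block
is XORed with the previous ciphertext block (or the IV) before it is
encrypted. The sketch below is not gnutls code. All names are
hypothetical, the 16-byte block size is an assumption, and
toy_encrypt_block() is a placeholder with no security value that only
stands in for a real block cipher.

```c
#include <stddef.h>

#define BLOCK_SIZE 16  /* assumed block size, e.g. that of AES */

/* Signature of an in-place single-block encryption function. */
typedef void (*block_fn)(unsigned char block[BLOCK_SIZE]);

/* Placeholder "cipher" for demonstration only: NOT encryption. */
void toy_encrypt_block(unsigned char block[BLOCK_SIZE])
{
    for (size_t i = 0; i < BLOCK_SIZE; i++)
        block[i] = (unsigned char)(block[i] + 1);
}

/* The memxor step: dest ^= src over one block, byte by byte here. */
static void xor_block(unsigned char *dest, const unsigned char *src)
{
    for (size_t i = 0; i < BLOCK_SIZE; i++)
        dest[i] ^= src[i];
}

/* CBC encryption in place over nblocks whole blocks. */
void cbc_encrypt_sketch(block_fn encrypt_block,
                        const unsigned char iv[BLOCK_SIZE],
                        unsigned char *data, size_t nblocks)
{
    const unsigned char *prev = iv;

    for (size_t i = 0; i < nblocks; i++) {
        unsigned char *cur = data + i * BLOCK_SIZE;
        xor_block(cur, prev);   /* one XOR pass per block */
        encrypt_block(cur);     /* cur now holds ciphertext */
        prev = cur;             /* chain into the next block */
    }
}
```

The XOR pass touches every byte of the data once, so its cost scales
with the amount of data encrypted, the same as the cipher itself.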

> memxor implements the addition in the GF(2^q) field, when using a
> vector space representation of the elements. But with this
> representation, the multiplication is much more costly and therefore
> in practice overshadows the time spent in memxor. And with the other,
> logarithmic, representation of elements of GF(2^q), the
> multiplication is fast - a simple addition - and the addition is a
> table lookup. So, no memxor in this case at all.

This does not relate to my use-case. I only need memxor.

> In summary, I find it pointless to optimize memxor in isolation. 
> (Whereas it makes sense to optimize memxor if all the other
> arithmetic operations are also optimized.)

The usage of memxor() is indeed exceptional, and I was surprised to find
it in gnulib. However, if you have it there, I see little point in having
it as a for loop that XORs byte by byte.


regards,
Nikos


