[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 1/4] sha1sum: use AF_ALG when available

From: Assaf Gordon
Subject: Re: [PATCH v2 1/4] sha1sum: use AF_ALG when available
Date: Wed, 25 Apr 2018 11:34:54 -0600
User-agent: NeoMutt/20170113 (1.7.2)

Hello Matteo,

On Wed, Apr 25, 2018 at 01:26:08PM +0200, Matteo Croce wrote:
> Linux supports accessing kernel crypto API via AF_ALG since
> version 2.6.38. Coreutils uses libcrypto when available and fallbacks to
> generic C implementation of various hashing functions.

Not exactly: coreutils (and every project which uses gnulib's md5/sha* modules)
defaults to using the generic C implementation.
gnulib automatically adds the option "--with-openssl" to "./configure",
which can be set explicitly to 'yes' or 'auto'.

To be conservative, I would suggest following the same method:
Add a "configure" flag instead of enabling a completely new interface
by default for every downstream project,
especially that it is tied to external components (the kernel / modules / etc).

If a distribution wants to enable by default, they can do it
for their package (and ensure it works well with their kernel).

> Use afalg_stream() only in sha1sum for now, but other hashes are possible.
> The speed gain really depends on the CPU type, on systems which doesn't use
> libcrypto ranges from ~10% to 320%.

It would be beneficial to know the improvements against coreutils+libcrypto
on the same CPUs, because if users build with "--openssl=yes" (which is 
considered the "fast" implementation), they would not benefit from your patch.
So they'll need to decide which one is preferable (and OpenSSL/libreSSL is 
on more CPUs and OSs, not just linux on a subset of CPUs).

> This is a test on a Intel(R) Xeon(R) CPU E3-1265L V2 and Debian stretch:
>     $ truncate -s 2GB 2g.bin
>     $ time sha1sum 2g.bin
>     752ef2367f479e79e4f0cded9c270c2890506ab0  2g.bin
>     real    0m4.829s
>     user    0m4.437s
>     sys     0m0.391s
>     $ time ./sha1sum-afalg 2g.bin
>     752ef2367f479e79e4f0cded9c270c2890506ab0  2g.bin
>     real    0m3.164s
>     user    0m0.000s
>     sys     0m3.162s

Regarding the above measurement:
On most modern filesystems "truncate -s" would create a sparse file,
requiring very little I/O.
And if the measurements above are indeed mostly CPU-bound
and not IO-bound, it indicates that hasing in the kernel
is only somewhat faster then in user-mode, but not significantly so
if you exclude I/O (not as dramatic difference as measured in
your previous email).

This also brings the question: in the previous emails you report
the improved *-afalg versions after the default versions - did
you clear the cache to exclude I/O effects?
(e.g. "sync; echo 3 > /proc/sys/vm/drop_caches")

> +    {
> +      /* sendfile() not possible, do a classic read-write loop */
> +      while ((size = fread (buf, 1, sizeof (buf), stream)))
> +        {
> +          if (send (ofd, buf, size, size == sizeof (buf) ? MSG_MORE : 0) == 
> -1)
> +            {

Few comments:
1. In the above loop, wouldn't a file that's an exact multiple of BLOCKSIZE
never be sent 0 as the 4th paramater?
And if so, assuming hashing still works because "MSG_MORE" is just a hint,
why not always send MSG_MORE ?

2. The code for sha*_stream() functions is a bit more elaborate,
and checks ferror/feof to accomodate EAGAIN - it might be useful
to replicate it here.

> +  if (size != hashlen)
> +    {
> +      fprintf (stderr, "Error from read (%zd vs %zd bytes): %s\n",
> +               size, hashlen, strerror (errno));
> +      ret = -EIO;

The fprintf might be problematic:
First, this is a cryptic error message, in the sense that
it indicate an I/O error (hinted by the word "read"),
but is actually a kernel crypto/driver issue.
If a user sees this message, there's no chance they'll know
what the problem is. It's better to be more explicit that
this relates to kernel crypto API problem.

Second, all other sha*_stream functions never output anything
to STDERR. Their interface is returning 0 for success, 1 for failure,
and setting errno for failure. Not sure if this is an official API
agreement or just de-facto, but it means that users of sha*_stream
functions should now be aware there might be output.

Lastly, if the decision is to keep the error message,
it's likely better to use gnulib's error() function, like so:

  error (0, errno, _("some error message %zd"), size);

Which will take care of strerror, and also enable translation.


On more general note, while the Crypto API is available since version 2.6.38,
There are still some issues popping here and there.
for example:

Another example, the libkcapi [1] library (referenced from the kernel crypto 
api documentation)
contains a work-around for a bug in kernels pre-4.9 [2].
I don't know if this is relevant for your implementation or not,
but if it is, it's worth checking for this and other issues.
  [1] http://www.chronox.de/libkcapi.html

It also seems OpenSSL version 1.1.0 added support for AF_ALG.
Perhaps it's better to leave it to them (since gnulib can already
use openssl easily)?

 - assaf

reply via email to

[Prev in Thread] Current Thread [Next in Thread]