bug-gnulib

af_alg benchmarks and performance


From: Bruno Haible
Subject: af_alg benchmarks and performance
Date: Tue, 08 May 2018 01:10:52 +0200
User-agent: KMail/5.1.3 (Linux/4.4.0-119-generic; KDE/5.18.0; x86_64; ; )

Hi all,

Thanks for your benchmarking help and explanations.

Let me try to summarize.

* We need to consider each of the algorithms md5, sha1, ..., sha256 separately,
  because each algorithm has different performance characteristics [1].
  This is due to the following factors:
    - Some non-Intel hardware has crypto devices. [2]
    - Intel hardware has special instructions for specific crypto
      algorithms. [3][4]
    - The Linux kernel has specially optimized code for specific crypto
      algorithms. [4]

* For the afalg_stream case (with regular files), for all algorithms,
  kernel crypto is faster than user-space crypto for sizes N above some
  break-even size N_0.
  Reasons:
    1. The sendfile call avoids copying the file data to user-space.
    2. The in-kernel crypto code _may_ (or may not) be faster than the
       plain C code from gnulib.
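
To make reason 1 concrete, here is a minimal sketch of hashing a regular file
through AF_ALG with a single sendfile call. It is not gnulib's afalg_stream
code; the function name kernel_hash_regular_file and its signature are made up
for illustration, and it assumes Linux with AF_ALG support and the
<linux/if_alg.h> header.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/if_alg.h>

/* Hypothetical helper (not part of gnulib): compute the ALG_NAME hash
   ("md5", "sha1", ...) of the regular file PATH inside the kernel.
   DIGEST_LEN must equal the digest size of the algorithm.
   Returns 0 on success, -1 on any failure (AF_ALG unavailable, not a
   non-empty regular file, short transfer); the caller is then expected
   to fall back to user-space code.  */
int
kernel_hash_regular_file (const char *path, const char *alg_name,
                          unsigned char *digest, size_t digest_len)
{
  struct sockaddr_alg sa;
  struct stat st;
  off_t off = 0;
  int cfd = -1, ofd = -1, ret = -1;
  int fd;

  assert (path != NULL && alg_name != NULL && digest != NULL);

  fd = open (path, O_RDONLY);
  if (fd < 0)
    return -1;
  if (fstat (fd, &st) != 0 || !S_ISREG (st.st_mode) || st.st_size <= 0)
    goto done;

  /* Select the kernel's implementation of the requested hash.  */
  memset (&sa, 0, sizeof sa);
  sa.salg_family = AF_ALG;
  strcpy ((char *) sa.salg_type, "hash");
  strncpy ((char *) sa.salg_name, alg_name, sizeof sa.salg_name - 1);

  cfd = socket (AF_ALG, SOCK_SEQPACKET, 0);
  if (cfd < 0 || bind (cfd, (struct sockaddr *) &sa, sizeof sa) != 0)
    goto done;
  ofd = accept (cfd, NULL, NULL);
  if (ofd < 0)
    goto done;

  /* One sendfile call: the file data goes from the page cache to the
     hash without passing through a user-space buffer.  A partial
     transfer is treated as failure rather than retried, because a
     second call would not continue the same hash computation.  */
  if (sendfile (ofd, fd, &off, (size_t) st.st_size) != st.st_size)
    goto done;

  /* Reading from the operation socket returns the digest.  */
  if (read (ofd, digest, digest_len) != (ssize_t) digest_len)
    goto done;
  ret = 0;

 done:
  if (ofd >= 0)
    close (ofd);
  if (cfd >= 0)
    close (cfd);
  close (fd);
  return ret;
}
```

A caller would pass, for example, "sha1" and a 20-byte buffer, and run the
plain C implementation whenever the function returns -1.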

* For the afalg_buffer case (and, btw, also the afalg_stream case with
  non-regular files), it depends on the algorithm and CPU capabilities:
  * If the in-kernel crypto code has roughly the same speed as the plain
    C code from gnulib,
    then we observe that kernel crypto is always slower than user-space crypto,
    because of the added overhead of copying the data to kernel space.
  * If the in-kernel crypto code is faster than the plain C code from gnulib
    by at least, say, 10%,
    then kernel crypto is faster than user-space crypto, for sizes N > N_0,
    because the faster algorithm outweighs the cost of copying the data to
    kernel space.
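
The two sub-cases above can be captured in a simple linear cost model. This is
only a sketch under assumptions of mine, not something measured in this thread:
user-space time is N / v_user, and kernel time is a fixed system call overhead
plus N / v_kernel plus N / v_copy for copying the buffer into kernel space. The
function name break_even_size and all of its parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical cost model (an assumption, not a benchmark result):
     T_user(N)   = N / user_bps
     T_kernel(N) = overhead_sec + N / kernel_bps + N / copy_bps
   Returns the break-even size N_0 in bytes above which the kernel path
   is faster, or -1.0 if the kernel path never wins under this model.  */
double
break_even_size (double user_bps, double kernel_bps, double copy_bps,
                 double overhead_sec)
{
  assert (user_bps > 0 && kernel_bps > 0 && copy_bps > 0);
  assert (overhead_sec >= 0);

  /* Time saved per byte by using the kernel path; it must be positive
     for the fixed overhead ever to be amortized.  */
  double gain_per_byte = 1.0 / user_bps - 1.0 / kernel_bps - 1.0 / copy_bps;
  if (gain_per_byte <= 0)
    return -1.0;
  return overhead_sec / gain_per_byte;
}
```

With equal hashing speeds the per-byte gain is negative, so the function
returns -1.0, which corresponds to the "always slower" case. With made-up
inputs of 400 MB/s in user space, 800 MB/s in the kernel, 8 GB/s for the copy
and 4 microseconds of overhead, N_0 comes out at about 3556 bytes.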

* The reasons for our disappointment are:
  - The original presentation [2] was misleading because, as Assaf noticed [5],
    a large portion of the reported speedup (at least for Intel processors)
    is due to a test case that
      1. is a corner case,
      2. exhibits a speedup that is due to sendfile(), not a different crypto
         implementation.
    Lesson to be learned: When you present a new feature and motivate it with
    speedups, please always also include an _average_ use case (i.e. non-sparse
    files, or memory regions not completely filled with zeroes)!
  - We all have access to machines with x86_64 CPUs, and only some of them have
    special crypto instructions.
  - The system calls have some cost. [6]

Bruno

[1] https://lists.gnu.org/archive/html/bug-gnulib/2018-05/msg00043.html
[2] https://lists.gnu.org/archive/html/bug-gnulib/2018-04/msg00062.html
[3] https://en.wikipedia.org/wiki/AES_instruction_set
[4] https://lists.gnu.org/archive/html/bug-gnulib/2018-05/msg00038.html
[5] https://lists.gnu.org/archive/html/bug-gnulib/2018-04/msg00088.html
[6] https://lists.gnu.org/archive/html/bug-gnulib/2018-05/msg00044.html



