bug-hurd
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Yet another updated entropy patch


From: Marcus Brinkmann
Subject: Re: Yet another updated entropy patch
Date: Sun, 15 Jul 2007 22:53:02 +0200
User-agent: Wanderlust/2.14.0 (Africa) SEMI/1.14.6 (Maruoka) FLIM/1.14.8 (Shijō) APEL/10.6 Emacs/23.0.0 (i486-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)

At Fri, 13 Jul 2007 22:50:00 -0400,
Michael Casadevall <sonicmctails@gmail.com> wrote:
> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> (this is a response to both emails about said patch)
> 
> Marcus: I don't think I've ever been nitpicked that badly in my  
> life.  Congratulations!

I guess it's tough love :) :) :)

> On Jul 13, 2007, at 4:30 AM, Marcus Brinkmann wrote:
> >>      AM_CONDITIONAL([enable_kmsg], [false])
> >
> > I was going to suggest uppercase for automake conditionals, but you
> > can refer to prior art, so nevermind :)
> >
> >> ! 
> >
> > Where have all your page breaks gone?
> >
> 
> What page breaks?

In my editor, the diff shows that you removed a control L character,
printed as "^L", which is a form feed or page break character.  If you
print it on a 9-inch matrix printer, it will flush the current page.
In the source we use it to seperate major parts in the same file, for
example the header with includes from global variables and forward
declarations and logically distint interfaces (for example helper
functions and public functions).
 
> >> ! 
> >
> >> + #ifdef MACH_ENTROPY
> >> +      // Lets grab the cyclinder numbers for entropy
> 
> > Typo "cyclinder" -> "cylinder" and "Lets" -> "Let's" or even "We".  Do
> > not use C++ style comments.  End your sentences with a period. <- The
> > little thing at the end.  :)
> >
> 
> If you were rating Mach on being grammatically correct, you'd be in  
> for the
> shock of your life. Periods are the exceptions, not the rule (I did  
> however add it
> to suit thy master, and I did correct the typos; if you know of a C/C+ 
> + aware spell
> checker that works with vim, I'd use it in a heartbeat)

The bunch of Mach was not written by us, and we can always strive to
make it better.  What's vim?[1]

[1] Just kidding.

> >> +      entropy_putdata(prev, sizeof(io_req_t));
> >
> > Put a spacef before an open parenthesis.  This should be:
> >
> >         entropy_putdata (prev, sizeof (io_req_t));
> >
> > Read the GNU coding style standards as well, if you haven't already.
> >
> 
> Changed. I have read the GNU coding style, and quite honestly, it  
> just makes
> code look very ugly to me. I prefer classic K&R with tabs; much  
> cleaner and
> easier code to read without taking up too much screen real estate.

Ok, I can totally understand that.  Every coder who works long on a
uniform code base will train his brain to like what he uses while
every diversion makes him uncomfortable.  That's just part of how our
brains function, and it is actually useful as long as you don't need
to work on a code base using a different style.

Having a uniform code base that makes one group of hackers happy and
another group of hackers uncomfortable is unfortunate.  However, using
mixed styles and thus making everybody uncomfortable is a gross mistake.

You could object that GNU Mach is already a horrible mess.  However,
that's only because there are overriding concerns: It is inherited
code, and being able to trace the changes to that code base properly
is more important than convenience.  However, tracing changes through
gratuitous formatting style changes is very hard.

> >> ! #ifdef MACH_KERNEL
> >> ! #ifdef MACH_ENTROPY
> >
> > Maybe
> >
> > #if defined (MACH_KERNEL) && defined (MACH_ENTROPY)
> >
> 
> Changed. I always though #if changed() was a non-standard extension
> hence why I avoided using it.

Mmh, I don't know.  Anyway, we require GCC so it's not worth sweating
over.
 
> > to keep it on a single line?
> >
> >> !          return ((*cn_tab->cn_getc)(cn_tab->cn_dev, 1));
> >
> > Should be
> >             return ((*cn_tab->cn_getc) (cn_tab->cn_dev, 1));
> >
> > space-wise, but don't change (or pay attention to) existing code like
> > this:
> >
> >>                    return ((*romgetc)(1));
> >
> >
> 
> Changed to tabs. (you really love to nitpick, don't you :-P)

Sorry, I didn't mean the tabs, but the white space between the
parenthesis.
 
> >> + #define entropyname              "entropy"
> >> + extern int       entropyopen(), entropyclose(), entropyread(),
> >> entropygetstat();
> >
> > Wrap your lines at 78 characters.  Use software which doesn't wrap
> > overlong lines for you when you cut&paste it in emails.
> >
> 
> I'll switch back to using Pine with fetchmail then, it's just it  
> doesn't have native support for GPG signing messages.

Actually, what if you attach the file rather than inlining it?  That
could work just as well.

> >> +  /***
> >> +   * Kirkeby put the get entropy function here, so will I
> >> +   * the OR causes it to only get a some of the keypresses
> >> +   * making this even more random when it fires (he got this
> >> +   * from Linux originally
> >> +   ***/
> >
> > Drop the ASCII art (***), end sentences with a period and match
> > parenthesis (you are missing this one ->).  There should be a comma
> > before "making".  It would be nice to let the reader know who Kirkeby
> > is and where you got it from.
> >
> 
> You call it art, I call it readability. I changed it to this:

> /* Sune Kirkeby's entropy patch (which was a port of the
>   * linux entropy drivers for GNU mach) placed the keyboard
>   * entropy source here. I looked at that for an idea of
>   * where how to do write this driver. */

Drop the superfluous * characters.  I don't consider it more readable,
but even it is more readable, there is an overriding concern: You are
formatting the text across two dimensions, while it should be
formatted as a one dimensional text.  Two examples:

1) A screen reader will read your text aloud as: "... a port of the
ASTERISK linux entropy ..."

2) If you edit the text, you have to edit the *'s as well.  Actually,
emacs is smart enough to do this for you in most circumstances, but
anyway, it's just wrong.

> >> +  entropy_putchar((buttonchanges + moved.mm_deltaX
> >> +                  + moved.mm_deltaY) | 0122);
> >
> > This is one of the most important lines in the whole patch for
> > graphical desktop systems.  The mouse is one of the higher quality
> > entropy generators in the system.  What's the OR 0122 for here?
> >
> 
> The OR 122 is from how Linux does it. Now that I actually sit down  
> and think about it,
> especially with the addition of twisting code, it doesn't really make  
> too much sense to
> have it there, so its gone!

They surely had some reason to put it in.  It would be good to
understand it before changing it.
 
> As for entropy_init, it SHOULD be called by mach when it sets up its  
> own internal devices, or the first time the driver is opened (see  
> i386/i386at/conf.c). That being said, mach makes entropic output  
> (such as from the console device) before the initialization happens  
> so both putchar and putdata check and then initialize the driver. If  
> you want to recommend a good spot to put entropyinit so I don't need  
> this, I'd be glad to do it)

Mmh, I see.  Do we have static lock initializers in Mach?  If yes, you
could use them.  If not, we might be able to add them.  If that is not
feasible, you might just leave it as it is and put in the NCPUS
compile time check.

> > This is a shame!  You shall not waste perfectly good entropy like
> > this.  In fact, you just convinced me that we should do pool mixing in
> > the kernel.  (But this is not the only reason).
> >
> >> +   // Advance the pointer for the next write
> >> +   entropy_write_offset += 1;
> >
> 
> I had it like this originally. At least when I sample I/O, if the  
> device is constantly
> getting entropy, it actually slows down the system by at least 5% to  
> 10% percent.

What does Linux do?

> Other things like keyboard and mouse, which aren't as time sensitive  
> as I/O aren't effected.
> A potential solution is below
> 
> > Now for the more serious stuff: All entropy ain't equal.  Entropy from
> > the network, keyboard, hard drive and mouse have very different
> > quality.  There is no perfect way to estimate the amount of entropy
> > from any of these sources, but I will give you a quick example why
> > this is a problem.  Consider keyboard input.  Typically, we use maybe
> > 70-80 characters, and not all equally often.  However, you store these
> > in 8 bit which can hold 256 distinct values.  Clearly the 256 distinct
> > possible values are not going to be evenly distributed.  Some will
> > never occur, some will occur more often than others.  Network packets
> > are similar, and mouse data is also not perfectly random (in fact, you
> > OR a constant into the byte, and these bits will always be on no
> > matter what).  In any case, entropy is often over-rated.
> >
> > There are two things missing in your patch:
> >
> > 1. Entropy, when collected, must be weighted with a conservative
> > qualify *estimation*.  This is difficult to do, but the actual
> > estimations can be tweaked.  However, the infrastructure must be put
> > in place.  Linux for example separates the process of adding bytes to
> > the pool (__add_entropy_words) and crediting the pool with entropy
> > (credit_entropy_store).  An older version used to have a second
> > parameter along with the entropy byte, I think.
> >
> 
> That itself isn't that hard, I can simply add a new argument to the  
> end of putchar and putdata.
> I'm not really sure a good solution to this problem, but if we add  
> quality, what we could do is
> only sample from I/O sources to increase the general quality, and  
> have a configurable setting
> on when the buffer should top off so to speak.
> 
> For instance, lets say keyboard entropy is a 2, mouse entropy is a 4,  
> and I/O is a 7 (on a scale of 1-10). For the sake of
> simplicity, lets make our imaginary buffer 4 bites long, and lets  
> assume we have get four bites of entropy from the keyboard.
> Entropy Buffer: abcd
> Entropy Quality : 8 (2*4)
> 
> The entropy device should block if the quality is above a specific  
> point. At this point, since
> no more keyboard entropy will make the entropy any strong, it should  
> block

If you use pool mixing, adding data is never wrong.
 
> Now lets assume the user starts X11, and the system gets some data  
> from the mouse. The
> driver shouldn't block when getting mouse entropic data because it  
> will raise the quality
> 
> Entropy Buffer: 8231
> Entropy Quality: 16
> 
> At this point, the device should block taking in mouse or keyboard  
> entropy
> because it will not improve the entropy, but it still should accept  
> disk I/O
> 
> Now lets assume the machine needs to get VM, and swaps out
> 
> Entropy Buffer: (cylinder numbers)
> Entropy Quality: 28 (7*4)
> 
> Now the device would block completely because its at its max quality  
> point
> 
> Now doing this isn't THAT hard (although figuring how to do it in a  
> clean way that doesn't
> use loads of memory will be a trick). I'm open to implementation  
> ideas and tips.

Getting this stuff right is difficult enough to justify just copying
an existing solution verbatim.  Ie, using the pool mixer from Linux or
BSD or GnuPG.  It is very, very easy to compromise the quality of the
random data, which only purpose really is to provide cryptographically
strong randomness.

> > 2. The entropy pool then needs to be mixed, because the entropy is
> > only an estimate and fluctuates wildly.  Mixing evens out the
> > distribution.  Your current driver suggests a user-space mixing, but
> > then the user space mixer needs to know the entropy estimate!  This
> > could probably done by the control interface, but this starts to
> > become unwildly.  Furthermore, it is not a good idea to drop entropy
> > when the pool is full (you can't increase the entropy above the pool
> > size, but you can improve the quality of the pool by mixing more
> > entropy into it as a safety net, even when it is full).  Entropy is a
> > scarce resource, and we have such poor entropy sources that pushing
> > all data over to user space may be a performance issue in the current
> > implementation (consider that all network traffic needs to be copied
> > several times!).  It seems to make sense to me to mix in the kernel,
> > and again in user space (user space then implements pools with
> > different qualities, based on the high quality kernel entropy pool
> > data).
> >
> 
> Earlier versions of this driver did the mixing in kernel. After a few  
> rounds on
> the list, I took it out, and saved the code for implementation in the  
> translator.
>
> Readding it to the patch will be fairly easy, but I want to get  
> everything else
> figured out before I begin doing anything.

I think the main arguments in favor of in-kernel mixing are:

1) You don't waste any entropy (additional entropy will serve to
improve pool quality).

2) The entropy of PC random sources is generally extremely low, which
means that you have a poor raw source data to random data ration.
Mixing in the kernel drastically reduces data transfer quantity.

3) A marginal security argument: The internal status of the pool is
potentially more strongly hidden.

> > Do you add timer randomness?
> No, I found the timer really didn't add anything in terms of entropy -
> it was too predictable :-)

How did you measure it?  Linux seems it's good enough to justify the
effort (and remember apart from performance issues it never hurts to
add data to a mixed pool if the mixing strategy is right).

Thanks,
Marcus





reply via email to

[Prev in Thread] Current Thread [Next in Thread]