Re: merged dfa?

From: Bruno Haible
Subject: Re: merged dfa?
Date: Wed, 1 Jul 2009 11:24:48 +0200

Hi Arnold,

Aharon Robbins asked, with reference to [1]:
> Do I remember correctly that you were going to try to merge the dfa.[ch]
> from grep, gawk, and gettext?  Did that go anywhere?

It didn't go very far. The diffs in gawk (50 KB of diffs) and grep (50 KB of
diffs as well) went in different directions. I could manage the cosmetic
and syntactic changes, and those that were ported between the two packages
just to be undone a bit later. But the major difference, that dfaexec in
gawk requires write access to the string being scanned, goes too deep. Even
for someone with a book about DFA/NFA theory in front of him, the comments
in the code are not sufficient for understanding what's going on in dfacomp
and dfaexec.

I gave up.

Probably what should be done in the long run is:
  - For gettext, use the plain regex module - the use of regular expressions
    in msggrep is not speed critical.
  - For gawk and grep, either rewrite the thing from scratch (for example
    in a way that combines the DFA and kwset approaches instead of having
    them as separate data structures), or at least add enough comments that
    an average developer like me can understand what's going on.


[1]  http://lists.gnu.org/archive/html/bug-gnulib/2009-02/msg00009.html

