[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Nano-devel] WIP: Use caching for regex
From: |
Devin Hussey |
Subject: |
Re: [Nano-devel] WIP: Use caching for regex |
Date: |
Fri, 2 Nov 2018 23:39:30 -0400 |
Here is my patch, I am not going to duplicate the details.
https://savannah.gnu.org/patch/?9713
Anyways, gperf generates perfect hashes from the strings you give it.
Its syntax is similar to yacc/yylex.
https://www.gnu.org/software/gperf/
For a real example, instead of this O(n^2) strcmp tree:
if (!strcasecmp(input, "cancel"))
s->func = do_cancel;
else if (!strcasecmp(input, "exit"))
s->func = do_exit;
else {
s->func = do_toggle_void;
if (!strcasecmp(input, "nohelp"))
s->toggle = NO_HELP;
else if (!strcasecmp(input, "smoothscroll"))
s->toggle = SMOOTH_SCROLL;
}
we make this gperf file
%struct-type
%readonly-tables
%language=ANSI-C
%ignore-case
%{
#include "proto.h"
#include <string.h>
enum sc_type { SC_TYPE_SHORTCUT, SC_TYPE_TOGGLE };
%}
struct scfunc_data {
const char *data;
enum sc_type type;
union {
void (*func)(void);
int32_t toggle;
};
};
%%
# take advantage of how gperf puts things in verbatim
cancel, SC_TYPE_SHORTCUT, .func = do_func_cancel
exit, SC_TYPE_SHORTCUT, .func = do_func_exit
nohelp, SC_TYPE_TOGGLE, .toggle = NO_HELP
smoothscroll, SC_TYPE_TOGGLE, .toggle = SMOOTH_SCROLL
# make sure there are no extra empty lines at the end, and don't
indent it like I did
Run gperf on it and it spits out some magic C code which I am not
going to post because it is long, but it makes the function
const struct scfunc_data *in_word_set(const char *str, unsigned int len);
which converts the keys and values to a struct, quoting before the
first comma and putting things verbatim after it like so:
{ "cancel", SC_TYPE_SHORTCUT, .func = do_func_cancel }, {
"nohelp", SC_TYPE_TOGGLE, .toggle = NO_HELP }...
Now, in your code, do this:
sc *strtosc(const char *input)
{
const struct scfunc_data *data = in_word_set(input, strlen(input));
sc *s;
if (data == NULL)
return NULL;
s = (sc *)nmalloc(sizeof(sc));
if (sc->type == SC_TYPE_SHORTCUT) {
s->func = data->func;
s->toggle = 0;
} else {
s->func = do_toggle_void;
s->toggle = data->toggle;
}
return s;
}
And your unwieldy 262 line function just became 16.
gperf has very low complexity, because it knows what strings are going
to be hashed.
It does this:
1. Take the first and last few characters, as well as the length and
get a value from a look up table,
2. Check the length to another lookup table.
3. At this point, it has a unique value (it calculated it out
beforehand) and can just validate it with a single strcmp.
4. Return the entry in the table of structs, or null.
The only issue here is that there is no easy way to conditionally
enable the items AFAIK. cpp puts a blank line on replaced blocks, and
that doesn't play nice.
A side note: I think this could clean up convert_sequence as well:
O1;2A, KEY_UP, .shift = TRUE
O1;2B, KEY_DOWN, .shift = TRUE
O1;2C, KEY_RIGHT, .shift = TRUE
O1;2D, KEY_LEFT .shift = TRUE
O1;2P, KEY_F(13)
But, alas, that is for another patch.
- Re: [Nano-devel] WIP: Use caching for regex,
Devin Hussey <=